metadata
title: img3txt
emoji: 📷
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
Image captioning API using microsoft/Florence-2-base with a Python FastAPI backend. Open /docs for Swagger UI.
Environment variables for tuning speed and selecting the model: DEFAULT_MAX_TOKENS (default 64), MAX_MAX_TOKENS (default 256), MAX_IMAGE_SIDE (default 896), MODEL_ID (default microsoft/Florence-2-base), and MODEL_REVISION (pin to a commit SHA, e.g. 5ca5edf5bd017b9919c05d08aebef5e4c7ac3bac).
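As a rough illustration of how these variables might be consumed, the sketch below reads them with the defaults listed above. The helper names (`load_settings`, `clamp_max_tokens`, `scaled_size`) are hypothetical and not taken from this Space's code. It also assumes that MAX_MAX_TOKENS is an upper bound on the per-request token count and that MAX_IMAGE_SIDE caps the longer image side; neither behavior is confirmed here.

```python
import os


def _int_env(env, name, default):
    """Read an integer variable, falling back to the default when unset or blank."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    return int(raw)


def load_settings(env=os.environ):
    """Collect the documented variables with their documented defaults."""
    max_max_tokens = _int_env(env, "MAX_MAX_TOKENS", 256)
    # ASSUMPTION: the default token count never exceeds the hard cap.
    default_max_tokens = min(_int_env(env, "DEFAULT_MAX_TOKENS", 64), max_max_tokens)
    return {
        "model_id": env.get("MODEL_ID", "microsoft/Florence-2-base"),
        "model_revision": env.get("MODEL_REVISION"),  # None means "not pinned"
        "default_max_tokens": default_max_tokens,
        "max_max_tokens": max_max_tokens,
        "max_image_side": _int_env(env, "MAX_IMAGE_SIDE", 896),
    }


def clamp_max_tokens(requested, settings):
    """ASSUMPTION: requests above MAX_MAX_TOKENS are clamped rather than rejected."""
    if requested is None:
        return settings["default_max_tokens"]
    return max(1, min(requested, settings["max_max_tokens"]))


def scaled_size(width, height, max_side):
    """ASSUMPTION: MAX_IMAGE_SIDE caps the longer side, keeping the aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        return (width, height)
    scale = max_side / longest
    return (max(1, round(width * scale)), max(1, round(height * scale)))
```

Lower values of DEFAULT_MAX_TOKENS and MAX_IMAGE_SIDE generally mean less work per request, which is why they are grouped under speed tuning.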
The text form field of POST /predict is the full Florence-2 task prompt. For standard captioning, send <CAPTION> only, or omit text to use the default. Do not append extra words to <CAPTION>.
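A minimal client sketch for the prompt rule above, using the third-party `requests` library. Only the /predict path and the text field come from this README. The upload field name `file`, the JSON response, and the helper names `build_form_fields` and `caption_image` are assumptions for illustration; check /docs for the actual request schema.

```python
DEFAULT_TASK = "<CAPTION>"


def build_form_fields(text=None):
    """Build the form fields for POST /predict.

    Omitting text (or passing a blank string) sends no text field,
    so the server falls back to its default prompt.
    """
    if text is None or not text.strip():
        return {}
    prompt = text.strip()
    # Guard against prompts such as "<CAPTION> a photo of".
    if prompt.startswith(DEFAULT_TASK) and prompt != DEFAULT_TASK:
        raise ValueError("send <CAPTION> alone, with no extra words after it")
    return {"text": prompt}


def caption_image(base_url, image_path, text=None, timeout=60):
    """Upload an image to /predict and return the parsed response."""
    import requests  # third-party; imported lazily so the helper above has no dependencies

    with open(image_path, "rb") as fh:
        response = requests.post(
            f"{base_url.rstrip('/')}/predict",
            data=build_form_fields(text),
            files={"file": fh},  # ASSUMPTION: the upload field is named "file"
            timeout=timeout,
        )
    response.raise_for_status()
    return response.json()  # ASSUMPTION: the endpoint responds with JSON
```

For example, `caption_image("http://localhost:7860", "photo.jpg")` would request a caption with the default prompt, assuming the container is running locally on the configured app_port.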