---
title: img3txt
emoji: 📷
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
---

Image captioning API using `microsoft/Florence-2-base` with a Python FastAPI backend. Open `/docs` for Swagger UI.

Configuration env vars: for speed tuning, `DEFAULT_MAX_TOKENS` (default `64`), `MAX_IMAGE_SIDE` (default `896`), and `MAX_MAX_TOKENS` (default `256`); for model selection, `MODEL_ID` (default `microsoft/Florence-2-base`) and `MODEL_REVISION` (pin to a commit SHA, e.g. `5ca5edf5bd017b9919c05d08aebef5e4c7ac3bac`).
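
As an illustration of how these variables and their defaults might be consumed, here is a minimal sketch. The `Settings` class, `load_settings`, and `clamp_max_tokens` are hypothetical names, not taken from this Space's code. It is also an assumption that `MAX_MAX_TOKENS` acts as an upper bound on a per-request token count and that an unset `MODEL_REVISION` means "no pin".

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the env vars listed above."""
    default_max_tokens: int
    max_image_side: int
    max_max_tokens: int
    model_id: str
    model_revision: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented env vars, falling back to the documented defaults."""
    return Settings(
        default_max_tokens=int(env.get("DEFAULT_MAX_TOKENS", "64")),
        max_image_side=int(env.get("MAX_IMAGE_SIDE", "896")),
        max_max_tokens=int(env.get("MAX_MAX_TOKENS", "256")),
        model_id=env.get("MODEL_ID", "microsoft/Florence-2-base"),
        # Assumption: an unset or empty MODEL_REVISION means "do not pin".
        model_revision=env.get("MODEL_REVISION") or None,
    )


def clamp_max_tokens(requested: Optional[int], settings: Settings) -> int:
    """Assumed behavior: use the default when nothing is requested,
    and never exceed MAX_MAX_TOKENS."""
    value = settings.default_max_tokens if requested is None else requested
    return max(1, min(value, settings.max_max_tokens))
```

Lower values of `DEFAULT_MAX_TOKENS` and `MAX_IMAGE_SIDE` should generally mean less work per request, which is presumably why they are exposed for speed tuning.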

The `POST /predict` form field `text` holds the full Florence-2 task prompt. For standard captioning, send `<CAPTION>` alone (or omit `text` to use the default). Do not append extra words to `<CAPTION>`.
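
A client call might look like the sketch below. Only the `/predict` path, the `text` field, and port `7860` come from this README. The upload field name `image`, the base URL, and the helper names are assumptions, so check `/docs` for the actual schema. The response is returned as raw text because its format is not described here.

```python
from typing import Dict, Optional

CAPTION_TASK = "<CAPTION>"


def build_form_fields(text: Optional[str] = None) -> Dict[str, str]:
    """Build the form fields for POST /predict.

    Omitting `text` lets the server use its default prompt.
    """
    if text is None:
        return {}
    prompt = text.strip()
    # Per the note above: <CAPTION> must be sent alone, with no extra words.
    if prompt.startswith(CAPTION_TASK) and prompt != CAPTION_TASK:
        raise ValueError("send '<CAPTION>' alone; do not append extra words")
    return {"text": prompt}


def caption(image_path: str,
            base_url: str = "http://localhost:7860",  # assumed local deployment
            text: Optional[str] = None) -> str:
    """Upload an image and return the raw response body."""
    import requests  # third-party; imported lazily so the helper above has no dependencies

    with open(image_path, "rb") as fh:
        resp = requests.post(
            f"{base_url}/predict",
            data=build_form_fields(text),
            files={"image": fh},  # assumed field name; verify in /docs
            timeout=60,
        )
    resp.raise_for_status()
    return resp.text
```

For example, `caption("photo.jpg")` relies on the server's default prompt, while `caption("photo.jpg", text="<CAPTION>")` sends it explicitly.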