Libraries

How to use adamchanadam/Test_Florence-2-FT-DocVQA with Transformers:

```
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="adamchanadam/Test_Florence-2-FT-DocVQA", trust_remote_code=True)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("adamchanadam/Test_Florence-2-FT-DocVQA", trust_remote_code=True)
model = AutoModelForMultimodalLM.from_pretrained("adamchanadam/Test_Florence-2-FT-DocVQA", trust_remote_code=True)
```

Local Apps

vLLM

How to use adamchanadam/Test_Florence-2-FT-DocVQA with vLLM:

Install from pip and serve model

```
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "adamchanadam/Test_Florence-2-FT-DocVQA"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "adamchanadam/Test_Florence-2-FT-DocVQA",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
```

SGLang

How to use adamchanadam/Test_Florence-2-FT-DocVQA with SGLang:

Install from pip and serve model

```
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "adamchanadam/Test_Florence-2-FT-DocVQA" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "adamchanadam/Test_Florence-2-FT-DocVQA",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
```

Use Docker images

```
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "adamchanadam/Test_Florence-2-FT-DocVQA" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "adamchanadam/Test_Florence-2-FT-DocVQA",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
```

Docker Model Runner
How to use adamchanadam/Test_Florence-2-FT-DocVQA with Docker Model Runner:
```
docker model run hf.co/adamchanadam/Test_Florence-2-FT-DocVQA
```

adamchanadam/Test_Florence-2-FT-DocVQA

This model is fine-tuned from microsoft/Florence-2-base-ft for Document Visual Question Answering (DocVQA) tasks.

Model description

Fine-tuned to answer questions about images, with a focus on logo recognition and company information.
The model uses the <DocVQA> prompt to indicate the task type.
Number of unique images: 28
Number of epochs: 7
Learning rate: 1e-06
Optimizer: AdamW
Early stopping: Patience of 2 epochs, delta of 0.0001
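
The early-stopping rule listed above (patience of 2 epochs, delta of 0.0001, at most 7 epochs) can be sketched as follows. This is a minimal illustration and not the authors' training code: the class `EarlyStopping`, the helper `epochs_run`, and the sample loss values are hypothetical, and it assumes "delta" means the minimum decrease in validation loss that counts as an improvement.

```python
# Hyperparameters as listed in the model card.
HYPERPARAMS = {
    "optimizer": "AdamW",
    "learning_rate": 1e-6,
    "max_epochs": 7,
    "patience": 2,
    "min_delta": 1e-4,
}


class EarlyStopping:
    """Hypothetical helper: stop once validation loss has failed to improve
    by at least `min_delta` for `patience` consecutive epochs."""

    def __init__(self, patience=2, min_delta=1e-4):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def epochs_run(val_losses, max_epochs=7, patience=2, min_delta=1e-4):
    """Return how many epochs would run for a given sequence of validation losses."""
    stopper = EarlyStopping(patience, min_delta)
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if stopper.step(loss):
            return epoch
    return min(len(val_losses), max_epochs)


# Made-up losses: epochs 3 and 4 improve by less than 0.0001, so training stops at epoch 4.
print(epochs_run([1.0, 0.9, 0.89995, 0.89999, 0.5]))
```

With a steadily decreasing loss the same helper runs all 7 epochs, which is the upper bound given in the card.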

Dataset statistics
Total number of questions for fine-tuning: 560
logo_recognition: 49 (8.75%)
brand_identification: 48 (8.57%)
visual_elements: 65 (11.61%)
text_in_logo: 57 (10.18%)
industry_classification: 49 (8.75%)
product_service: 55 (9.82%)
company_details: 89 (15.89%)
negative_sample: 148 (26.43%)
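
The percentages in the dataset statistics can be re-derived from the raw counts. The short check below uses only the numbers given in the card; the names `QUESTION_COUNTS` and `category_shares` are illustrative.

```python
# Question counts per category, copied from the dataset statistics.
QUESTION_COUNTS = {
    "logo_recognition": 49,
    "brand_identification": 48,
    "visual_elements": 65,
    "text_in_logo": 57,
    "industry_classification": 49,
    "product_service": 55,
    "company_details": 89,
    "negative_sample": 148,
}


def category_shares(counts):
    """Return each category's share of all questions, in percent (2 decimals)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


total_questions = sum(QUESTION_COUNTS.values())
shares = category_shares(QUESTION_COUNTS)

print(total_questions)            # 560
print(shares["negative_sample"])  # 26.43
```

The counts sum to the stated total of 560, and negative samples are the largest single category at roughly a quarter of all questions.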

Intended use & limitations

Intended for answering questions about logos and company information in images.
Performance may be limited for questions or image content not represented in the training data.
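
A question-answering call might look like the sketch below. It rests on several assumptions that the card does not confirm: that the question is appended directly after the <DocVQA> task prompt, that the repository's remote code exposes the same `AutoModelForCausalLM` interface as the upstream microsoft/Florence-2-base-ft checkpoint, and that beam search with 3 beams is a reasonable default. The function names and the image path in the usage note are hypothetical.

```python
MODEL_ID = "adamchanadam/Test_Florence-2-FT-DocVQA"
TASK_PROMPT = "<DocVQA>"


def build_prompt(question):
    """Prefix the question with the task prompt (assumed format: '<DocVQA>' + question)."""
    return TASK_PROMPT + question.strip()


def answer_question(image_path, question, max_new_tokens=64):
    """Load the model and answer one question about one image.

    Heavy imports are kept inside the function so the module can be imported
    without transformers or Pillow installed.
    """
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    # Assumption: same loading pattern as the upstream Florence-2 checkpoints.
    processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=build_prompt(question), images=image, return_tensors="pt")

    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=max_new_tokens,
        num_beams=3,
    )
    # Decode the generated tokens to plain text, dropping special tokens.
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()
```

For example, `answer_question("logo.png", "Which company does this logo belong to?")` would return the model's answer as a string, where `logo.png` stands in for a local image file.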

Training procedure

Images were resized and normalized according to Florence-2's preprocessing standards.
The <DocVQA> prompt was used during fine-tuning to indicate the task type.
Questions and answers were provided for each image in the training set.
Batch size: 4
Evaluation metric: Cross-entropy loss on a held-out validation set
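
To make the evaluation metric concrete, the sketch below computes token-level cross-entropy in plain Python: the mean negative log-probability that the model assigns to each reference answer token. It is an illustration of the metric only; the function name and toy logits are made up, and the actual training code presumably relied on the loss returned by the model.

```python
import math


def token_cross_entropy(logits, target_ids):
    """Mean negative log-likelihood of the target tokens.

    logits: one list of vocabulary scores per answer position.
    target_ids: the index of the correct token at each position.
    """
    total = 0.0
    for row, target in zip(logits, target_ids):
        # Numerically stable log-sum-exp for the softmax normaliser.
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[target]
    return total / len(target_ids)


# Toy example with a 4-token vocabulary and a 2-token answer.
uniform = token_cross_entropy([[0.0, 0.0, 0.0, 0.0]] * 2, [1, 3])
confident = token_cross_entropy([[0.0, 5.0, 0.0, 0.0], [0.0, 0.0, 0.0, 5.0]], [1, 3])
print(uniform > confident)  # True: confident, correct predictions give a lower loss
```

A model that spreads probability uniformly over the vocabulary scores log(vocabulary size); lower values on the held-out set indicate that the reference answers are becoming more likely under the model.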

For more information, please contact the model creators.

Model size: 0.3B params (Safetensors)
Tensor type: F32
