Image-to-Text
Transformers
Safetensors
English
qwen2_vl
image-text-to-text
vision-language-model
document-understanding
handwritten-text
insurance-forms
vqa
phi-3.5-vision
lora
qlora
unsloth
medical-forms
ocr-free
Eval Results (legacy)
text-generation-inference
Instructions to use solvrays/mdf-form-reader-phi35-vision with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use solvrays/mdf-form-reader-phi35-vision with Transformers:
    # Use a pipeline as a high-level helper
    # Warning: Pipeline type "image-to-text" is no longer supported in transformers v5.
    # You must load the model directly (see below) or downgrade to v4.x with:
    # 'pip install "transformers<5.0.0"'
    from transformers import pipeline

    pipe = pipeline("image-to-text", model="solvrays/mdf-form-reader-phi35-vision")

    # Load model directly
    from transformers import AutoProcessor, AutoModelForImageTextToText

    processor = AutoProcessor.from_pretrained("solvrays/mdf-form-reader-phi35-vision")
    model = AutoModelForImageTextToText.from_pretrained("solvrays/mdf-form-reader-phi35-vision")

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- Unsloth Studio
How to use solvrays/mdf-form-reader-phi35-vision with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
    curl -fsSL https://unsloth.ai/install.sh | sh
    # Run unsloth studio
    unsloth studio -H 0.0.0.0 -p 8888
    # Then open http://localhost:8888 in your browser
    # Search for solvrays/mdf-form-reader-phi35-vision to start chatting
Install Unsloth Studio (Windows)
    irm https://unsloth.ai/install.ps1 | iex
    # Run unsloth studio
    unsloth studio -H 0.0.0.0 -p 8888
    # Then open http://localhost:8888 in your browser
    # Search for solvrays/mdf-form-reader-phi35-vision to start chatting
Using HuggingFace Spaces for Unsloth
    # No setup required
    # Open https://huggingface.co/spaces/unsloth/studio in your browser
    # Search for solvrays/mdf-form-reader-phi35-vision to start chatting
Load model with FastModel
    pip install unsloth

    from unsloth import FastModel

    model, tokenizer = FastModel.from_pretrained(
        model_name="solvrays/mdf-form-reader-phi35-vision",
        max_seq_length=2048,
    )
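The Transformers snippet above stops once the processor and model are loaded. As a rough sketch of what a single question about a single form image could look like from there, the block below builds a one-turn chat, renders it through the processor's chat template, and decodes only the newly generated tokens. `build_messages` and `read_form` are hypothetical helper names, not part of the model repository, and the message layout assumes the Qwen2-VL-style chat template published with this model. Pillow is assumed to be installed for image loading.

```python
from typing import Any

MODEL_ID = "solvrays/mdf-form-reader-phi35-vision"


def build_messages(question: str) -> list[dict[str, Any]]:
    """Build a single-turn chat: one image slot followed by the question text."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image"},  # placeholder only; pixels go to the processor
                {"type": "text", "text": question},
            ],
        }
    ]


def read_form(image_path: str, question: str, max_new_tokens: int = 256) -> str:
    """Hypothetical helper: load the model and answer one question about one image."""
    # Heavy imports stay local so build_messages works without them installed.
    from PIL import Image
    from transformers import AutoModelForImageTextToText, AutoProcessor

    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = AutoModelForImageTextToText.from_pretrained(MODEL_ID)

    image = Image.open(image_path).convert("RGB")
    prompt = processor.apply_chat_template(
        build_messages(question), tokenize=False, add_generation_prompt=True
    )
    inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the generated answer is decoded.
    answer_ids = output_ids[:, inputs["input_ids"].shape[1]:]
    return processor.batch_decode(answer_ids, skip_special_tokens=True)[0].strip()
```

Calling `read_form("form.png", "What is the policy number?")` would download the weights on first use; the file name and question are placeholders. Loading the model once and reusing it across calls is the better design for batch work, and is left out here only to keep the sketch short.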
Chat template (Jinja):

{% set image_count = namespace(value=0) %}{% set video_count = namespace(value=0) %}{% for message in messages %}{% if loop.first and message['role'] != 'system' %}<|im_start|>system
You are a helpful assistant.<|im_end|>
{% endif %}<|im_start|>{{ message['role'] }}
{% if message['content'] is string %}{{ message['content'] }}<|im_end|>
{% else %}{% for content in message['content'] %}{% if content['type'] == 'image' or 'image' in content or 'image_url' in content %}{% set image_count.value = image_count.value + 1 %}{% if add_vision_id %}Picture {{ image_count.value }}: {% endif %}<|vision_start|><|image_pad|><|vision_end|>{% elif content['type'] == 'video' or 'video' in content %}{% set video_count.value = video_count.value + 1 %}{% if add_vision_id %}Video {{ video_count.value }}: {% endif %}<|vision_start|><|video_pad|><|vision_end|>{% elif 'text' in content %}{{ content['text'] }}{% endif %}{% endfor %}<|im_end|>
{% endif %}{% endfor %}{% if add_generation_prompt %}<|im_start|>assistant
{% endif %}
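To make the template's behavior easier to follow, here is a plain-Python sketch that mirrors its branches: a default system turn when the first message is not a system message, one vision placeholder per image or video item, optional "Picture N:" / "Video N:" labels, and an optional trailing assistant header. `render_chat` is a hypothetical name used only for illustration; in practice the processor's `apply_chat_template` renders the real template. The sketch assumes the row breaks in the template are literal newlines.

```python
from typing import Any, Union

DEFAULT_SYSTEM = "You are a helpful assistant."
IMAGE_SLOT = "<|vision_start|><|image_pad|><|vision_end|>"
VIDEO_SLOT = "<|vision_start|><|video_pad|><|vision_end|>"

Message = dict[str, Union[str, list[dict[str, Any]]]]


def render_chat(
    messages: list[Message],
    add_generation_prompt: bool = False,
    add_vision_id: bool = False,
) -> str:
    """Illustrative re-implementation of the chat template's control flow."""
    parts: list[str] = []
    image_count = 0
    video_count = 0

    for index, message in enumerate(messages):
        # Inject the default system turn only when the chat does not start with one.
        if index == 0 and message["role"] != "system":
            parts.append(f"<|im_start|>system\n{DEFAULT_SYSTEM}<|im_end|>\n")

        parts.append(f"<|im_start|>{message['role']}\n")
        content = message["content"]

        if isinstance(content, str):
            parts.append(content)
        else:
            for item in content:
                if item.get("type") == "image" or "image" in item or "image_url" in item:
                    image_count += 1
                    if add_vision_id:
                        parts.append(f"Picture {image_count}: ")
                    parts.append(IMAGE_SLOT)
                elif item.get("type") == "video" or "video" in item:
                    video_count += 1
                    if add_vision_id:
                        parts.append(f"Video {video_count}: ")
                    parts.append(VIDEO_SLOT)
                elif "text" in item:
                    parts.append(item["text"])

        parts.append("<|im_end|>\n")

    # The trailing assistant header cues the model to start its reply.
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)
```

For a user turn holding one image and a question, with `add_generation_prompt=True`, the result is a default system turn, a user turn whose text follows the image placeholder directly, and an open assistant turn. The counters run across the whole conversation, so with `add_vision_id=True` a second image in a later turn is labeled "Picture 2".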