Instructions to use hustvl/InfiniteVL-LongSFT with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use hustvl/InfiniteVL-LongSFT with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("image-text-to-text", model="hustvl/InfiniteVL-LongSFT", trust_remote_code=True)
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
                {"type": "text", "text": "What animal is on the candy?"}
            ]
        },
    ]
    pipe(text=messages)

    # Load model directly
    from transformers import AutoModel

    model = AutoModel.from_pretrained("hustvl/InfiniteVL-LongSFT", trust_remote_code=True, dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use hustvl/InfiniteVL-LongSFT with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "hustvl/InfiniteVL-LongSFT"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "hustvl/InfiniteVL-LongSFT",
        "messages": [
          {
            "role": "user",
            "content": [
              { "type": "text", "text": "Describe this image in one sentence." },
              { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
            ]
          }
        ]
      }'
Use Docker
    docker model run hf.co/hustvl/InfiniteVL-LongSFT
- SGLang
How to use hustvl/InfiniteVL-LongSFT with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
      --model-path "hustvl/InfiniteVL-LongSFT" \
      --host 0.0.0.0 \
      --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "hustvl/InfiniteVL-LongSFT",
        "messages": [
          {
            "role": "user",
            "content": [
              { "type": "text", "text": "Describe this image in one sentence." },
              { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
            ]
          }
        ]
      }'
Use Docker images
    docker run --gpus all \
      --shm-size 32g \
      -p 30000:30000 \
      -v ~/.cache/huggingface:/root/.cache/huggingface \
      --env "HF_TOKEN=<secret>" \
      --ipc=host \
      lmsysorg/sglang:latest \
      python3 -m sglang.launch_server \
        --model-path "hustvl/InfiniteVL-LongSFT" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "hustvl/InfiniteVL-LongSFT",
        "messages": [
          {
            "role": "user",
            "content": [
              { "type": "text", "text": "Describe this image in one sentence." },
              { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
            ]
          }
        ]
      }'
- Docker Model Runner
How to use hustvl/InfiniteVL-LongSFT with Docker Model Runner:
docker model run hf.co/hustvl/InfiniteVL-LongSFT
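The curl requests above can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes one of the servers above is already running locally (port 8000 for vLLM, 30000 for SGLang). The helper names `build_chat_request` and `send_chat_request` are illustrative and not part of any library.

```python
import json
import urllib.request

MODEL_ID = "hustvl/InfiniteVL-LongSFT"
IMAGE_URL = (
    "https://cdn.britannica.com/61/93061-050-99147DCE/"
    "Statue-of-Liberty-Island-New-York-Bay.jpg"
)


def build_chat_request(model: str, prompt: str, image_url: str) -> dict:
    """Build the same JSON body that the curl examples send."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def send_chat_request(payload: dict, base_url: str = "http://localhost:8000") -> dict:
    """POST the payload to the OpenAI-compatible chat completions endpoint."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_chat_request(MODEL_ID, "Describe this image in one sentence.", IMAGE_URL)
print(json.dumps(payload, indent=2))

# Uncomment once a server is running (assumption: default local ports as above).
# reply = send_chat_request(payload, base_url="http://localhost:30000")
```

Building the payload separately from sending it keeps the request body easy to inspect before any server is involved.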
Improve model card: Update arXiv link and add comprehensive details from GitHub
#1
by nielsr HF Staff - opened
Hello team,
I've opened this PR to enhance the model card for the InfiniteVL model. The updates aim to provide more comprehensive and accurate information for users.
Key changes include:
- Corrected arXiv paper link: The previous placeholder https://arxiv.org/abs/2502.xxxxx has been updated to the correct link, https://arxiv.org/abs/2512.08829, consistent with the paper info and the GitHub repository.
- Added Hugging Face badge: A Hugging Face badge has been added to the header to improve navigation and visibility of the model on the Hub.
- Enriched content from GitHub README: Several crucial sections from the project's GitHub README have been integrated into the model card, including:
  - News
  - Table of Contents (updated to reflect the model card structure)
  - Architecture (including relevant images with absolute GitHub URLs)
  - Training Strategy (including relevant images with absolute GitHub URLs)
  - Performance (including relevant images with absolute GitHub URLs)
  - Qualitative Analysis & Visualization (including relevant images with absolute GitHub URLs)
  - Contact
- Detailed Advanced Usage: The brief "Advanced Usage (Cuda Graph)" section has been replaced with the more detailed explanation and code snippets from the GitHub README.
- Refined Metadata Tags: Removed the redundant image-text-to-text tag from the tags list, as it is already present in pipeline_tag.
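The metadata cleanup in the last bullet can be sketched in a few lines of Python. This is only an illustration, assuming the card's YAML front matter has already been parsed into a dictionary; the function name `drop_redundant_tags` and the sample tag values are hypothetical and not taken from the PR diff.

```python
def drop_redundant_tags(metadata: dict) -> dict:
    """Return a copy of the metadata whose tags list omits the pipeline_tag value."""
    pipeline_tag = metadata.get("pipeline_tag")
    tags = metadata.get("tags", [])
    cleaned = dict(metadata)
    # Keep every tag except the one already expressed by pipeline_tag.
    cleaned["tags"] = [tag for tag in tags if tag != pipeline_tag]
    return cleaned


# Illustrative front matter (assumption: a subset of the card's real tags).
card_metadata = {
    "pipeline_tag": "image-text-to-text",
    "tags": ["image-text-to-text", "vision-language-model", "linear-attention"],
}

cleaned_metadata = drop_redundant_tags(card_metadata)
print(cleaned_metadata["tags"])  # ['vision-language-model', 'linear-attention']
```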
These changes ensure the model card is more informative, accurate, and easier to navigate for anyone exploring InfiniteVL.
All relative image paths from the GitHub README have been converted to absolute https://github.com/hustvl/InfiniteVL/raw/main/assets/... URLs.
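A conversion like this can be automated. The sketch below is one possible approach, not necessarily how the PR was prepared: it rewrites Markdown image targets and HTML `<img src>` attributes against a base URL with `urllib.parse.urljoin`, which leaves already-absolute URLs untouched. The file name `assets/example.png` is a hypothetical placeholder.

```python
import re
from urllib.parse import urljoin

# Base taken from the URL pattern quoted above; everything after "main/" comes
# from the relative path found in the README.
BASE_URL = "https://github.com/hustvl/InfiniteVL/raw/main/"

# Markdown images: ![alt](target)
MD_IMAGE = re.compile(r"(!\[[^\]]*\]\()([^)\s]+)(\))")
# HTML images: <img ... src="target" ...>
HTML_IMAGE = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")')


def make_image_urls_absolute(markdown: str, base_url: str = BASE_URL) -> str:
    """Resolve relative image targets against base_url; absolute URLs are kept."""

    def fix(match: re.Match) -> str:
        return match.group(1) + urljoin(base_url, match.group(2)) + match.group(3)

    markdown = MD_IMAGE.sub(fix, markdown)
    return HTML_IMAGE.sub(fix, markdown)


# Hypothetical README fragment for illustration.
readme = (
    "![arch](assets/example.png)\n"
    '<img src="./assets/example.png" width="80%">\n'
    "![badge](https://example.com/badge.svg)"
)
converted = make_image_urls_absolute(readme)
print(converted)
```

Using `urljoin` rather than string concatenation handles `./` prefixes and avoids double-prefixing links that are already absolute.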
HongyuanTao changed pull request status to merged
Thank you for the improvements! I really appreciate the help with the model card.