Tags: Video-Text-to-Text · Transformers · Safetensors · English · qwen2_5_vl · image-text-to-text · text-generation-inference
Instructions to use Video-R1/Qwen2.5-VL-7B-COT-SFT with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Video-R1/Qwen2.5-VL-7B-COT-SFT with Transformers:
```python
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("Video-R1/Qwen2.5-VL-7B-COT-SFT")
model = AutoModelForImageTextToText.from_pretrained("Video-R1/Qwen2.5-VL-7B-COT-SFT")
```

- Notebooks
- Google Colab
- Kaggle
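Once the processor and model are loaded as above, inputs are composed as chat-style messages mixing video and text. A minimal sketch of formatting a video-plus-text prompt, following Qwen2.5-VL message conventions (the video path and question are hypothetical placeholders):

```python
# Sketch: compose a chat-style prompt containing a video and a question.
# "demo.mp4" and the question text are hypothetical placeholders.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "video", "video": "demo.mp4"},
            {"type": "text", "text": "Describe what happens in this video."},
        ],
    }
]

# With the processor loaded above, the chat template renders these messages
# into the model's prompt string, e.g.:
# text = processor.apply_chat_template(
#     messages, tokenize=False, add_generation_prompt=True
# )
```

The rendered prompt, together with the decoded video frames, is then passed through the processor and on to `model.generate`.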
Improve model card: Add pipeline tag, library name, paper link, authors, and sample usage (#2)
Opened by nielsr (HF Staff)
This PR significantly enhances the model card for Video-R1/Qwen2.5-VL-7B-COT-SFT by:
- Adding `pipeline_tag: video-text-to-text` to ensure proper categorization and discoverability on the Hugging Face Hub.
- Specifying `library_name: transformers`, enabling direct integration and a "how to use" widget for the model with the 🤗 Transformers library, supported by `config.json` evidence.
- Adding an explicit link to the official paper on Hugging Face Papers.
- Including the list of authors for proper attribution.
- Expanding the model description with an "About" section based on the paper's abstract.
- Providing a clear `transformers`-based code snippet for sample inference.
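Concretely, the metadata changes described above correspond to YAML front matter at the top of the model card's README along these lines (a sketch of the fields this PR touches; the paper link itself is not reproduced here):

```yaml
# Model-card front matter (sketch)
pipeline_tag: video-text-to-text
library_name: transformers
language:
  - en
```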
These improvements will make the model more accessible, informative, and user-friendly for the community.
KaituoFeng changed pull request status to merged