Improve model card: Add pipeline tag, library name, full content, and sample usage

by nielsr HF Staff - opened Oct 15, 2025


This PR significantly enhances the model card for VideoRFT, aligning it with Hugging Face best practices.

Key changes include:

Populated Content: Added a comprehensive description, abstract, methodology details, dataset information, setup instructions, training guidance, evaluation procedure, and a ready-to-use inference code snippet, all extracted from the paper and GitHub repository.
Metadata Updates:
- Changed pipeline_tag from visual-question-answering to video-text-to-text to better reflect the model's general video reasoning capabilities.
- Added library_name: transformers as the model is compatible with the 🤗 Transformers library, enabling the automated "Use in Transformers" widget.
Links: Ensured proper linking to the Hugging Face paper page (VideoRFT: Incentivizing Video Reasoning Capability in MLLMs via Reinforced Fine-Tuning) and the GitHub repository (https://github.com/QiWang98/VideoRFT).
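
As a rough sketch, the metadata updates listed above would correspond to a YAML front-matter block at the top of the model card's README.md along the following lines. Only the two keys named in this PR are shown; any other keys the actual card carries are omitted here rather than guessed.

```yaml
---
# Sketch of the model card front matter; only the fields named in this PR.
pipeline_tag: video-text-to-text   # previously visual-question-answering
library_name: transformers         # enables the "Use in Transformers" widget
---
```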

These updates will make the model more discoverable and easier to use for the community.

QiWang98 changed pull request status to merged Oct 21, 2025
