Improve model card: Add pipeline tag, library name, paper, code, abstract, image, and usage
#1, opened by nielsr (HF Staff)
This PR expands the model card for the Vision-Zero-InternVL3-14B-Clevr model by adding metadata and detailed documentation.
Specifically, it includes:
- pipeline_tag: image-text-to-text: This accurately categorizes the model's functionality as a Vision-Language Model, improving its discoverability on the Hugging Face Hub.
- library_name: transformers: Evidence from the config.json (e.g., transformers_version, architectures) suggests compatibility with the transformers library, enabling automated code snippets for users.
- license: cc-by-nc-4.0: A common research license has been added.
- Paper Link: A direct link to the paper Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play.
- GitHub Repository: A link to the official GitHub repository: https://github.com/wangqinsi1/Vision-Zero.
- Abstract: The full paper abstract is included for a comprehensive overview.
- Overview Image: The main overview image from the GitHub README is included for visual context.
- Quick Start (Inference): A detailed usage section, including setup instructions and a Python code snippet, is directly extracted from the GitHub README to guide users on how to run inference.
- Citation: The BibTeX citation for the paper is also included.
These additions will greatly improve the discoverability, usability, and overall documentation of the model on the Hugging Face Hub.
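For reference, the three metadata fields listed above would sit in the YAML front matter at the top of the model card's README. The following is a minimal sketch of how that block could be rendered, using only the values named in this description. The helper `render_front_matter` is hypothetical, written for illustration, and is not part of any Hugging Face library; it only handles plain string values.

```python
# Metadata fields named in this PR; values are copied from the description above.
METADATA = {
    "pipeline_tag": "image-text-to-text",
    "library_name": "transformers",
    "license": "cc-by-nc-4.0",
}


def render_front_matter(metadata: dict[str, str]) -> str:
    """Render plain string key/value pairs as a YAML front-matter block.

    Hypothetical helper for illustration: it does no quoting or escaping,
    so it is only suitable for simple scalar values like the ones above.
    """
    lines = ["---"]
    lines += [f"{key}: {value}" for key, value in metadata.items()]
    lines.append("---")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the block as it would appear at the top of README.md.
    print(render_front_matter(METADATA), end="")
```

The Hub reads these keys from the front matter, so the pipeline tag and library name take effect without any change to the model files themselves.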