Improve model card: Add metadata, links, overview, and citation
#1, opened by nielsr (HF Staff)
This PR enhances the model card by adding key metadata and comprehensive information:
- Adds `pipeline_tag: image-text-to-text` to correctly categorize the model for multimodal tasks.
- Adds `library_name: transformers`, as the model architecture (`llava_llama` and `AnchorLlava`) and `transformers_version` in `config.json` indicate compatibility with the `transformers` library, enabling the "How to use" widget.
- Includes a direct link to the paper: Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens.
- Provides links to the official project page (https://wakalsprojectpage.github.io/comt-website) and the GitHub repository (https://github.com/Wakals/CoMT) for easy access to more resources.
- Expands the "Model Description" with a detailed overview of CoVT's methodology and benefits, derived from the paper's abstract and the GitHub README.
- Embeds relevant demo images from the GitHub repository to visually illustrate the model's capabilities.
- Adds a BibTeX citation for the paper.
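
For reviewers, the two metadata additions above can be sketched as a small Python check. This is only an illustration: `suggest_card_metadata` and `to_front_matter` are hypothetical helpers, the mapping of `llava_llama` to `model_type` and `AnchorLlava` to `architectures` is an assumption, and the version string is a placeholder rather than the value in this repository's `config.json`.

```python
import json


def suggest_card_metadata(config: dict) -> dict:
    """Hypothetical helper: derive model card metadata from a parsed config.json."""
    # The pipeline tag is the one this PR proposes for the model.
    metadata = {"pipeline_tag": "image-text-to-text"}
    # A transformers_version key suggests the checkpoint was saved with
    # transformers, which is the reasoning given for library_name.
    if "transformers_version" in config:
        metadata["library_name"] = "transformers"
    return metadata


def to_front_matter(metadata: dict) -> str:
    """Render flat key/value metadata as a YAML front-matter block."""
    lines = ["---"]
    lines += [f"{key}: {value}" for key, value in metadata.items()]
    lines.append("---")
    return "\n".join(lines)


# Illustrative config only; field placement and version are assumptions.
example_config = json.loads(
    '{"model_type": "llava_llama",'
    ' "architectures": ["AnchorLlava"],'
    ' "transformers_version": "0.0.0"}'
)

front_matter = to_front_matter(suggest_card_metadata(example_config))
print(front_matter)
```

The resulting block is what would sit at the top of the model card's `README.md`; a config without `transformers_version` yields only the `pipeline_tag` entry.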
Please review and merge if these improvements align with your expectations.