Add comprehensive model card for Conan: Progressive Learning to Reason Like a Detective over Multi-Scale Visual Evidence
#1
by nielsr HF Staff - opened
This PR expands the model card for the Conan model with metadata, links, and documentation that improve its discoverability and usability on the Hugging Face Hub.
Key additions include:
- Metadata:
  - `pipeline_tag: video-text-to-text` to enable the video reasoning inference widget
  - `library_name: transformers` for automatic code snippet generation (when an explicit usage example is provided/discovered later)
  - `base_model: Qwen2.5-VL-7B-Instruct`
  - `datasets: Conan-91K`
  - descriptive `tags` (multimodal, video, reasoning, qwen)
- Paper Link: A direct link to the research paper Conan: Progressive Learning to Reason Like a Detective over Multi-Scale Visual Evidence.
- Abstract: The full abstract from the paper for a detailed overview of the model's approach and findings.
- Project Description: A summary of Conan's capabilities, derived from the paper's introduction.
- Teaser Image: An image from the GitHub repository that illustrates the model's performance.
- GitHub Repository Link: A link to the official GitHub repository for more detailed code and resources.
- Evaluation Details: Information on the Conan-Eval toolkit and supported benchmarks, extracted from the GitHub README.
- Citation: A BibTeX entry for easy referencing.
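
For reference, the metadata fields listed above would sit in the YAML front matter at the top of the model card's `README.md`. The following is a minimal sketch assembled only from the values named in this description; the exact identifiers in the merged card may differ (Hub repository ids for `base_model` and `datasets` usually carry a namespace prefix, which this description does not give).

```yaml
---
# Values copied from the PR description; namespace prefixes are not stated there.
pipeline_tag: video-text-to-text   # enables the video reasoning inference widget
library_name: transformers         # allows automatic code snippet generation
base_model: Qwen2.5-VL-7B-Instruct
datasets:
- Conan-91K
tags:
- multimodal
- video
- reasoning
- qwen
---
```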
This update makes the model card more informative and user-friendly.