Improve model card: Add pipeline tag, library, HF paper link, abstract, and usage examples

by nielsr HF Staff - opened Oct 16, 2025


This PR significantly enhances the model card for InternVLA-M1_spatial by:

- Adding pipeline_tag: robotics to improve discoverability on the Hugging Face Hub.
- Specifying library_name: transformers, as the model's codebase is built upon both Transformers and Diffusers, with its VLM component (Qwen2.5-VL) being a Transformers model.
- Integrating the official Hugging Face paper link: InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy.
- Including the paper's abstract for a quick overview of the model's capabilities and methodology.
- Adding a comprehensive "Quick Interactive M1 Demo" section with Python code snippets for both chat/spatial grounding and action prediction, directly from the GitHub repository, enabling users to quickly get started.
- Incorporating additional detailed sections from the GitHub README such as Key Features, Target Audience, Experimental Results, Model Zoo, Roadmap, Contributing, Contact, and Acknowledgements, providing a more complete resource.
- Updating the citation with a more complete BibTeX entry.

These updates aim to make the model card more informative, user-friendly, and discoverable.
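
For reference, the two metadata fields named in this PR (pipeline_tag and library_name) sit in the YAML front matter at the top of the model card's README.md. The following is a minimal sketch of what that block looks like, assuming only those two fields; the helper name render_front_matter is hypothetical and not part of any Hugging Face library, and the actual card may carry further metadata not shown here.

```python
def render_front_matter(metadata: dict[str, str]) -> str:
    """Render flat string key/value pairs as a YAML front-matter block.

    Hypothetical helper for illustration; handles only simple scalar values.
    """
    lines = ["---"]
    for key, value in metadata.items():
        lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines) + "\n"


# The two metadata fields this PR describes adding.
card_metadata = {
    "pipeline_tag": "robotics",
    "library_name": "transformers",
}

front_matter = render_front_matter(card_metadata)
print(front_matter)
```

The Hub reads this block to decide which task filter and library badge the model appears under, which is why the PR describes these fields as improving discoverability.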

Please review and merge this PR.


