Robotics
Transformers
Safetensors
paligemma
image-text-to-text
vision-language-action
chain-of-thought
embodied-ai
text-generation-inference
Instructions to use yinchenghust/deepthinkvla_base with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use yinchenghust/deepthinkvla_base with Transformers:
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("yinchenghust/deepthinkvla_base")
model = AutoModelForImageTextToText.from_pretrained("yinchenghust/deepthinkvla_base")
- Notebooks
- Google Colab
- Kaggle
Add model card and metadata
#1
by nielsr HF Staff - opened
Hi! I'm Niels, part of the community science team at Hugging Face. I noticed this repository was missing a model card. This PR adds a README with:
- Metadata for the robotics pipeline tag and transformers library name.
- Links to the research paper DeepThinkVLA: Enhancing Reasoning Capability of Vision-Language-Action Models.
- A link to the official GitHub repository.
- A summary of the model's architecture and performance results on benchmarks like LIBERO.
This will help users find and understand your work more easily on the Hugging Face Hub!
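For readers unfamiliar with model card metadata, here is a minimal sketch of how the two fields named above could be rendered as the YAML front matter at the top of a README.md. The helper `render_front_matter` is hypothetical and written only for illustration. It handles flat string values only, and the README added by this PR may contain other fields.

```python
def render_front_matter(metadata: dict[str, str]) -> str:
    """Render a flat mapping as a YAML front-matter block (hypothetical helper).

    Only simple scalar string values are supported; nothing is quoted or escaped.
    """
    lines = ["---"]
    for key, value in metadata.items():
        lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines) + "\n"


# Only the two fields mentioned in the PR description are included here;
# any further fields in the real model card are not assumed.
metadata = {
    "pipeline_tag": "robotics",
    "library_name": "transformers",
}

front_matter = render_front_matter(metadata)
print(front_matter)
```

The Hub reads this block from the top of README.md. The pipeline tag is what lets the model appear under the Robotics filter, and the library name drives the Transformers loading snippet.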
yinchenghust changed pull request status to merged
yinchenghust deleted the refs/pr/1 ref