Add robotics metadata, paper links, and usage information
Hi! I'm Niels from the Hugging Face community science team.
I've opened this PR to improve the discoverability and documentation of LingBot-VLA.
- Added the `robotics` pipeline tag to help users find the model in the Robotics section of the Hub.
- Added `library_name: lerobot` as the project utilizes the LeRobot library.
- Added links to the research paper, project page, and GitHub repository.
- Included installation instructions and a sample usage snippet for evaluation based on your GitHub README.
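
The metadata mentioned above lives in a YAML frontmatter block at the top of README.md. As a quick illustration (a hand-rolled sketch, not the Hub's actual parser), the flat key/value-plus-list subset used in this PR can be read like this:

```python
# Frontmatter exactly as added by this PR.
FRONTMATTER = """\
---
license: apache-2.0
pipeline_tag: robotics
library_name: lerobot
tags:
- vision-language-action
- vla
- robot-learning
---
"""

def parse_frontmatter(text: str) -> dict:
    """Tiny parser for the flat key/value + list subset shown above."""
    lines = text.splitlines()
    assert lines[0] == "---", "frontmatter must open with ---"
    meta, current_list = {}, None
    for line in lines[1:]:
        if line == "---":          # closing delimiter
            break
        if line.startswith("- ") and current_list is not None:
            meta[current_list].append(line[2:])   # list item
        else:
            key, _, value = line.partition(":")
            value = value.strip()
            if value:                              # plain key: value
                meta[key] = value
                current_list = None
            else:                                  # key introducing a list
                meta[key] = []
                current_list = key
    return meta

meta = parse_frontmatter(FRONTMATTER)
print(meta["pipeline_tag"])  # robotics
```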
README.md
CHANGED
@@ -1,29 +1,67 @@
 # A Pragmatic VLA Foundation Model
 <p align="center">
-<img src="assets/Teaser.png" width="100%">
 </p>
 
-- **Training Efficiency**: Represent a 1.5 ∼ 2.8× (depending on the relied VLM base model) speedup over existing VLA-oriented codebases.
 
 ---
 
-##
-- Repository: https://github.com/robbyant/lingbot-vla
-- Paper: A Pragmatic VLA Foundation Model
-- Project Page: https://technology.robbyant.com/lingbot-vla
 
-| :--- | :---: | :---: | :---: |
-| LingBot-VLA-4B | [🤗 lingbot-vla-4b](https://huggingface.co/robbyant/lingbot-vla-4b) | [🤖 lingbot-vla-4b](https://modelscope.cn/models/Robbyant/lingbot-vla-4b) | LingBot-VLA *w/o* Depth |
-| LingBot-VLA-4B-Depth | [🤗 lingbot-vla-4b-depth](https://huggingface.co/robbyant/lingbot-vla-4b-depth) | [🤖 lingbot-vla-4b-depth](https://modelscope.cn/models/Robbyant/lingbot-vla-4b-depth) | LingBot-VLA *w/* Depth |
 
 ---
@@ -37,10 +75,5 @@ data from 9 popular dual-arm robot configurations.
 }
 ```
 
----
-
-## License Agreement
-This project is licensed under the [Apache-2.0 License](LICENSE).
-
 ## Acknowledgement
-This codebase is
+---
+license: apache-2.0
+pipeline_tag: robotics
+library_name: lerobot
+tags:
+- vision-language-action
+- vla
+- robot-learning
+---
+
 # A Pragmatic VLA Foundation Model
+
 <p align="center">
+<img src="https://github.com/robbyant/lingbot-vla/raw/main/assets/Teaser.png" width="100%">
 </p>
 
+**LingBot-VLA** is a pragmatic Vision-Language-Action (VLA) foundation model designed for robotic manipulation. It is pre-trained on 20,000 hours of real-world data from 9 popular dual-arm robot configurations, achieving strong generalization across various robotic platforms.
+
+- **Paper:** [A Pragmatic VLA Foundation Model](https://huggingface.co/papers/2601.18692)
+- **Repository:** [https://github.com/robbyant/lingbot-vla](https://github.com/robbyant/lingbot-vla)
+- **Project Page:** [https://technology.robbyant.com/lingbot-vla](https://technology.robbyant.com/lingbot-vla)
+
+## Key Features
+- **Large-scale Pre-training**: Trained on 20,000 hours of real-world data.
+- **Strong Performance**: Achieves state-of-the-art results on simulation and real-world benchmarks (GM-100 and RoboTwin 2.0).
+- **Efficiency**: Offers a 1.5 ~ 2.8× speedup in training throughput compared to existing VLA frameworks.
+
+## Related Models
 
+| Model Name | Huggingface | Description |
+| :--- | :---: | :---: |
+| LingBot-VLA-4B | [🤗 lingbot-vla-4b](https://huggingface.co/robbyant/lingbot-vla-4b) | LingBot-VLA *w/o* Depth |
+| LingBot-VLA-4B-Depth | [🤗 lingbot-vla-4b-depth](https://huggingface.co/robbyant/lingbot-vla-4b-depth) | LingBot-VLA *w/* Depth |
 
 ---
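
As a side note for reviewers: both checkpoints in the table above can also be fetched file-by-file, since the Hub serves repo files under a predictable `resolve` URL scheme. A minimal sketch, no client library required; the file name `config.json` is an assumption about the repo contents:

```python
def hub_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a direct download URL for a file in a Hugging Face model repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

# The two repos from the table above.
print(hub_file_url("robbyant/lingbot-vla-4b", "config.json"))
print(hub_file_url("robbyant/lingbot-vla-4b-depth", "config.json"))
```

In practice `huggingface_hub.snapshot_download` is the more convenient route for whole checkpoints; the URL form is handy for quick `curl` checks.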
 
+## Installation
 
+```bash
+# Install LeRobot
+pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu128
+git clone https://github.com/huggingface/lerobot.git
+cd lerobot
+git checkout 0cf864870cf29f4738d3ade893e6fd13fbd7cdb5
+pip install -e .
+
+# Clone the LingBot-VLA repository
+git clone https://github.com/robbyant/lingbot-vla.git
+cd lingbot-vla/
+pip install -e .
+pip install -r requirements.txt
+```
+
+## Sample Usage
 
+To evaluate the policy on the RoboTwin 2.0 benchmark:
 
+```bash
+export QWEN25_PATH=path_to_Qwen2.5-VL-3B-Instruct
+python -m deploy.lingbot_robotwin_policy \
+    --model_path robbyant/lingbot-vla-4b \
+    --use_length 50 \
+    --port 8080
+```
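
A reviewer note on the command above: it starts a policy server on port 8080. The sketch below shows the general shape of a client request; the endpoint path (`/act`) and the observation schema are purely hypothetical here, since the real interface is defined by `deploy.lingbot_robotwin_policy` in the LingBot-VLA repo:

```python
import json
from urllib import request

def build_act_request(host: str, port: int, observation: dict) -> request.Request:
    """Assemble a JSON POST to a hypothetical /act endpoint (names assumed)."""
    body = json.dumps(observation).encode("utf-8")
    return request.Request(
        f"http://{host}:{port}/act",
        data=body,
        headers={"Content-Type": "application/json"},
    )

req = build_act_request("localhost", 8080, {"instruction": "stack the blocks"})
# A Request with a body defaults to POST; actually sending it needs the
# server from the command above to be running:
# actions = json.load(request.urlopen(req))
```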
 
 ---
 
 }
 ```
 
 ## Acknowledgement
+This codebase is built on the [VeOmni](https://arxiv.org/abs/2508.02317) project and benefits from [LeRobot](https://github.com/huggingface/lerobot).