---
license: apache-2.0
pipeline_tag: robotics
---

A Pragmatic VLA Foundation Model

LingBot-VLA is a Vision-Language-Action (VLA) foundation model designed for robotic manipulation, emphasizing pragmatic deployment, efficiency, and strong generalization across tasks and platforms.

Highlights

  • Large-scale Pre-training Data: Trained on 20,000 hours of real-world data from 9 popular dual-arm robot configurations.
  • Strong Performance: Outperforms competing models on both simulation and real-world benchmarks (GM-100 and RoboTwin 2.0).
  • Training Efficiency: Delivers a 1.5x to 2.8x training speedup over existing VLA-oriented codebases, making it well-suited for real-world deployment.

Related Models

| Model Name | Hugging Face | ModelScope | Description |
|---|---|---|---|
| LingBot-VLA-4B | 🤗 lingbot-vla-4b | 🤖 lingbot-vla-4b | LingBot-VLA w/o Depth |
| LingBot-VLA-4B-Depth | 🤗 lingbot-vla-4b-depth | 🤖 lingbot-vla-4b-depth | LingBot-VLA w/ Depth |
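The checkpoints above can be fetched with the standard Hugging Face CLI. Note that the card does not state the full repository path, so the `lingbot/` namespace below is a placeholder; substitute the actual organization name from the model page:

```shell
# Download the LingBot-VLA-4B checkpoint to a local directory.
# "lingbot/lingbot-vla-4b" is a hypothetical repo id -- replace the
# namespace with the model's actual Hugging Face organization.
huggingface-cli download lingbot/lingbot-vla-4b --local-dir ./lingbot-vla-4b
```

The same command with `lingbot-vla-4b-depth` retrieves the depth-enabled variant.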

Citation

@article{wu2026pragmatic,
  title={A Pragmatic VLA Foundation Model},
  author={Wei Wu and Fan Lu and Yunnan Wang and Shuai Yang and Shi Liu and Fangjing Wang and Shuailei Ma and He Sun and Yong Wang and Zhenqi Qiu and Houlong Xiong and Ziyu Wang and Shuai Zhou and Yiyu Ren and Kejia Zhang and Hui Yu and Jingmei Zhao and Qian Zhu and Ran Cheng and Yong-Lu Li and Yongtao Huang and Xing Zhu and Yujun Shen and Kecheng Zheng},
  journal={arXiv preprint arXiv:2601.18692},
  year={2026}
}

License Agreement

This project is licensed under the Apache-2.0 License.

Acknowledgement

This codebase is built on the VeOmni and LeRobot projects. We thank the authors for their excellent work!