
Add robotics pipeline tag and improve model card

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +25 -2
README.md CHANGED
@@ -1,6 +1,8 @@
  ---
  license: mit
+ pipeline_tag: robotics
  ---
+
  <div align="center">

  <p align="center">
@@ -21,7 +23,13 @@ license: mit

  ## Model Description

- **LabVLA** is the first vision–language–action (VLA) model designed for scientific laboratory environments. It combines a **Qwen3-VL-4B-Instruct** vision–language backbone with a **DiT flow-matching action expert**, trained with the π0.5 recipe to enable real-time robot control in lab settings.
+ **LabVLA** is the first vision–language–action (VLA) model designed specifically for scientific laboratory environments, as introduced in [LabVLA: Grounding Vision-Language-Action Models in Scientific Laboratories](https://huggingface.co/papers/2606.13578).
+
+ It combines a **Qwen3-VL-4B-Instruct** vision–language backbone with a **DiT flow-matching action expert**. The model is trained using a two-stage recipe:
+ 1. **FAST action token pretraining**: Makes the backbone action-aware.
+ 2. **Flow matching posttraining**: Attaches the DiT action expert under knowledge insulation to enable continuous control.
+
+ LabVLA addresses the gap in existing policies that are mostly trained on household data, enabling autonomous execution of scientific protocols involving laboratory instruments and transparent liquids.

  ## How to Use

@@ -41,4 +49,19 @@ cd LabVLA
  bash deployment/deploy.sh
  ```

- For training, data preparation, and more details, please refer to our [GitHub repository](https://github.com/zjunlp/LabVLA).
+ For training, data preparation, and more details, please refer to the [GitHub repository](https://github.com/zjunlp/LabVLA).
+
+ ## Citation
+
+ ```bibtex
+ @article{ren2026labvla,
+ title = {LabVLA: Grounding Vision-Language-Action Models in Scientific Laboratories},
+ author = {Ren, Baochang and Liu, Xinjie and Chen, Xi and Liu, Yanshuo and
+ Li, Chenxi and Gao, Daqi and Su, Zeqin and Xing, Jintao and
+ Xue, Zirui and Li, Rui and Zhao, Xiangyu and Qiao, Shuofei and
+ Pan, Minting and Zuo, Wangmeng and Bai, Lei and Zhou, Dongzhan and
+ Zhang, Ningyu and Chen, Huajun},
+ journal = {arXiv preprint arXiv:2606.13578},
+ year = {2026}
+ }
+ ```
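
To sanity-check the metadata change in this PR, a short script can read the front matter at the top of the model card and confirm the `pipeline_tag` value. This is only a sketch and not part of the PR: `read_front_matter` is a hypothetical helper, and it assumes the front matter holds nothing but flat `key: value` pairs, as in the diff above. A card with nested or multi-line metadata would need a real YAML parser.

```python
def read_front_matter(card_text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs from a leading '---' ... '---' block.

    Hypothetical helper for illustration; it handles only the simple
    metadata shown in this PR, not general YAML.
    """
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the card
    metadata: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return metadata  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    return {}  # front matter was opened but never closed


# Header of README.md as it reads after this PR (copied from the diff).
UPDATED_CARD = """---
license: mit
pipeline_tag: robotics
---

<div align="center">
"""

metadata = read_front_matter(UPDATED_CARD)
print(metadata)
```

For the header above, the parsed dictionary holds two keys, `license` and `pipeline_tag`, so a reviewer can compare `metadata["pipeline_tag"]` against `"robotics"` before merging.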