Add robotics pipeline tag and improve model card
#1
opened by nielsr (HF Staff)
README.md CHANGED
@@ -1,6 +1,8 @@
 ---
 license: mit
+pipeline_tag: robotics
 ---
+
 <div align="center">
 
 <p align="center">
@@ -21,7 +23,13 @@ license: mit
 
 ## Model Description
 
-**LabVLA** is the first vision–language–action (VLA) model designed for scientific laboratory environments
+**LabVLA** is the first vision–language–action (VLA) model designed specifically for scientific laboratory environments, as introduced in [LabVLA: Grounding Vision-Language-Action Models in Scientific Laboratories](https://huggingface.co/papers/2606.13578).
+
+It combines a **Qwen3-VL-4B-Instruct** vision–language backbone with a **DiT flow-matching action expert**. The model is trained using a two-stage recipe:
+1. **FAST action token pretraining**: Makes the backbone action-aware.
+2. **Flow matching posttraining**: Attaches the DiT action expert under knowledge insulation to enable continuous control.
+
+LabVLA addresses the gap in existing policies that are mostly trained on household data, enabling autonomous execution of scientific protocols involving laboratory instruments and transparent liquids.
 
 ## How to Use
 
@@ -41,4 +49,19 @@ cd LabVLA
 bash deployment/deploy.sh
 ```
 
-For training, data preparation, and more details, please refer to
+For training, data preparation, and more details, please refer to the [GitHub repository](https://github.com/zjunlp/LabVLA).
+
+## Citation
+
+```bibtex
+@article{ren2026labvla,
+  title   = {LabVLA: Grounding Vision-Language-Action Models in Scientific Laboratories},
+  author  = {Ren, Baochang and Liu, Xinjie and Chen, Xi and Liu, Yanshuo and
+             Li, Chenxi and Gao, Daqi and Su, Zeqin and Xing, Jintao and
+             Xue, Zirui and Li, Rui and Zhao, Xiangyu and Qiao, Shuofei and
+             Pan, Minting and Zuo, Wangmeng and Bai, Lei and Zhou, Dongzhan and
+             Zhang, Ningyu and Chen, Huajun},
+  journal = {arXiv preprint arXiv:2606.13578},
+  year    = {2026}
+}
+```
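
The metadata change in this PR is a single key in the README front matter. One way a reviewer might sanity-check it locally is sketched below. This is only an illustration, not part of the PR or of any Hugging Face tooling: `read_front_matter` is a hypothetical helper that handles flat `key: value` pairs only, and `CARD_HEAD` is a hand-copied excerpt of the new README header. Real model cards use YAML, so nested or list values would need a proper YAML parser.

```python
def read_front_matter(card_text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs from a leading block delimited by `---` lines.

    Hypothetical helper for illustration; it does not handle nested YAML.
    """
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # the card does not start with a front matter block
    metadata: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return metadata  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep and key.strip():
            metadata[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat the block as malformed


# Excerpt of the README header as it appears after this PR (copied by hand).
CARD_HEAD = """---
license: mit
pipeline_tag: robotics
---

<div align="center">
"""

metadata = read_front_matter(CARD_HEAD)
print(metadata)  # {'license': 'mit', 'pipeline_tag': 'robotics'}
```

The helper returns an empty dict when the closing `---` is missing, so a malformed header fails the check instead of silently passing.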