---
library_name: pytorch
tags:
- robotics
- libero
- vision-language-action
- imitation-learning
- manipulation
datasets:
- gate-institute/GATE-VLAP-datasets
---

# GATE-VLAP: Grounded Action Trajectory Embeddings with Vision-Language Action Planning

**Trained on LIBERO-10 Benchmark**

This model is trained for robotic manipulation tasks using vision-language-action learning with semantic action chunking.

## Model Details

- **Architecture**: CLIP-RT (CLIP-based Robot Transformer)
- **Training Dataset**: [GATE-VLAP LIBERO-10](https://huggingface.co/datasets/gate-institute/GATE-VLAP-datasets)
- **Training Epochs**: 90
- **Task Type**: Long-horizon robotic manipulation
- **Input**: RGB images (128×128) + language instructions
- **Output**: 7-DOF actions (xyz, rpy, gripper)
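
The input and output listed above can be written down as a small contract. The sketch below is illustrative and is not taken from the GATE-VLAP codebase: the names `Action7DoF`, `split_action`, and `is_valid_image_shape` are hypothetical, and both the element order `[x, y, z, roll, pitch, yaw, gripper]` and the channel-last image layout are assumptions. The card itself only states the 128×128 RGB resolution and the seven action components.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

IMAGE_SIZE = 128  # from the model card: 128x128 RGB input
ACTION_DIM = 7    # from the model card: xyz (3) + rpy (3) + gripper (1)


@dataclass(frozen=True)
class Action7DoF:
    """Hypothetical container for one 7-DOF action."""
    xyz: Tuple[float, float, float]  # translation component
    rpy: Tuple[float, float, float]  # roll, pitch, yaw component
    gripper: float                   # gripper command


def split_action(vec: Sequence[float]) -> Action7DoF:
    """Split a flat 7-element action into named parts.

    ASSUMPTION: elements are ordered [x, y, z, roll, pitch, yaw, gripper].
    """
    if len(vec) != ACTION_DIM:
        raise ValueError(f"expected {ACTION_DIM} values, got {len(vec)}")
    x, y, z, roll, pitch, yaw, grip = (float(v) for v in vec)
    return Action7DoF(xyz=(x, y, z), rpy=(roll, pitch, yaw), gripper=grip)


def is_valid_image_shape(shape: Sequence[int]) -> bool:
    """Check for a (height, width, channels) RGB layout.

    ASSUMPTION: channel-last layout; the card does not specify one.
    """
    return tuple(shape) == (IMAGE_SIZE, IMAGE_SIZE, 3)


# Example: interpret a raw action vector and validate an image shape.
action = split_action([0.01, 0.0, -0.02, 0.0, 0.1, 0.0, 1.0])
print(action.xyz, action.rpy, action.gripper)
print(is_valid_image_shape((128, 128, 3)))
```

Keeping the split in one place makes it easy to adjust if the actual element order or image layout differs from what is assumed here.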

## Training Details

- **Dataset**: LIBERO-10 (29 subtasks, 1,354 demonstrations)
- **Segmentation**: Semantic action chunking using Gemini Vision API
- **Framework**: PyTorch
- **Checkpoint**: Epoch 90 (best_epoch)

## Performance

Training run: `libero_10_fixed_training_v1`

*Overall task success rate: 88.8%, which is 5% higher than the unmodified CLIP-RT baseline on LIBERO-LONG.*

## Dataset

This model was trained on the [GATE-VLAP Datasets](https://huggingface.co/datasets/gate-institute/GATE-VLAP-datasets), which includes:

- LIBERO-10: 103,650 frames across 29 subtasks
- Semantic action segmentation
- Vision-language annotations
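
To inspect the dataset locally, one option is to download a snapshot of the dataset repository with the `huggingface_hub` package. This is a minimal sketch: the helper name `download_dataset` is hypothetical, and because this card does not describe the file layout inside the repository, the code only fetches the files and does not try to parse them.

```python
REPO_ID = "gate-institute/GATE-VLAP-datasets"  # dataset repo named in this card


def download_dataset(local_dir=None):
    """Download a snapshot of the dataset repo and return its local path.

    Requires `pip install huggingface_hub` and network access. The import is
    kept inside the function so this module loads without the package.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # the repo lives under huggingface.co/datasets/
        local_dir=local_dir,  # None means: use the default Hugging Face cache
    )


# Usage (performs a network download, so it is left commented out here):
# path = download_dataset()
# print("Dataset files downloaded to:", path)
```

Passing `local_dir` places the files in a directory of your choice instead of the shared cache, which can be convenient when browsing the data by hand.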

## Citation

```bibtex
@article{gateVLAP@SAC2026,
  title={Atomic Action Slicing: Planner-Aligned Options for Generalist VLA Agents},
  author={Stefan Tabakov and Asen Popov and Dimitar Dimitrov and Ensiye Kiyamousavi and Boris Kraychev},
  journal={arXiv preprint arXiv:XXXX.XXXXX},
  conference={The 41st ACM/SIGAPP Symposium On Applied Computing (SAC2026), track on Intelligent Robotics and Multi-Agent Systems (IRMAS)},
  year={2025}
}
```

## Maintainer

[**GATE Institute**](https://www.gate-ai.eu/en/home/) - Advanced AI Research Group, Sofia, Bulgaria

## Links

- 🤗 **Dataset**: [gate-institute/GATE-VLAP-datasets](https://huggingface.co/datasets/gate-institute/GATE-VLAP-datasets)
- 📄 **Paper**: *Coming soon*
- 💻 **Code**: *Coming soon*