---
library_name: pytorch
tags:
- robotics
- libero
- vision-language-action
- imitation-learning
- manipulation
datasets:
- gate-institute/GATE-VLAP-datasets
---

# GATE-VLAP: Grounded Action Trajectory Embeddings with Vision-Language Action Planning

**Trained on the LIBERO-10 Benchmark**

This model is trained for robotic manipulation tasks using vision-language-action learning with semantic action chunking.

## Model Details

- **Architecture**: CLIP-RT (CLIP-based Robot Transformer)
- **Training Dataset**: [GATE-VLAP LIBERO-10](https://huggingface.co/datasets/gate-institute/GATE-VLAP-datasets)
- **Training Epochs**: 90
- **Task Type**: Long-horizon robotic manipulation
- **Input**: RGB images (128×128) + language instructions
- **Output**: 7-DOF actions (xyz, rpy, gripper)
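
Given this interface, a minimal inference sketch looks as follows. This is a sketch under assumptions rather than a published API: the checkpoint filename and the `build_clip_rt_model` loader are hypothetical placeholders, and only the 128×128 RGB input and 7-DOF output come from the spec above.

```python
# Minimal inference sketch. Only the 128x128 RGB input and the 7-DOF
# output follow from this card; the loader below is HYPOTHETICAL.
import torch
from PIL import Image
from torchvision import transforms

# Preprocess a single camera frame to the expected 128x128 RGB tensor.
preprocess = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
])
image = preprocess(Image.open("frame.png").convert("RGB")).unsqueeze(0)
instruction = "put both moka pots on the stove"  # LIBERO-10-style task

# HYPOTHETICAL names: no loading code is published with this card yet.
checkpoint = torch.load("gate_vlap_libero10_epoch90.pt", map_location="cpu")
model = build_clip_rt_model(checkpoint)  # hypothetical constructor
model.eval()

with torch.no_grad():
    action = model(image, instruction)  # expected shape: (1, 7)

# Decompose the 7-DOF action: xyz translation, rpy rotation, gripper.
xyz, rpy, gripper = action[0, :3], action[0, 3:6], action[0, 6]
```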

## Training Details

- **Dataset**: LIBERO-10 (29 subtasks, 1,354 demonstrations)
- **Segmentation**: Semantic action chunking using Gemini Vision API
- **Framework**: PyTorch
- **Checkpoint**: Epoch 90 (best_epoch)
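
The segmentation step above refers to the Gemini Vision API. As an illustration only, the sketch below shows how demonstration frames could be segmented into atomic subtasks with the public `google-generativeai` client; the prompt, model name, and frame-sampling stride are assumptions, not the authors' actual pipeline.

```python
# Illustrative semantic action chunking via the public Gemini client.
# Prompt, model choice, and sampling stride are ASSUMPTIONS.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-pro")  # assumed model choice

# Subsample frames from one demonstration (stride is arbitrary here).
frames = [Image.open(f"demo/frame_{i:04d}.png") for i in range(0, 200, 20)]
prompt = (
    "These frames sample one robot demonstration of the task "
    "'put both moka pots on the stove'. List the atomic subtasks and "
    "the frame index where each one begins."
)

response = model.generate_content([prompt, *frames])
print(response.text)  # parse downstream into (subtask, start_frame) chunks
```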

## Performance

Training run: `libero_10_fixed_training_v1`

*Overall task success rate: 88.8% on LIBERO-LONG, a 5% improvement over raw CLIP-RT.*

## Dataset

This model was trained on the [GATE-VLAP Datasets](https://huggingface.co/datasets/gate-institute/GATE-VLAP-datasets), which includes:
- LIBERO-10: 103,650 frames across 29 subtasks
- Semantic action segmentation
- Vision-language annotations
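
To pull the data locally, `huggingface_hub.snapshot_download` mirrors the whole dataset repository regardless of its file layout (the exact directory structure is documented on the dataset card, not assumed here):

```python
# Download the full dataset repo from the Hub; works for any file layout.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="gate-institute/GATE-VLAP-datasets",
    repo_type="dataset",
)
print(local_dir)  # browse frames and annotations from this path
```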

## Citation

```bibtex
@article{tabakov2025atomic,
  title={Atomic Action Slicing: Planner-Aligned Options for Generalist VLA Agents},
  author={Tabakov, Stefan and Popov, Asen and Dimitrov, Dimitar and Kiyamousavi, Ensiye and Kraychev, Boris},
  journal={arXiv preprint arXiv:XXXX.XXXXX},
  note={The 41st ACM/SIGAPP Symposium on Applied Computing (SAC 2026), track on Intelligent Robotics and Multi-Agent Systems (IRMAS)},
  year={2025}
}
```

## Maintainer

[**GATE Institute**](https://www.gate-ai.eu/en/home/) - Advanced AI Research Group, Sofia, Bulgaria

## Links

- 🤗 **Dataset**: [gate-institute/GATE-VLAP-datasets](https://huggingface.co/datasets/gate-institute/GATE-VLAP-datasets)
- 📄 **Paper**: *Coming soon*
- 💻 **Code**: *Coming soon*