---
title: VLA-Arena
emoji: 🤖
colorFrom: blue
colorTo: indigo
sdk: static
pinned: false
---
## 📖 About VLA-Arena
**VLA-Arena** is an open-source benchmark designed for the systematic evaluation of Vision-Language-Action (VLA) models. It provides a complete and unified toolchain covering scene modeling, demonstration collection, model training, and evaluation.
Featuring **150+ tasks** across **11 specialized suites**, VLA-Arena evaluates models at three hierarchical difficulty levels (L0-L2) and reports comprehensive metrics for safety, generalization, and efficiency.
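Conceptually, evaluating a policy on such a benchmark is a rollout-and-score loop: run the policy on each task at each difficulty level and aggregate per-level success rates. The toy harness below sketches that loop in plain Python; every name in it is a stand-in invented for illustration, not the VLA-Arena API (see the repository for the real entry points).

```python
import random
from dataclasses import dataclass

# Toy stand-ins: a real harness steps a simulator with policy actions.
@dataclass
class Episode:
    success: bool

def rollout(task: str, level: str, seed: int) -> Episode:
    # Placeholder outcome; a real rollout returns the task's success signal.
    random.seed(hash((task, level, seed)))
    return Episode(success=random.random() > 0.5)

def evaluate(tasks: list[str], levels=("L0", "L1", "L2"), episodes: int = 10) -> dict[str, float]:
    """Per-level success rate, mirroring the L0-L2 difficulty scaling."""
    results = {}
    for level in levels:
        wins = sum(
            rollout(task, level, seed).success
            for task in tasks
            for seed in range(episodes)
        )
        results[level] = wins / (len(tasks) * episodes)
    return results

print(evaluate(["pick_place", "open_drawer"]))
```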
## 🗝️ Key Evaluation Domains
VLA-Arena focuses on four critical dimensions to ensure robotic agents can operate effectively in the real world:
- **🛡️ Safety**: Evaluates the ability to operate reliably in the physical world while avoiding static and dynamic obstacles and hazards.
- **🔄 Distractor**: Assesses performance stability under environmental unpredictability and visual clutter.
- **🎯 Extrapolation**: Tests the ability to generalize learned knowledge to novel situations, unseen objects, and new workflows.
- **📈 Long Horizon**: Challenges agents to chain long sequences of actions to achieve complex, multi-step goals.
## 🔥 Highlights
- **End-to-End Toolchain**: From scene construction to final evaluation metrics.
- **Systematic Difficulty Scaling**: Tasks range from basic object manipulation (L0) to complex, constraint-heavy scenarios (L2).
- **Flexible Customization**: Powered by CBDDL (Constrained Behavior Domain Definition Language) for easy task definition.
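CBDDL's actual grammar lives in the repository docs; as a loose sketch of what a constrained task definition carries (scene objects, a goal predicate, and safety constraints checked during rollout), the snippet below models one in plain Python. The field names and predicate strings are invented for this illustration, not CBDDL syntax.

```python
from dataclasses import dataclass, field

@dataclass
class TaskSpec:
    """Illustrative stand-in for a constrained task definition.

    Field names are invented for this sketch; consult the CBDDL
    docs in the repository for the real grammar.
    """
    name: str
    objects: list[str]
    goal: str                              # e.g. "(on plate table)"
    constraints: list[str] = field(default_factory=list)

spec = TaskSpec(
    name="serve_plate_safely",
    objects=["plate", "table", "glass"],
    goal="(on plate table)",
    constraints=["(never (collides gripper glass))"],
)

# A harness would compile the spec into a scene plus a monitor that
# fails the episode the moment a constraint is violated.
print(spec)
```

The design point this sketch is meant to convey is that constraints are first-class alongside the goal: an episode can fail for a safety violation even if the goal predicate is eventually satisfied, which is what makes constraint-heavy (L2) scenarios harder than plain goal-reaching.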
## 🔗 Resources
* **GitHub Repository**: [PKU-Alignment/VLA-Arena](https://github.com/PKU-Alignment/VLA-Arena)
* **Documentation**: [docs on GitHub](https://github.com/PKU-Alignment/VLA-Arena/tree/main/docs)
* **License**: Apache 2.0
---