---
title: VLA-Arena
emoji: 🤖
colorFrom: blue
colorTo: indigo
sdk: static
pinned: false
---
## 📖 About VLA-Arena

**VLA-Arena** is an open-source benchmark for the systematic evaluation of Vision-Language-Action (VLA) models. It provides a complete, unified toolchain covering scene modeling, demonstration collection, model training, and evaluation. Featuring **150+ tasks** across **11 specialized suites**, VLA-Arena assesses models through hierarchical difficulty levels (L0-L2), yielding comprehensive metrics for safety, generalization, and efficiency.

## 🗝️ Key Evaluation Domains

VLA-Arena focuses on four critical dimensions to ensure robotic agents can operate effectively in the real world:

- **🛡️ Safety**: Evaluates the ability to operate reliably in the physical world while avoiding static and dynamic obstacles and hazards.
- **🔄 Distractor**: Assesses performance stability under environmental unpredictability and visual clutter.
- **🎯 Extrapolation**: Tests the ability to generalize learned knowledge to novel situations, unseen objects, and new workflows.
- **📈 Long Horizon**: Challenges agents to compose long action sequences to achieve complex, multi-step goals.

## 🔥 Highlights

- **End-to-End Toolchain**: From scene construction to final evaluation metrics.
- **Systematic Difficulty Scaling**: Tasks range from basic object manipulation (L0) to complex, constraint-heavy scenarios (L2).
- **Flexible Customization**: Powered by CBDDL (Constrained Behavior Domain Definition Language) for easy task definition; see the illustrative sketch below.

## 🔗 Resources

* **GitHub Repository**: [PKU-Alignment/VLA-Arena](https://github.com/PKU-Alignment/VLA-Arena)
* **Documentation**: [Read the Docs](https://github.com/PKU-Alignment/VLA-Arena/tree/main/docs)
* **License**: Apache 2.0
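## 🧪 Task Definition Sketch

To make the CBDDL highlight concrete, here is an illustrative sketch of what a CBDDL task definition *might* look like, assuming a PDDL/BDDL-style s-expression syntax. The task name, object names, predicates, and the `:constraints` clause are all assumptions for illustration, not the actual CBDDL grammar; consult the repository documentation for the real format.

```lisp
; Hypothetical CBDDL task: every identifier and clause below is an
; illustrative assumption, not the actual VLA-Arena grammar.
(define (problem place_mug_on_shelf)
  (:domain tabletop_manipulation)
  ; Objects the scene is populated with.
  (:objects
    mug_1   - mug
    shelf_1 - shelf
    vase_1  - fragile_obstacle)
  ; Initial state of the scene.
  (:init
    (on-table mug_1)
    (near vase_1 shelf_1))
  ; Success condition for the task.
  (:goal
    (on mug_1 shelf_1))
  ; Safety constraints that must hold throughout the episode,
  ; e.g. for constraint-heavy L2 scenarios.
  (:constraints
    (never  (collide gripper vase_1))
    (always (below (gripper-speed) 0.5))))
```

A definition in this style would let one scene be reused across difficulty levels: an L0 variant might omit the `:constraints` block entirely, while an L2 variant could add more obstacles and tighter constraints.

---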
Built with ❤️ by the VLA-Arena Team