---
title: VLA-Arena
emoji: πŸ€–
colorFrom: blue
colorTo: indigo
sdk: static
pinned: false
---

## πŸ“– About VLA-Arena

VLA-Arena is an open-source benchmark designed for the systematic evaluation of Vision-Language-Action (VLA) models. It provides a complete and unified toolchain covering scene modeling, demonstration collection, model training, and evaluation.

Featuring 150+ tasks across 11 specialized suites, VLA-Arena assesses models across hierarchical difficulty levels (L0-L2), yielding comprehensive metrics for safety, generalization, and efficiency.

πŸ—οΈ Key Evaluation Domains

VLA-Arena focuses on four critical dimensions to ensure robotic agents can operate effectively in the real world:

  • πŸ›‘οΈ Safety: Evaluate the ability to operate reliably in the physical world while avoiding static/dynamic obstacles and hazards.
  • πŸ”„ Distractor: Assess performance stability when facing environmental unpredictability and visual clutter.
  • 🎯 Extrapolation: Test the ability to generalize learned knowledge to novel situations, unseen objects, and new workflows.
  • πŸ“ˆ Long Horizon: Challenge agents to combine long sequences of actions to achieve complex, multi-step goals.

## πŸ”₯ Highlights

  • End-to-End Toolchain: From scene construction to final evaluation metrics.
  • Systematic Difficulty Scaling: Tasks range from basic object manipulation (L0) to complex, constraint-heavy scenarios (L2).
  • Flexible Customization: Powered by CBDDL (Constrained Behavior Domain Definition Language) for easy task definition.
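To give a feel for DSL-based task definition, the sketch below shows what a constrained task specification *might* look like. This is a hypothetical illustration written in a BDDL/PDDL-style Lisp syntax; the actual CBDDL grammar, keywords, predicates, and object types are not shown in this README, so treat every identifier here as an assumption and consult the VLA-Arena documentation for real examples.

```lisp
; Hypothetical sketch only — keywords, predicates, and types below are
; illustrative placeholders, not the actual CBDDL grammar.
(define (problem pick_mug_safely)
  (:domain tabletop_manipulation)
  ; Objects in the scene, with placeholder types.
  (:objects mug_1 - mug
            tray_1 - tray
            glass_1 - fragile_obstacle)
  ; Initial scene state.
  (:init (on mug_1 table)
         (on glass_1 table))
  ; Goal condition checked at episode end.
  (:goal (in mug_1 tray_1))
  ; A constraint block like this would capture the "Constrained" part of
  ; CBDDL: conditions that must hold throughout execution, not just at
  ; the end — e.g. never colliding with the fragile obstacle.
  (:constraints (always (not (collided gripper glass_1)))))
```

The design idea illustrated here is the separation of terminal goals from trajectory-level constraints, which is what allows safety and distractor scenarios to be expressed declaratively rather than hard-coded per task.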

## πŸ”— Resources


---

Built with ❀️ by the VLA-Arena Team