---
library_name: opentau
tags:
- robotics
- vla
- pi05
- robocasa
- manipulation
- flow-matching
- pytorch
base_model: williamyue/pi05_base
license: apache-2.0
datasets:
- robocasa/NavigateKitchen
repo_url: https://github.com/TensorAuto/OpenTau
---
# Robocasa_navigatekitchen
A **pi0.5 (π₀.₅)** Vision-Language-Action (VLA) model, finetuned on the **RoboCasa** robotic manipulation and navigation benchmark using the **OpenTau** training framework. The model follows natural-language instructions to perform navigation tasks in a simulated kitchen environment.
**For full documentation, evaluation results, and inference code, please visit the repository:**
<br>
👉 **[https://github.com/TensorAuto/OpenTau](https://github.com/TensorAuto/OpenTau)**
---
## Model Details
### Description
- **Model Type:** Vision-Language-Action (VLA) Model
- **Base Architecture:** π₀.₅ (pi0.5) by Physical Intelligence
- **Backbone:** PaliGemma-3B (VLM) + Gemma-300M (Action Expert)
- **Training Data:** RoboCasa benchmark (Navigate Kitchen)
- **Framework:** OpenTau
### Architecture
The pi0.5 architecture is a flow-matching-based policy designed for open-world generalization. It combines a vision-language model (VLM) for high-level semantic understanding with a smaller "action expert" model that generates continuous joint trajectories (10-step action chunks) via flow matching.
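To make the flow-matching idea concrete, the following is a minimal, illustrative sketch of how an action chunk could be sampled by integrating a velocity field from noise toward an action trajectory. It is not the OpenTau or pi0.5 implementation: `toy_velocity` is a hypothetical stand-in for the learned action expert, `ACTION_DIM`, the number of Euler steps, and the time convention (noise at `t = 0`, data at `t = 1`) are assumptions, and only the chunk length of 10 comes from the description above.

```python
import random

CHUNK_LEN = 10   # action-chunk length stated in this model card
ACTION_DIM = 3   # hypothetical; the real action dimension is not given here


def toy_velocity(x, t, target):
    """Hypothetical stand-in for the learned action expert.

    For a straight-line path from noise (t=0) to data (t=1), the velocity
    that reaches `target` at t=1 from the current point x is
    (target - x) / (1 - t).
    """
    return [[(g - a) / (1.0 - t) for a, g in zip(row, goal)]
            for row, goal in zip(x, target)]


def sample_chunk(velocity_fn, target, num_steps=10, seed=0):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with forward Euler."""
    rng = random.Random(seed)
    # Start from Gaussian noise with the shape of one action chunk.
    x = [[rng.gauss(0.0, 1.0) for _ in range(ACTION_DIM)]
         for _ in range(CHUNK_LEN)]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so 1 - t is never zero
        v = velocity_fn(x, t, target)
        x = [[a + dt * b for a, b in zip(row, vrow)]
             for row, vrow in zip(x, v)]
    return x


# Toy "ground-truth" chunk: step i holds the value 0.1 * i in every dimension.
target = [[0.1 * i] * ACTION_DIM for i in range(CHUNK_LEN)]
chunk = sample_chunk(toy_velocity, target)
```

In a real policy the velocity would be predicted by a network conditioned on images, the language instruction, and robot state, rather than computed from a known target as in this toy example.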
---
## Training and Evaluation
### Dataset
This model was finetuned on the **RoboCasa** benchmark dataset. The RoboCasa suite consists of human-teleoperated and MimicGen-generated demonstrations for manipulation and navigation. This checkpoint covers:
- **Navigate Kitchen** (Atomic)
### Results
Trained on 100 human demonstrations, the model achieves a **97%** success rate on the Navigate Kitchen task.
For detailed usage instructions, success rates, baseline comparisons, and evaluation protocols, please refer to the [OpenTau GitHub Repository](https://github.com/TensorAuto/OpenTau).