

	---
	library_name: opentau
	tags:
	- robotics
	- vla
	- pi05
	- robocasa
	- manipulation
	- flow-matching
	- pytorch
	base_model: williamyue/pi05_base
	license: apache-2.0
	datasets:
	- robocasa/NavigateKitchen
	repo_url: https://github.com/TensorAuto/OpenTau
	---

	# Robocasa_navigatekitchen

	A π₀.₅ (pi0.5) Vision-Language-Action (VLA) model, fine-tuned on the RoboCasa robotic manipulation and navigation benchmark using the OpenTau training framework. The model follows natural-language instructions to perform navigation tasks in a simulated kitchen environment.

	For full documentation, evaluation results, and inference code, please visit the repository:
	<br>
	👉 [https://github.com/TensorAuto/OpenTau](https://github.com/TensorAuto/OpenTau)

	---

	## Model Details

	### Description
	- Model Type: Vision-Language-Action (VLA) Model
	- Base Architecture: π₀.₅ (pi0.5) by Physical Intelligence
	- Backbone: PaliGemma-3B (VLM) + Gemma-300M (Action Expert)
	- Training Data: RoboCasa benchmark (Navigate Kitchen)
	- Framework: OpenTau

	### Architecture
	The pi0.5 architecture is a flow-matching policy designed for open-world generalization. It pairs a Vision-Language Model (VLM), which handles high-level semantic understanding, with a smaller "action expert" that generates continuous joint trajectories as 10-step action chunks via flow matching.
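
To make the flow-matching step concrete, here is a minimal, dependency-free sketch of how an action chunk can be sampled by integrating a velocity field from noise (t = 0) to actions (t = 1) with Euler steps. This is not the OpenTau or pi0.5 implementation: `toy_velocity`, `sample_action_chunk`, `ACTION_DIM`, and `NUM_EULER_STEPS` are hypothetical names and values chosen for illustration, and the learned action expert is replaced by a closed-form velocity field that points toward a known target chunk. Only the chunk length of 10 comes from this model card.

```python
import random

CHUNK_LEN = 10        # from the model card: 10-step action chunks
ACTION_DIM = 3        # hypothetical action dimension, for illustration only
NUM_EULER_STEPS = 10  # hypothetical number of integration steps


def toy_velocity(x, t, target):
    """Stand-in for the learned action expert (assumption, not the real model).

    For the straight-line path x_t = (1 - t) * noise + t * target, the velocity
    that carries x_t to the target at t = 1 is (target - x_t) / (1 - t).
    """
    return [
        [(g - a) / (1.0 - t) for a, g in zip(row_x, row_g)]
        for row_x, row_g in zip(x, target)
    ]


def sample_action_chunk(target, num_steps=NUM_EULER_STEPS, seed=0):
    """Integrate from Gaussian noise (t = 0) to an action chunk (t = 1)."""
    rng = random.Random(seed)
    # Start from pure noise with the shape of one action chunk.
    x = [[rng.gauss(0.0, 1.0) for _ in range(ACTION_DIM)] for _ in range(CHUNK_LEN)]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = toy_velocity(x, t, target)
        # Euler update: x <- x + dt * v(x, t)
        x = [[a + dt * b for a, b in zip(row_x, row_v)] for row_x, row_v in zip(x, v)]
    return x


# Hypothetical target chunk: each action dimension ramps linearly over the chunk.
target_chunk = [
    [(i / (CHUNK_LEN - 1)) * (j + 1) for j in range(ACTION_DIM)]
    for i in range(CHUNK_LEN)
]
chunk = sample_action_chunk(target_chunk)
```

In a real policy the velocity would come from the action expert conditioned on the VLM's image and language features, so the target is never known in advance; the sketch only shows the integration loop that turns noise into a chunk of continuous actions.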

	---

	## Training and Evaluation

	### Dataset
	This model was fine-tuned on the RoboCasa benchmark dataset. The RoboCasa suite consists of human-teleoperated and MimicGen-generated demonstrations for manipulation and navigation; this model covers:
	- Navigate Kitchen (Atomic)

	### Results
	Trained on 100 human demonstrations, the model achieves a 97% success rate on the Navigate Kitchen task.
	For detailed usage instructions, success rates, baseline comparisons, and evaluation protocols, please refer to the [OpenTau GitHub Repository](https://github.com/TensorAuto/OpenTau).
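
When reading a reported success rate, it can help to see how much uncertainty a finite number of evaluation rollouts leaves. The sketch below computes a success rate and a Wilson score interval from per-episode outcomes. The episode count used here is a hypothetical placeholder, since this card does not state how many evaluation episodes were run, and `wilson_interval` is an illustrative helper, not part of OpenTau.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2.0 * trials)) / denom
    half_width = (
        z * math.sqrt(p * (1.0 - p) / trials + z * z / (4.0 * trials * trials)) / denom
    )
    return center - half_width, center + half_width


# Hypothetical evaluation: 97 successful rollouts out of 100 episodes.
# The episode count is an assumption for illustration, not a reported figure.
outcomes = [True] * 97 + [False] * 3
success_rate = sum(outcomes) / len(outcomes)
low, high = wilson_interval(sum(outcomes), len(outcomes))
print(f"success rate: {success_rate:.2%}, ~95% interval: [{low:.3f}, {high:.3f}]")
```

With fewer episodes the interval widens, which is worth keeping in mind when comparing against the baselines in the repository.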