william-yue

metadata

library_name: opentau
tags:
  - robotics
  - vla
  - pi05
  - libero
  - manipulation
  - flow-matching
  - pytorch
base_model: williamyue/pi05_base
license: apache-2.0
datasets:
  - physical-intelligence/libero
repo_url: https://github.com/TensorAuto/OpenTau

tPi0.5-libero

tPi0.5-libero is a π₀.₅ (pi0.5) Vision-Language-Action (VLA) model fine-tuned on the LIBERO robotic manipulation benchmark with the OpenTau training framework. It follows natural-language instructions to perform manipulation tasks in a simulated tabletop environment.

For full documentation, evaluation results, and inference code, please visit the repository:
👉 https://github.com/TensorAuto/OpenTau

Model Details

Description

Model Type: Vision-Language-Action (VLA) Model
Base Architecture: π₀.₅ (pi0.5) by Physical Intelligence
Backbone: PaliGemma-3B (VLM) + Gemma-300M (Action Expert)
Training Data: LIBERO (Lifelong Robot Learning) Benchmark
Framework: OpenTau

Architecture

The pi0.5 architecture uses a flow-matching policy designed for open-world generalization. It pairs a Vision-Language Model (VLM), which provides high-level semantic understanding, with a smaller "action expert" that generates continuous joint trajectories as 10-step action chunks via flow matching.
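To make the flow-matching step concrete, here is a minimal sketch of how an action chunk can be sampled by integrating a velocity field from Gaussian noise toward an action trajectory. It is not the OpenTau or pi0.5 implementation: `toy_velocity` is a hypothetical stand-in for the action expert, `ACTION_DIM = 7`, the linear noise-to-data path, and the Euler integrator with 10 steps are all assumptions chosen for illustration. Only the 10-step chunk length comes from the description above.

```python
import random

HORIZON = 10     # action-chunk length stated in this model card
ACTION_DIM = 7   # hypothetical action dimension, chosen for illustration


def toy_velocity(x_t, t, target):
    """Hypothetical stand-in for the action expert.

    For a straight-line path from noise (t = 0) to data (t = 1),
    (target - x_t) / (1 - t) is the velocity that carries x_t to
    `target` by t = 1. A real model would predict the velocity from
    images, language, and robot state instead of a known target.
    """
    return [
        [(a - x) / (1.0 - t) for x, a in zip(row_x, row_a)]
        for row_x, row_a in zip(x_t, target)
    ]


def sample_action_chunk(velocity_fn, target, num_steps=10, seed=0):
    """Euler-integrate the velocity field from t = 0 (noise) to t = 1."""
    rng = random.Random(seed)
    # Start from Gaussian noise with the shape of one action chunk.
    x = [[rng.gauss(0.0, 1.0) for _ in range(ACTION_DIM)] for _ in range(HORIZON)]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = velocity_fn(x, t, target)
        x = [[xi + dt * vi for xi, vi in zip(rx, rv)] for rx, rv in zip(x, v)]
    return x


# Made-up target trajectory, used only to exercise the sampler.
target = [[0.1 * step] * ACTION_DIM for step in range(HORIZON)]
chunk = sample_action_chunk(toy_velocity, target)
```

With the idealized velocity field above, the integrated chunk lands on `target` up to floating-point error. A learned action expert only approximates this field, so its samples vary with the initial noise.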

Training and Evaluation

Dataset

This model was fine-tuned on the LIBERO benchmark dataset. The LIBERO suite consists of human-teleoperated demonstrations for tabletop manipulation and covers four task suites:

Spatial Generalization (libero_spatial)
Object Generalization (libero_object)
Goal Generalization (libero_goal)
Long-Horizon Tasks (libero_10)
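Evaluation on LIBERO is usually reported as a success rate per suite. The sketch below shows one way to tally such rates from per-episode outcomes. The helper name `success_rates` and the `(suite, success)` input format are assumptions, not part of OpenTau, and the sample outcomes are placeholders rather than measured results for this model.

```python
from collections import defaultdict

# Suite identifiers as listed in this model card.
LIBERO_SUITES = ("libero_spatial", "libero_object", "libero_goal", "libero_10")


def success_rates(outcomes):
    """Return {suite: fraction of successful episodes} for suites with episodes.

    `outcomes` is an iterable of (suite_name, success_bool) pairs
    (a hypothetical format chosen for this sketch).
    """
    counts = defaultdict(lambda: [0, 0])  # suite -> [successes, episodes]
    for suite, success in outcomes:
        if suite not in LIBERO_SUITES:
            raise ValueError(f"unknown suite: {suite}")
        counts[suite][0] += int(success)
        counts[suite][1] += 1
    return {suite: s / n for suite, (s, n) in counts.items()}


# Placeholder outcomes for illustration only; NOT results for this model.
demo = [("libero_spatial", True), ("libero_spatial", False), ("libero_10", True)]
rates = success_rates(demo)
```

Rejecting unknown suite names early keeps a typo from silently creating a fifth category in the report.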

Results

Success rates, baseline comparisons, evaluation protocols, and detailed usage instructions are documented in the OpenTau GitHub repository: https://github.com/TensorAuto/OpenTau