---
library_name: opentau
tags:
- robotics
- vla
- pi0
- libero
- reinforcement-learning
- manipulation
- flow-matching
- pytorch
license: apache-2.0
datasets:
- physical-intelligence/libero
repo_url: https://github.com/TensorAuto/OpenTau
---

# moka_pot_RECAP_R0

A **pi0 (π₀) RECAP** Vision-Language-Action (VLA) model, finetuned on the **LIBERO** robotic manipulation benchmark using the **OpenTau** training framework. This model is designed to follow natural-language instructions to perform manipulation tasks in a simulated tabletop environment.
Achieves **~89% success rate** measured over **320 episodes**.

**For full documentation, evaluation results, and inference code, please visit the repository:**
<br>
👉 **[https://github.com/TensorAuto/OpenTau](https://github.com/TensorAuto/OpenTau)**

---

## Model Details

### Description
- **Model Type:** Vision-Language-Action (VLA) Model
- **Base Architecture:** π₀ (pi0) by Physical Intelligence
- **Backbone:** PaliGemma-3B (VLM) + Gemma-300M (action expert) + RL advantage indicator
- **Training Data:** Moka Pot task from the LIBERO (Lifelong Robot Learning) benchmark
- **Framework:** OpenTau

### Architecture
The **pi0 RECAP** architecture combines a flow-matching policy with reinforcement learning for open-world generalization. A Vision-Language Model (VLM) provides high-level semantic understanding, while a smaller "action expert" generates continuous joint trajectories (10-step action chunks) via flow matching. Reinforcement learning lets the policy learn from both successful and failed episodes.
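
Flow-matching action generation can be sketched as a short Euler integration: start from Gaussian noise and repeatedly step along a learned velocity field until it becomes an action chunk. The sketch below is illustrative only: the closed-form `velocity_field` is a hypothetical stand-in for the trained action expert (which in pi0 is conditioned on VLM features), and the dimensions and step count are assumptions, not OpenTau's actual values.

```python
import numpy as np

# Hypothetical stand-in for the action expert's learned velocity field v(x, t).
# Here a closed-form field flows noise toward a fixed target chunk; in the real
# model, a network conditioned on images and language predicts this velocity.
TARGET = np.linspace(-0.5, 0.5, num=7)  # illustrative 7-dim action

def velocity_field(x: np.ndarray, t: float) -> np.ndarray:
    # For a linear path x_t = (1 - t) * noise + t * target, the velocity
    # estimate (target - x) / (1 - t) points toward the target; clamp the
    # denominator to stay finite near t = 1.
    return (TARGET - x) / max(1.0 - t, 1e-3)

def sample_action_chunk(horizon: int = 10, action_dim: int = 7,
                        steps: int = 10, rng=None) -> np.ndarray:
    """Integrate the velocity field from t=0 (noise) to t=1 (action chunk)."""
    if rng is None:
        rng = np.random.default_rng(0)
    x = rng.standard_normal((horizon, action_dim))  # start from Gaussian noise
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        x = x + dt * velocity_field(x, t)  # Euler step
    return x

chunk = sample_action_chunk()
print(chunk.shape)  # (10, 7): a 10-step chunk of 7-dim actions
```

At inference time the first few actions of each chunk would be executed before re-planning; that receding-horizon detail is omitted here.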

---

## Training and Evaluation
The advantage indicator (Iₜ) was set to True for only 10% of datapoints.
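
One simple way to construct such a binary indicator, sketched below, is to threshold episode returns at a quantile so that only the desired fraction of datapoints is marked True. The quantile rule is an assumption for illustration; the criterion actually used for this model is documented in the OpenTau repository.

```python
import numpy as np

def advantage_indicator(returns: np.ndarray, fraction: float = 0.10) -> np.ndarray:
    """Mark roughly the top `fraction` of datapoints by return.

    Assumption: the indicator is a simple quantile threshold. During training
    it would be fed to the policy as an extra conditioning input.
    """
    threshold = np.quantile(returns, 1.0 - fraction)
    return returns >= threshold

rng = np.random.default_rng(0)
returns = rng.uniform(0.0, 1.0, size=1000)  # toy per-datapoint returns
indicator = advantage_indicator(returns)
print(indicator.mean())  # ~0.10: about 10% of datapoints are True
```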

### Dataset
This model was finetuned on the **Moka Pot task from the LIBERO-10** benchmark plus autonomous rollouts: around 29 expert teleoperated episodes and 212 autonomous rollouts collected under the moka_pot_libero_sft policy.
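
The composition of this mixture can be summarized as follows; only the episode counts come from this card, while the per-source sampling weights are hypothetical placeholders:

```python
# Finetuning mixture described above. Episode counts are from the model card;
# the "weight" values are hypothetical, not OpenTau's actual sampling weights.
sources = {
    "expert_teleop": {"episodes": 29, "weight": 0.5},
    "autonomous_rollouts": {"episodes": 212, "weight": 0.5},
}

total = sum(s["episodes"] for s in sources.values())
print(total)  # 241 episodes in the mixture
for name, s in sources.items():
    # Per-episode sampling probability implied by the hypothetical weights.
    p = s["weight"] / s["episodes"]
    print(f"{name}: {s['episodes']} episodes, per-episode weight {p:.4f}")
```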
48
+
49
+ ### Results
50
+ For detailed usage instructions, success rates, baseline comparisons, and evaluation protocols, please refer to the [OpenTau GitHub Repository](https://github.com/TensorAuto/OpenTau).
51
+ Achieves **~89% success rate** measured over **320 episodes**.