---
license: apache-2.0
language:
- en
datasets:
- nvidia/PhysicalAI-Robotics-GR00T-Teleop-Sim
---

# DIAL Checkpoints

<p align="center">
<a href="https://xpeng-robotics.github.io/dial/"><b>Project Page</b></a> |
<a href="https://xpeng-robotics.github.io/dial/DIAL.pdf"><b>Paper</b></a> |
<a href="https://github.com/xpeng-robotics/DIAL"><b>Code</b></a>
</p>

Model weights for **DIAL** (**D**ecoupling **I**ntent and **A**ction via **L**atent World Modeling), an end-to-end Vision-Language-Action (VLA) framework built on [NVIDIA Isaac GR00T N1.5](https://github.com/NVIDIA/Isaac-GR00T/tree/n1.5-release) with a [Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct) backbone.

## Available Checkpoints

| Checkpoint | Training Data | Steps | Description |
|---|---|---|---|
| `DIAL-3B-fewshot` | EgoDex human data + 10% GR1 simulation data | 20K per stage × 3 stages | Co-trained with heterogeneous human demonstrations |
| `DIAL-3B-fulldata` | All GR1 simulation data (~24,000 demos) | 40K per stage × 2 stages | Trained on full teleoperation trajectories in simulation |

For installation, training, and evaluation instructions, please refer to the [GitHub repository](https://github.com/xpeng-robotics/DIAL).
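
As a convenience, a checkpoint can be fetched programmatically with `huggingface_hub`. This is a minimal sketch only: the repo id `xpeng-robotics/DIAL-checkpoints` and the one-folder-per-checkpoint layout are assumptions, not taken from this card; adjust them to the actual repository.

```python
def checkpoint_patterns(name: str) -> list[str]:
    """Glob patterns selecting the files of one checkpoint.

    Assumes each checkpoint lives in its own top-level folder
    (e.g. ``DIAL-3B-fulldata/``) inside the repo.
    """
    return [f"{name}/*"]


def download_checkpoint(
    name: str,
    repo_id: str = "xpeng-robotics/DIAL-checkpoints",  # assumed repo id
) -> str:
    """Download one checkpoint and return the local snapshot path."""
    # Lazy import so the helper above works without huggingface_hub installed.
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    return snapshot_download(repo_id=repo_id, allow_patterns=checkpoint_patterns(name))
```

Usage: `download_checkpoint("DIAL-3B-fulldata")` downloads only that checkpoint's files and returns the local directory to pass to the GR00T loading code from the GitHub repository.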