| # Sim2Sim2Sim Checkpoints |
|
|
| Official pretrained checkpoints for [Dynamics Distillation for Efficient and Transferable Control Learning](https://arxiv.org/abs/2605.01516). |
|
|
| ## Directory Structure |
|
|
| ``` |
| dynamics_model/ # Stage 1: Learned dynamics models |
| ├── beamng_bicycle_*/ # Bicycle model variants |
| ├── beamng_ddm_*/ # Deep Dynamics Model (DDM) variants |
| ├── beamng_trans_*/ # Transformer-based models |
| ├── beamng_dytr_*/ # DYTR (Residual Learning) models |
| └── beamng_manual_PID_*/ # Manual PID control baselines |
| |
| control_policies/ # Stage 2 & 3: Trained RL policies |
| ├── PPO____*_bicycle/ # Policies trained on bicycle dynamics |
| ├── PPO____*_ddm/ # Policies trained on DDM dynamics |
| ├── PPO____*_trans/ # Policies trained on Transformer dynamics |
| ├── PPO____*_dytr_ddm/ # Policies trained on DYTR dynamics |
| └── PPO____*_oracle/ # Oracle policies (full-state observation) |
| ``` |
|
|
| ## Usage |
|
|
| ### Download a Specific Model |
|
|
| ```python |
| from huggingface_hub import hf_hub_download |
| |
| # Download a dynamics model; hf_hub_download returns the local path of the cached file |
| dynamics_model_path = hf_hub_download( |
|     repo_id="alfredgu001324/Sim2Sim2Sim", |
|     filename="dynamics_model/beamng_trans_10/best_model.pt", |
| ) |
| |
| # Download a trained policy checkpoint (also returned as a local file path) |
| policy_path = hf_hub_download( |
|     repo_id="alfredgu001324/Sim2Sim2Sim", |
|     filename="control_policies/PPO____R_80000__11_25_10_41_44_840_trans/model_PPO____R_80000__11_25_10_41_44_840_001280.pt", |
| ) |
| ``` |
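Once a file is downloaded, it can help to inspect it before wiring it into training or evaluation code. The sketch below assumes the `.pt` files are standard PyTorch checkpoints readable with `torch.load`. The repository does not document their internal layout, so nothing about specific keys is assumed, and the helper names `load_checkpoint` and `summarize_checkpoint` are hypothetical.

```python
from typing import Any


def load_checkpoint(path: str) -> Any:
    """Load a .pt file onto the CPU (assumes a standard PyTorch checkpoint)."""
    import torch  # imported lazily so the summary helper works without PyTorch

    # map_location="cpu" avoids requiring a GPU just to inspect the file.
    return torch.load(path, map_location="cpu")


def summarize_checkpoint(obj: Any) -> str:
    """Describe the top-level structure of a loaded checkpoint object."""
    if isinstance(obj, dict):
        keys = ", ".join(sorted(str(k) for k in obj))
        return f"dict with {len(obj)} top-level keys: {keys}"
    return f"object of type {type(obj).__name__}"
```

For example, `print(summarize_checkpoint(load_checkpoint(path)))`, where `path` is a value returned by `hf_hub_download`. If loading fails on a recent PyTorch release, the file may contain pickled Python objects rather than plain tensors; passing `weights_only=False` to `torch.load` is one option, but only for files you trust.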
|
|
| ### Batch Download |
|
|
| ```python |
| from huggingface_hub import snapshot_download |
| |
| # Download all checkpoints; the return value is the local snapshot folder inside cache_dir |
| snapshot_path = snapshot_download( |
|     repo_id="alfredgu001324/Sim2Sim2Sim", |
|     cache_dir="./ckpts", |
|     repo_type="model", |
| ) |
| ``` |
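To fetch only part of the repository, `snapshot_download` accepts an `allow_patterns` argument with glob-style patterns. The following is a sketch rather than an official recipe: the helper names `patterns_for` and `download_subset` are hypothetical, and the patterns are assumptions based on the directory layout listed above. The local `fnmatch` check only approximates which repository paths a pattern would select.

```python
from fnmatch import fnmatch
from typing import List

REPO_ID = "alfredgu001324/Sim2Sim2Sim"


def patterns_for(subdir: str, name_glob: str = "*") -> List[str]:
    """Build a glob pattern selecting matching folders under one top-level directory."""
    return [f"{subdir}/{name_glob}/*"]


def download_subset(patterns: List[str], cache_dir: str = "./ckpts") -> str:
    """Download only files whose repo paths match one of the patterns."""
    from huggingface_hub import snapshot_download  # lazy import; needs network when called

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="model",
        cache_dir=cache_dir,
        allow_patterns=patterns,
    )


# Local check of what a pattern would select (no network access needed).
transformer_only = patterns_for("dynamics_model", "beamng_trans_*")
example_path = "dynamics_model/beamng_trans_10/best_model.pt"
matches = any(fnmatch(example_path, p) for p in transformer_only)
```

Calling `download_subset(transformer_only)` would then fetch just the Transformer dynamics models instead of the full repository.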
|
|
| ## Model Details |
|
|
| ### Dynamics Models (Stage 1) |
|
|
| - **Bicycle**: Simple analytical model serving as a baseline |
| - **DDM**: Deep Dynamics Model - neural network trained on BeamNG data |
| - **Transformer**: Sequence-aware dynamics model using transformer architecture |
| - **DYTR**: Transformer-based residual-learning model, following "Residual Learning towards High-fidelity Vehicle Dynamics Modeling with Transformer" |
|
|
| ### Control Policies (Stage 2 & 3) |
|
|
| All policies are trained with PPO for 80,000 environment steps. The available variants are: |
| - **trans**: Trained on Transformer dynamics model |
| - **ddm**: Trained on DDM dynamics model |
| - **dytr_ddm**: Trained on DYTR-wrapped DDM dynamics model |
| - **bicycle**: Trained on bicycle model |
| - **oracle**: Full-state observation policies |
| - **FF**: Feed-forward policies |
| - **trans_cond**: Transformer with condition encoding for surface changes |
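Because each policy folder name ends with its variant, the variant can be recovered from a folder name when iterating over downloaded checkpoints. This is a minimal sketch, assuming folder names follow the `PPO____*_<variant>` pattern from the directory listing. `policy_variant` is a hypothetical helper, and whether `FF` and `trans_cond` folders use exactly these suffixes is an assumption.

```python
from typing import Optional

# Longest suffixes first so "dytr_ddm" is not mistaken for "ddm".
VARIANTS = sorted(
    ["trans", "ddm", "dytr_ddm", "bicycle", "oracle", "FF", "trans_cond"],
    key=len,
    reverse=True,
)


def policy_variant(folder_name: str) -> Optional[str]:
    """Return the variant suffix of a policy folder name, or None if unrecognized."""
    name = folder_name.rstrip("/")
    for variant in VARIANTS:
        if name.endswith("_" + variant):
            return variant
    return None


example = policy_variant("PPO____R_80000__11_25_10_41_44_840_trans")
```

Here `example` is `"trans"`, since the folder name ends in `_trans`. Checking the longer suffixes first keeps `..._dytr_ddm` from being classified as plain `ddm`.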
|
|
| ## Citation |
|
|
| If you use these checkpoints, please cite the paper: |
|
|
| ```bibtex |
| @article{GuChittaEtAl2026, |
| author = {Gu, Xunjiang and Chitta, Kashyap and Golchoubian, Mahsa and Suplin, Vladimir and Gilitschenski, Igor}, |
| title = {Dynamics Distillation for Efficient and Transferable Control Learning}, |
| journal = {arXiv preprint arXiv:2605.01516}, |
| year = {2026} |
| } |
| ``` |
|
|
| ## License |
|
|
| Apache 2.0 |
|
|