wsagi/HumanoidBench-DrQ

@@ -18,6 +18,10 @@ license: mit
 _Self-trained DR.Q checkpoints that **beat** the public dmux/DR.Q baseline on HumanoidBench locomotion tasks._
+> 🛠 **Training source**: <https://github.com/vitorcen/humanoid-training>
+> _Full training scripts, patches, eval harness, and analysis docs are in the companion GitHub repo._
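 DR.Q is an off-policy RL algorithm that combines TD3 with model-based representation learning (encoder + policy, ~13 MB at inference).
 This repository collects checkpoints on [HumanoidBench](https://github.com/carlosferrazza/humanoid-bench) that were **self-trained from scratch until they solved the task**.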