cagataydev committed on Feb 24

Commit

94c56bc

1 Parent(s): 8c0f9bf

SAC G1 balancing policy - 1.91M steps, learning to balance

Files changed (24)

.gitattributes +1 -0
README.md +102 -0
best/best_model.zip +3 -0
checkpoints/sac_g1_1000000_steps.zip +3 -0
checkpoints/sac_g1_100000_steps.zip +3 -0
checkpoints/sac_g1_1100000_steps.zip +3 -0
checkpoints/sac_g1_1200000_steps.zip +3 -0
checkpoints/sac_g1_1300000_steps.zip +3 -0
checkpoints/sac_g1_1400000_steps.zip +3 -0
checkpoints/sac_g1_1500000_steps.zip +3 -0
checkpoints/sac_g1_1600000_steps.zip +3 -0
checkpoints/sac_g1_1700000_steps.zip +3 -0
checkpoints/sac_g1_1800000_steps.zip +3 -0
checkpoints/sac_g1_1900000_steps.zip +3 -0
checkpoints/sac_g1_200000_steps.zip +3 -0
checkpoints/sac_g1_300000_steps.zip +3 -0
checkpoints/sac_g1_400000_steps.zip +3 -0
checkpoints/sac_g1_500000_steps.zip +3 -0
checkpoints/sac_g1_600000_steps.zip +3 -0
checkpoints/sac_g1_700000_steps.zip +3 -0
checkpoints/sac_g1_800000_steps.zip +3 -0
checkpoints/sac_g1_900000_steps.zip +3 -0
g1_balancing.mp4 +3 -0
logs/evaluations.npz +3 -0
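
The checkpoint names above are listed in lexicographic order, so `sac_g1_1000000_steps.zip` appears before `sac_g1_100000_steps.zip`. When choosing the latest checkpoint programmatically, it is safer to sort by the numeric step count. A minimal sketch, where `checkpoint_step` is a hypothetical helper and the name list is a subset of the files in this commit:

```python
import re


def checkpoint_step(name: str) -> int:
    """Extract the step count from a name like 'sac_g1_1100000_steps.zip'."""
    match = re.search(r"_(\d+)_steps\.zip$", name)
    if match is None:
        raise ValueError(f"not a checkpoint name: {name!r}")
    return int(match.group(1))


# A few of the names from the file list, in the order the page shows them.
names = [
    "checkpoints/sac_g1_1000000_steps.zip",
    "checkpoints/sac_g1_100000_steps.zip",
    "checkpoints/sac_g1_1900000_steps.zip",
    "checkpoints/sac_g1_200000_steps.zip",
    "checkpoints/sac_g1_900000_steps.zip",
]

# Sort by the integer step count rather than by the raw string.
ordered = sorted(names, key=checkpoint_step)
latest = ordered[-1]
print(latest)
```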

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+g1_balancing.mp4 filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,102 @@

+---
+tags:
+  - reinforcement-learning
+  - robotics
+  - mujoco
+  - locomotion
+  - unitree
+  - g1
+  - humanoid
+  - sac
+  - stable-baselines3
+  - strands-robots
+library_name: stable-baselines3
+model-index:
+  - name: SAC-Unitree-G1-MuJoCo
+    results:
+      - task:
+          type: reinforcement-learning
+          name: Humanoid Locomotion
+        dataset:
+          type: custom
+          name: MuJoCo LocomotionEnv
+        metrics:
+          - type: mean_reward
+            value: 530
+            name: Best Mean Reward
+          - type: mean_distance
+            value: 2.65
+            name: Mean Forward Distance (m)
+---
+# SAC Unitree G1 — MuJoCo Locomotion Policy
+A **Soft Actor-Critic (SAC)** policy trained for the Unitree G1 humanoid in MuJoCo simulation. It is currently **learning to balance**: it stays upright for about 4 seconds while stumbling forward.
+Trained entirely on a MacBook (CPU, no GPU, no Isaac Gym) using [strands-robots](https://github.com/cagataycali/strands-gtc-nvidia).
+## Results
+| Metric | Value |
+|--------|-------|
+| Algorithm | SAC (Soft Actor-Critic) |
+| Training steps | 1.91M |
+| Training time | ~60 min (MacBook M-series, CPU) |
+| Parallel envs | 8 |
+| Network | MLP [256, 256] |
+| Best reward | **530** |
+| Mean distance | **2.65 m** |
+| Episode length | ~200 of 1,000 max steps (~4 seconds upright) |
+| Status | Balancing + stumbling forward |
+## Demo Video
+See `g1_balancing.mp4` — the G1 attempting to balance and walk in MuJoCo.
+## Why It's Hard
+The G1 has **29 DOF**, compared with 12 for the Unitree Go2 quadruped. Bipedal balance is fundamentally harder: the robot must coordinate its hips, knees, ankles, and torso simultaneously while balancing over a small support polygon.
+With more training (~5-10M steps, roughly 3-5 hours at the throughput reported above), it should learn to walk.
+## Usage
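+```python
+from stable_baselines3 import SAC
+
+# `env` is not defined in this README: create the Gymnasium-style G1 environment
+# used for training (87-dim observations, 29-dim actions) before running this.
+model = SAC.load("best/best_model")
+obs, _ = env.reset()
+for _ in range(1000):
+    action, _ = model.predict(obs, deterministic=True)
+    obs, reward, terminated, truncated, info = env.step(action)
+    if terminated or truncated:  # start a new episode after a fall or timeout
+        obs, _ = env.reset()
+```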
+## Reward Function
+```
+reward = forward_vel × 5.0       # primary: move forward
+       + alive_bonus × 1.0       # stay upright
+       + upright_reward × 0.3    # orientation bonus
+       - ctrl_cost × 0.001       # minimize energy
+       - lateral_penalty × 0.3   # don't drift sideways
+       - smoothness × 0.0001     # discourage jerky motion
+```
+## Files
+- `best/best_model.zip` — Best checkpoint
+- `checkpoints/` — All 100K-step checkpoints
+- `logs/evaluations.npz` — Evaluation metrics
+- `g1_balancing.mp4` — Demo video
+## Environment
+- **Simulator**: MuJoCo (via its Python bindings)
+- **Robot**: Unitree G1 (29 DOF) from MuJoCo Menagerie
+- **Observation**: joint positions, velocities, torso orientation, height (87-dim)
+- **Action**: joint torques (29-dim, continuous)
+## License
+Apache-2.0
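
The reward function in the README above is a weighted sum of six terms. The sketch below spells out that sum using the weights as listed. How each term is measured (units, clipping, whether `alive_bonus` is a constant 1 per step) is not stated in the README, so the inputs are treated as already-computed scalars. The names `RewardWeights` and `compute_reward` are hypothetical and not taken from the training code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Weights copied from the README's reward listing."""
    forward_vel: float = 5.0
    alive_bonus: float = 1.0
    upright_reward: float = 0.3
    ctrl_cost: float = 0.001
    lateral_penalty: float = 0.3
    smoothness: float = 0.0001


def compute_reward(forward_vel, alive_bonus, upright_reward,
                   ctrl_cost, lateral_penalty, smoothness,
                   w=RewardWeights()):
    """Combine pre-computed per-step terms into one scalar reward.

    Assumption: every argument is a non-negative scalar for the current
    step, except forward_vel, which may be negative when moving backwards.
    """
    return (w.forward_vel * forward_vel            # primary: move forward
            + w.alive_bonus * alive_bonus          # stay upright
            + w.upright_reward * upright_reward    # orientation bonus
            - w.ctrl_cost * ctrl_cost              # minimize energy
            - w.lateral_penalty * lateral_penalty  # don't drift sideways
            - w.smoothness * smoothness)           # discourage jerky motion


# Illustrative inputs only: standing still and fully upright, no penalties.
standing = compute_reward(0.0, 1.0, 1.0, 0.0, 0.0, 0.0)
print(standing)
```

With these weights the forward-velocity term dominates as soon as the robot moves: one unit of forward velocity contributes 5.0, while staying alive and fully upright together contribute at most 1.3 per step under the assumptions above.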

best/best_model.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9cb15b7292a646e6006013c0f5a4690d1ad5027083a77891607b123e56946d95
+size 4149867
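
Each binary file in this commit appears in the diff as a three-line Git LFS pointer (`version`, `oid`, `size`) instead of its contents. The `oid` is the SHA-256 of the real file and `size` is its length in bytes, so a downloaded artifact can be checked against its pointer. A minimal sketch using only the Python standard library, where `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names:

```python
import hashlib
from pathlib import Path

# Pointer for best/best_model.zip, copied from the diff without the '+' prefix.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:9cb15b7292a646e6006013c0f5a4690d1ad5027083a77891607b123e56946d95
size 4149867
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and hash."""
    data = Path(path).read_bytes()
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

After fetching the real file (for example with `git lfs pull`), `matches_pointer("best/best_model.zip", pointer)` should return `True` for an intact download.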

checkpoints/sac_g1_1000000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8ab5b8bfceab619c0db98044ad448cce0c0989b0e7e762622d18c643a03dbb6d
+size 4149858

checkpoints/sac_g1_100000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9e83947df6948d3ba27c73adda5085f211ed7860016b6a25874ae29db20b4dc0
+size 4149855

checkpoints/sac_g1_1100000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d3f1de8747ddfea37f610a9c0ee69cb16c123a365a1fe75a85d0a4f35d0cc315
+size 4149858

checkpoints/sac_g1_1200000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0a88d00ed69afb4ee86e351645de936b8355ce92720e51a9c5782c34fa8fbed4
+size 4149858

checkpoints/sac_g1_1300000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:be930413a97e485059f7aa7450f7ee0cd540c51c14447139d2bd90e60931e993
+size 4149858

checkpoints/sac_g1_1400000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2511b3164819d1fd317a31628911cfbbd905f41681c7644631346612abce2daa
+size 4149858

checkpoints/sac_g1_1500000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d29e8b66d4fea3c6387bbf269e770e508991888b1f998384c7ac3dcd78adac01
+size 4149858

checkpoints/sac_g1_1600000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2cef573f5af407e987cf156b18334bc910e09c7b1e9ffd05f44757ae2874dd2e
+size 4149858

checkpoints/sac_g1_1700000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a4b6d1081f3fc8d7e0d1211baf3046a0608b0286f8927330cbff2b57d4f91bf6
+size 4149858

checkpoints/sac_g1_1800000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:036be3dfd0bebdd11877c3317efbfc7a642e78f65c135d430d9e7232d32c7e42
+size 4149858

checkpoints/sac_g1_1900000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a51f5d225483aca91b3756396a1240db6223538eefe9ef5cf137390e48003927
+size 4149867

checkpoints/sac_g1_200000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:697011f06fc560071e62ad698367474c9826628b10edcef70f3c54d7d1340cac
+size 4149855

checkpoints/sac_g1_300000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4280e1c4d69f76147992f78ce7435563e45f48df9bb9ef42e9c340cf249b5750
+size 4149856

checkpoints/sac_g1_400000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0812dc1a31801935b29fe168a51ab6e4777011c94c2f3f9fcd36069bc92d86b1
+size 4149856

checkpoints/sac_g1_500000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1b4a8321660b6772888b08e7efba0cd814b4961d2ff1e609844e3487fa53c6f9
+size 4149856

checkpoints/sac_g1_600000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b0c0c81e54c14bff79f44c526101618abae0495f33108faff0f2b2f83bc37993
+size 4149865

checkpoints/sac_g1_700000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a2dad3e970cb109a6882d5699d9ec66e4a6bd37286750ec5628f03451cc10975
+size 4149856

checkpoints/sac_g1_800000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e8f750b460f93cbf527c4a2c837fe80dfb1b435f45fd4e2759b5b7833af6da5a
+size 4149856

checkpoints/sac_g1_900000_steps.zip ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8b1926c26473d97870cfe4551ddd9219dbfc28be55bc326dad4c4c7d482c5218
+size 4149857

g1_balancing.mp4 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f182d7a40f3f33391ad1e1d3a627d4f713ab90625afb0c6bf0caa36c27d59d80
+size 1532731

logs/evaluations.npz ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b70027a19be27c7a415624c0c1605547588dfeaa27feb1c1201b5c3ff40c3ac2
+size 17578
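
Once the real `logs/evaluations.npz` has been fetched from LFS, it can be inspected with NumPy. The sketch below assumes the file was written by Stable-Baselines3's `EvalCallback`, which stores the arrays `timesteps`, `results` (per-episode rewards, one row per evaluation), and `ep_lengths`. The commit does not confirm that this callback produced the file, and `summarize_evaluations` is a hypothetical helper name.

```python
import numpy as np


def summarize_evaluations(path):
    """Report the evaluation round with the highest mean episode reward.

    Assumes the EvalCallback layout: `timesteps` has shape (n_evals,),
    while `results` and `ep_lengths` have shape (n_evals, n_episodes).
    """
    data = np.load(path)
    timesteps = data["timesteps"]
    mean_reward = data["results"].mean(axis=1)
    mean_length = data["ep_lengths"].mean(axis=1)
    best = int(mean_reward.argmax())
    return {
        "best_timestep": int(timesteps[best]),
        "best_mean_reward": float(mean_reward[best]),
        "mean_ep_length_at_best": float(mean_length[best]),
    }
```

Calling `summarize_evaluations("logs/evaluations.npz")` on the fetched file would show which evaluation round gave the highest mean reward, which can then be compared with the figures in the README's results table.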