---
license: apache-2.0
tags:
  - robotics
  - navigation
  - reinforcement-learning
  - imitation-learning
  - dagger
  - humanoid
  - unitree-g1
  - isaaclab
  - sim-to-real
language:
  - en
pipeline_tag: reinforcement-learning
---

# G1 Navigation Policy (DAgger Distillation)

Vision-based navigation policies for the **Unitree G1 humanoid robot**, trained using teacher-student DAgger distillation.

## Model Description

This repository contains two PyTorch TorchScript models:

| Model | File | Input Dim | Output Dim | Description |
|-------|------|-----------|------------|-------------|
| **Student** | `student_policy.pt` | 82 | 3 | Deployable policy using depth rays from monocular depth estimation |
| **Teacher** | `teacher_policy.pt` | 34 | 3 | Privileged policy with ground-truth obstacle positions |

### Architecture

Both policies are MLPs with three hidden layers (256, 128, 64) and ELU activations:
- **Student**: [82] → 256 → 128 → 64 → [3]
- **Teacher**: [34] → 256 → 128 → 64 → [3]
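
As a quick sanity check on the layer sizes above, the sketch below counts the trainable parameters of each network in plain Python. It assumes fully connected layers with bias terms and no normalization layers, which the model card does not state; `mlp_param_count` is a hypothetical helper, not part of this repository.

```python
def mlp_param_count(layer_sizes):
    """Count weights plus biases of a fully connected MLP.

    Assumption: every layer is a dense layer with a bias vector.
    """
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Layer widths taken from the architecture list above.
student_sizes = [82, 256, 128, 64, 3]
teacher_sizes = [34, 256, 128, 64, 3]

student_params = mlp_param_count(student_sizes)
teacher_params = mlp_param_count(teacher_sizes)

print(f"student: {student_params} parameters")
print(f"teacher: {teacher_params} parameters")
```

Under these assumptions the two networks differ only in the first layer, so the student's extra size comes entirely from its wider input.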

### Observation Spaces

**Student (82-dim):**
- Depth rays: 72 dims (±70° FOV, corrupted with noise/dropout)
- Robot velocity (vx, vy, ω): 3 dims
- Goal relative position: 2 dims
- Goal distance & angle: 2 dims
- Previous action: 3 dims

**Teacher (34-dim):**
- Nearest obstacles: 8 × 3 = 24 dims (x, y, distance)
- Robot velocity: 3 dims
- Goal relative + distance/angle: 4 dims
- Previous action: 3 dims
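
The student layout above can be assembled as in the following minimal sketch. The field order follows the list above; computing the goal angle with `atan2` in the robot frame is an assumption, and `build_student_obs` is a hypothetical helper rather than code from this repository.

```python
import math


def build_student_obs(depth_rays, velocity, goal_rel, prev_action):
    """Concatenate the student observation fields into an 82-element list.

    Assumptions: goal_rel is (x, y) in the robot frame, and the goal angle
    is atan2(y, x). The model card does not specify either convention.
    """
    if len(depth_rays) != 72:
        raise ValueError("expected 72 depth rays")
    if len(velocity) != 3 or len(prev_action) != 3 or len(goal_rel) != 2:
        raise ValueError("velocity and prev_action need 3 values, goal_rel needs 2")

    gx, gy = goal_rel
    goal_dist = math.hypot(gx, gy)
    goal_angle = math.atan2(gy, gx)

    return (
        list(depth_rays)                    # indices 0..71
        + list(velocity)                    # indices 72..74 (vx, vy, omega)
        + [gx, gy]                          # indices 75..76
        + [goal_dist, goal_angle]           # indices 77..78
        + list(prev_action)                 # indices 79..81
    )


obs = build_student_obs(
    depth_rays=[1.0] * 72,
    velocity=(0.0, 0.0, 0.0),
    goal_rel=(3.0, 4.0),
    prev_action=(0.0, 0.0, 0.0),
)
print(len(obs))
```

Any normalization or clipping applied to these fields during training would also need to be reproduced at deployment; the card does not describe it, so it is omitted here.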

### Action Space

Velocity commands: `[vx, vy, ω]`
- `vx ∈ [-0.6, 1.0]` m/s (forward/backward)
- `vy ∈ [-0.5, 0.5]` m/s (lateral)
- `ω ∈ [-1.57, 1.57]` rad/s (yaw rate)
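
The card does not say whether the exported policy already clamps its outputs, so a deployment wrapper may want to enforce the limits above explicitly. A minimal sketch, with `ACTION_BOUNDS` and `clip_action` as hypothetical names:

```python
# Per-axis limits copied from the action space list above: (vx, vy, omega).
ACTION_BOUNDS = ((-0.6, 1.0), (-0.5, 0.5), (-1.57, 1.57))


def clip_action(action):
    """Clamp a [vx, vy, omega] command to the documented limits."""
    if len(action) != len(ACTION_BOUNDS):
        raise ValueError("expected a 3-element action")
    return [min(max(a, lo), hi) for a, (lo, hi) in zip(action, ACTION_BOUNDS)]


safe = clip_action([2.0, -1.0, 0.3])
print(safe)
```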

## Training Details

### Two-Stage Pipeline

1. **Stage 1: Teacher PPO** - Train privileged teacher with ground-truth obstacles using PPO (2000 iterations)
2. **Stage 2: DAgger Distillation** - Distill the teacher into the student using Dataset Aggregation (DAgger), with the teacher mixing ratio decaying from 70% to 20%
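
The mixing decay in Stage 2 can be sketched as follows. This assumes that the ratio is the probability of executing the teacher's action during data collection and that it decays linearly over iterations; the card gives only the 70% and 20% endpoints, so the schedule shape and both function names are illustrative.

```python
import random


def teacher_mix_prob(iteration, total_iterations, start=0.7, end=0.2):
    """Linearly interpolate the teacher mixing probability.

    Assumption: linear decay from `start` to `end` across all iterations.
    """
    if total_iterations <= 1:
        return end
    frac = min(max(iteration / (total_iterations - 1), 0.0), 1.0)
    return start + (end - start) * frac


def choose_rollout_action(teacher_action, student_action, beta, rng=random):
    """Execute the teacher's action with probability beta, else the student's.

    In DAgger the visited states are always labeled with the teacher's
    action, regardless of which action was executed.
    """
    return teacher_action if rng.random() < beta else student_action


for it in (0, 5, 10):
    print(it, round(teacher_mix_prob(it, 11), 3))
```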

### Key Innovations

- **FOV Randomization**: `fov_keep_ratio ∈ [0.35, 1.0]` prevents sensor overfitting
- **Hardcase Curriculum**: Mine failure trajectories, retrain with 35% hardcase resets
- **Symmetry Augmentation**: 50% mirror transform eliminates left/right bias
- **Runtime Safety Layer**: Distance-based velocity scaling for collision avoidance
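
To illustrate the symmetry augmentation item, here is a minimal sketch of a left/right mirror applied to a student observation and its action label. It assumes the 72 depth rays are ordered symmetrically across the field of view, that positive `vy` and positive `ω` both point to the same side, and that the observation layout matches the Observation Spaces section; the actual transform used in training is not shown in this card.

```python
def mirror_student_obs(obs):
    """Reflect an 82-element student observation about the robot's forward axis."""
    obs = list(obs)
    if len(obs) != 82:
        raise ValueError("expected an 82-element observation")
    rays = obs[:72][::-1]              # swap left and right rays
    vx, vy, w = obs[72:75]
    gx, gy = obs[75:77]
    dist, angle = obs[77:79]
    pvx, pvy, pw = obs[79:82]
    # Lateral and yaw quantities flip sign; forward and distance terms do not.
    return rays + [vx, -vy, -w, gx, -gy, dist, -angle, pvx, -pvy, -pw]


def mirror_action(action):
    """Reflect a [vx, vy, omega] action label the same way."""
    vx, vy, w = action
    return [vx, -vy, -w]


example = [float(i) for i in range(82)]
mirrored = mirror_student_obs(example)
print(mirrored[:3], mirrored[72:75])
```

Applying the transform twice returns the original observation, which is a cheap check that the sign flips are consistent.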

### Performance

| Scenario | Success Rate | Collision Rate |
|----------|-------------|----------------|
| Deploy (mild noise) | 75.0% | 25.0% |
| Stress (heavy noise) | 75.2% | 24.8% |
| Wide-FOV Clean | 74.4% | 25.6% |

### Real Robot Validation

| Direction | Target | Final Distance | Result |
|-----------|--------|----------------|--------|
| Forward | 2.0m | 0.26m | ✅ SUCCESS |
| Backward | -1.5m | 0.26m | ✅ SUCCESS |
| Left | 1.5m | 0.31m | ✅ SUCCESS |
| Right | -2.0m | 0.26m | ✅ SUCCESS |
| Diagonal | (1.5, 1.5)m | 0.30m | ✅ SUCCESS |

## Usage

```python
import torch

# Load student policy (for deployment)
student = torch.jit.load("student_policy.pt")
student.eval()

# Prepare observation (82-dim)
obs = torch.zeros(1, 82)  # [depth_rays(72), vel(3), goal_rel(2), goal_dist_angle(2), prev_action(3)]

# Get action
with torch.no_grad():
    action = student(obs)  # [vx, vy, omega]
```

## Training Environment

- **Simulator**: NVIDIA IsaacLab (Isaac Sim 4.5)
- **Arena**: 8 m × 8 m with 24-32 cylindrical obstacles
- **Control Rate**: 10 Hz policy / 50 Hz physics
- **Robot**: Unitree G1 (capsule proxy in sim, full robot for real deployment)
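
The control rates above imply that each policy action is held for several physics steps. The arithmetic is sketched below; the variable names are illustrative and do not correspond to actual IsaacLab configuration fields.

```python
POLICY_HZ = 10    # policy rate from the list above
PHYSICS_HZ = 50   # physics rate from the list above

# Number of physics steps executed per policy action.
decimation = PHYSICS_HZ // POLICY_HZ
physics_dt = 1.0 / PHYSICS_HZ
policy_dt = decimation * physics_dt

print(decimation, physics_dt, policy_dt)
```

On the real robot, the same 10 Hz command rate would presumably need to be matched so the policy sees dynamics comparable to those in training.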

## Citation

If you use this model, please cite:

```bibtex
@misc{g1-navigation-dagger-2026,
  title={Teacher-Student Distillation via DAgger for Sim-to-Real Navigation on the Unitree G1},
  author={Adjimavo},
  year={2026},
  url={https://huggingface.co/Adjimavo/g1-navigation-dagger}
}
```

## License

Apache 2.0