Holosoma FastSAC Locomotion Policy Architecture
Model: fastsac_g1_29dof.onnx
Source: Holosoma Whole-Body Locomotion
Repository: nepyope/holosoma_locomotion
Framework: PyTorch 2.4.1 → ONNX (Opset 13)
Total Parameters: 221,727
Table of Contents
- Overview
- Input Space (Observation)
- Output Space (Action)
- Network Architecture
- Forward Pass Data Flow
- PD Gains
- Comparison with GR00T
Overview
The Holosoma FastSAC policy is a single unified actor network trained using the Fast Soft Actor-Critic (SAC) algorithm for bipedal whole-body locomotion. Unlike GR00T's dual-policy system (balance + walk), Holosoma uses one network to handle all locomotion states.
Key Features
- Single frame input: 100D observation (no temporal stacking)
- Full body control: All 29 DOF controlled simultaneously
- Observation normalization: Running mean/std baked into model
- Per-joint action scaling: Action scale factors embedded in ONNX
- LayerNorm: Used instead of BatchNorm for stability
- Gait phase encoding: Sin/cos phase signals for periodic locomotion
Input Space (Observation)
Observation Vector (100D)
| Index | Dim | Content | Scale |
|---|---|---|---|
| [0:29] | 29 | last_action (unscaled) | 1.0 |
| [29:32] | 3 | angular_velocity (IMU gyro) | 0.25 |
| [32] | 1 | yaw_velocity_command | 1.0 |
| [33:35] | 2 | linear_velocity_command (x, y) | 1.0 |
| [35:37] | 2 | gait_phase_cos | 1.0 |
| [37:66] | 29 | joint_positions (relative to default) | 1.0 |
| [66:95] | 29 | joint_velocities | 0.05 |
| [95:98] | 3 | gravity_orientation | 1.0 |
| [98:100] | 2 | gait_phase_sin | 1.0 |
Total: 100D
Observation Scaling Constants
ANG_VEL_SCALE = 0.25 # Angular velocity scaling
DOF_POS_SCALE = 1.0 # Joint position scaling
DOF_VEL_SCALE = 0.05 # Joint velocity scaling
Input Normalization
The model includes a baked-in observation normalizer applied internally:
# Applied inside ONNX graph
obs_normalized = (obs - obs_mean) / obs_std
# obs_mean: shape (1, 100), range [-0.99, 0.71]
# obs_std: shape (1, 100), range [0.04, 1.40]
Default Joint Positions (29D Standing Pose)
DEFAULT_ANGLES = np.zeros(29, dtype=np.float32)
DEFAULT_ANGLES[[0, 6]] = -0.312 # hip pitch (left, right)
DEFAULT_ANGLES[[3, 9]] = 0.669 # knee
DEFAULT_ANGLES[[4, 10]] = -0.363 # ankle pitch
DEFAULT_ANGLES[[15, 22]] = 0.2 # shoulder pitch
DEFAULT_ANGLES[16] = 0.2 # left shoulder roll
DEFAULT_ANGLES[23] = -0.2 # right shoulder roll
DEFAULT_ANGLES[[18, 25]] = 0.6 # elbow
Gait Phase
GAIT_PERIOD = 1.0 # seconds
CONTROL_FREQ = 50 # Hz
phase_dt = 2 * np.pi / (CONTROL_FREQ * GAIT_PERIOD)
# Two-phase system (left/right legs offset by pi)
phase = [phase_left, phase_right]
# When standing still (no command)
if cmd_magnitude < 0.01:
    phase = [np.pi, np.pi]  # Both legs synchronized (standing)
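The snippet above can be made concrete as a small phase-update loop. This is a minimal sketch, not the repository's actual code: `update_phase` is a hypothetical helper, and the 0.01 standing threshold is taken from the snippet above.

```python
import numpy as np

GAIT_PERIOD = 1.0   # seconds
CONTROL_FREQ = 50   # Hz
PHASE_DT = 2 * np.pi / (CONTROL_FREQ * GAIT_PERIOD)

def update_phase(phase_left, cmd_magnitude):
    """Advance the left-leg phase one control step; the right leg is offset by pi.

    Hypothetical helper illustrating the two-phase scheme described above."""
    if cmd_magnitude < 0.01:
        # Standing: both legs pinned to the same phase
        return np.pi, np.pi
    phase_left = (phase_left + PHASE_DT) % (2 * np.pi)
    phase_right = (phase_left + np.pi) % (2 * np.pi)
    return phase_left, phase_right

# One full gait period (50 steps at 50 Hz) wraps the phase back to its start
p = 0.0
for _ in range(CONTROL_FREQ):
    p, _ = update_phase(p, cmd_magnitude=0.5)
```

The observation then takes `cos(phase)` at indices [35:37] and `sin(phase)` at [98:100], giving the network a continuous, periodic clock without any frame stacking.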
Output Space (Action)
Action Vector (29D)
All 29 joints are controlled:
| Index | Joint Group | DOF |
|---|---|---|
| [0:6] | Left leg (hip_pitch, hip_roll, hip_yaw, knee, ankle_pitch, ankle_roll) | 6 |
| [6:12] | Right leg | 6 |
| [12:15] | Waist (yaw, roll, pitch) | 3 |
| [15:22] | Left arm (shoulder_pitch/roll/yaw, elbow, wrist_roll/pitch/yaw) | 7 |
| [22:29] | Right arm | 7 |
Total: 29D
Action Scaling & Application
ACTION_SCALE = 0.25 # External scaling applied in control loop
# Target joint positions
target_pos = DEFAULT_ANGLES + action * ACTION_SCALE
Command Ranges
| Command | Range |
|---|---|
| Linear velocity X | [-1.0, 1.0] m/s |
| Linear velocity Y | [-1.0, 1.0] m/s |
| Angular velocity (yaw) | [-1.0, 1.0] rad/s |
| Heading | [-3.14, 3.14] rad |
Network Architecture
Overview Diagram
Input (100D)
        │
        ▼
┌──────────────────────────┐
│  Observation Normalizer  │  (mean/std normalization)
└──────────────────────────┘
        │
        ▼
┌──────────────────────────┐
│  Linear(100 → 512)       │  51,200 params
│  LayerNorm(512)          │   1,024 params
│  ELU()                   │
└──────────────────────────┘
        │
        ▼
┌──────────────────────────┐
│  Linear(512 → 256)       │  131,072 params
│  LayerNorm(256)          │     512 params
│  ELU()                   │
└──────────────────────────┘
        │
        ▼
┌──────────────────────────┐
│  Linear(256 → 128)       │  32,768 params
│  LayerNorm(128)          │     256 params
│  ELU()                   │
└──────────────────────────┘
        │
        ▼
┌──────────────────────────┐
│  Linear(128 → 29)        │  3,712 params
│  Tanh()                  │
└──────────────────────────┘
        │
        ▼
┌──────────────────────────┐
│  Action Scaling          │  action * scale + bias
└──────────────────────────┘
        │
        ▼
Output (29D)
Layer Details
Layer 1: Linear(100 β 512) + LayerNorm + ELU
- Input: 100D normalized observation
- Output: 512D features
- Parameters: 51,200 (weights) + 512 (bias) + 1,024 (LayerNorm) = 52,736
Layer 2: Linear(512 β 256) + LayerNorm + ELU
- Input: 512D features
- Output: 256D features
- Parameters: 131,072 (weights) + 256 (bias) + 512 (LayerNorm) = 131,840
Layer 3: Linear(256 β 128) + LayerNorm + ELU
- Input: 256D features
- Output: 128D features
- Parameters: 32,768 (weights) + 128 (bias) + 256 (LayerNorm) = 33,152
Layer 4: Linear(128 β 29) (Action Head)
- Input: 128D features
- Output: 29D raw actions
- Parameters: 3,712 (weights) + 29 (bias) = 3,741
Weight Summary
| Weight Name | Shape | Parameters |
|---|---|---|
| obs_normalizer._mean | [1, 100] | 100 |
| obs_normalizer._std | [1, 100] | 100 |
| actor.net.0.weight | [512, 100] | 51,200 |
| actor.net.0.bias | [512] | 512 |
| actor.net.1.weight (LN) | [512] | 512 |
| actor.net.1.bias (LN) | [512] | 512 |
| actor.net.3.weight | [256, 512] | 131,072 |
| actor.net.3.bias | [256] | 256 |
| actor.net.4.weight (LN) | [256] | 256 |
| actor.net.4.bias (LN) | [256] | 256 |
| actor.net.6.weight | [128, 256] | 32,768 |
| actor.net.6.bias | [128] | 128 |
| actor.net.7.weight (LN) | [128] | 128 |
| actor.net.7.bias (LN) | [128] | 128 |
| actor.fc_mu.0.weight | [29, 128] | 3,712 |
| actor.fc_mu.0.bias | [29] | 29 |
| actor.action_scale | [29] | 29 |
| actor.action_bias | [29] | 29 |
Total: 221,727 parameters
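The total can be sanity-checked by summing the products of the tensor shapes listed in the weight table; this is a quick verification sketch, not part of the model code.

```python
import numpy as np

# Shapes copied from the weight table above
shapes = [
    (1, 100), (1, 100),   # obs_normalizer mean / std
    (512, 100), (512,),   # actor.net.0 (Linear)
    (512,), (512,),       # actor.net.1 (LayerNorm weight / bias)
    (256, 512), (256,),   # actor.net.3 (Linear)
    (256,), (256,),       # actor.net.4 (LayerNorm)
    (128, 256), (128,),   # actor.net.6 (Linear)
    (128,), (128,),       # actor.net.7 (LayerNorm)
    (29, 128), (29,),     # actor.fc_mu.0 (Linear)
    (29,), (29,),         # action_scale / action_bias
]
total = sum(int(np.prod(s)) for s in shapes)
print(total)  # 221727
```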
Forward Pass Data Flow
Complete Forward Pass
def forward(obs_100d):
    """
    Args:
        obs_100d: (batch, 100) observation vector
    Returns:
        actions: (batch, 29) joint position deltas
    """
    # 1. Normalize observation (baked into ONNX)
    x = (obs_100d - obs_mean) / obs_std

    # 2. Hidden layer 1
    x = linear_0(x)        # (batch, 512)
    x = layer_norm_1(x)
    x = elu(x)

    # 3. Hidden layer 2
    x = linear_3(x)        # (batch, 256)
    x = layer_norm_4(x)
    x = elu(x)

    # 4. Hidden layer 3
    x = linear_6(x)        # (batch, 128)
    x = layer_norm_7(x)
    x = elu(x)

    # 5. Action head
    x = fc_mu(x)           # (batch, 29)
    x = tanh(x)            # Bound to [-1, 1]

    # 6. Scale actions (baked into ONNX)
    actions = x * action_scale + action_bias
    return actions
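The same data flow can be run end to end in plain numpy. The sketch below uses randomly initialized weights (not the real ONNX weights), omits the learned LayerNorm affine parameters (equivalent to gamma = 1, beta = 0), skips the observation normalizer, and assumes a placeholder action scale of 0.25; it only demonstrates the shape flow and the tanh bound.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_norm(x, eps=1e-5):
    # Normalize over the feature axis; learned affine omitted for brevity
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def elu(x):
    return np.where(x > 0, x, np.expm1(x))

# Random stand-in weights with the documented shapes
dims = [100, 512, 256, 128]
Ws = [rng.normal(0, 0.05, (o, i)).astype(np.float32) for i, o in zip(dims[:-1], dims[1:])]
bs = [np.zeros(o, dtype=np.float32) for o in dims[1:]]
W_mu = rng.normal(0, 0.05, (29, 128)).astype(np.float32)
b_mu = np.zeros(29, dtype=np.float32)
action_scale = np.full(29, 0.25, dtype=np.float32)  # placeholder values
action_bias = np.zeros(29, dtype=np.float32)

def forward(obs):
    x = obs  # observation normalization skipped in this sketch
    for W, b in zip(Ws, bs):
        x = elu(layer_norm(x @ W.T + b))
    x = np.tanh(x @ W_mu.T + b_mu)
    return x * action_scale + action_bias

obs = rng.normal(size=(1, 100)).astype(np.float32)
out = forward(obs)
print(out.shape)  # (1, 29)
```

Because tanh bounds the raw action to (-1, 1), the final output magnitude can never exceed the per-joint action scale.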
Computational Cost
Forward pass operations:
1. Normalization: 100D subtract + divide (~200 ops)
2. Layer 1: 100→512 matmul + LN + ELU (~52K MACs)
3. Layer 2: 512→256 matmul + LN + ELU (~131K MACs)
4. Layer 3: 256→128 matmul + LN + ELU (~33K MACs)
5. Action head: 128→29 matmul + tanh + scale (~4K MACs)
Total: ~220K multiply-accumulate operations per inference
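The MAC estimate follows directly from the layer dimensions; a one-liner reproduces it.

```python
# Per-layer multiply-accumulates for the four Linear layers (in_dim * out_dim each)
layers = [(100, 512), (512, 256), (256, 128), (128, 29)]
macs = sum(n_in * n_out for n_in, n_out in layers)
print(macs)  # 218752 -> roughly 220K MACs per inference
```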
PD Gains
PD gains are stored in ONNX metadata and loaded at runtime.
PD Gain Selection Methodology
Holosoma uses the Raibert-style rule (BeyondMimic method) to compute PD gains from joint armature values:
Kp = I × (2π × fn)²
Kd = 2 × I × ζ × (2π × fn)
Where:
- I = Joint armature (rotational inertia, from URDF/MuJoCo)
- fn = Natural frequency = 10 Hz
- ζ = Damping ratio = 2.0 (over-damped for stability)
This physics-based approach ensures each joint's stiffness/damping is proportional to its inertia.
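The formulas above translate to a few lines of numpy. This is a sketch of the stated rule, using the rounded armature values from the tables below, so results match the documented gains only to within rounding error.

```python
import numpy as np

F_N = 10.0    # natural frequency (Hz)
ZETA = 2.0    # damping ratio
OMEGA = 2 * np.pi * F_N

def pd_gains(armature):
    """Raibert-style gains: stiffness and damping scale with joint inertia."""
    kp = armature * OMEGA ** 2          # Kp = I * (2*pi*fn)^2
    kd = 2 * armature * ZETA * OMEGA    # Kd = 2 * I * zeta * (2*pi*fn)
    return kp, kd

kp, kd = pd_gains(0.01018)   # hip_yaw armature (rounded)
# kp ~ 40.19, kd ~ 2.56 -- agrees with the verification table to rounding error
```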
Verification
| Joint | Armature (I) | Formula Kp | Actual Kp | Match |
|---|---|---|---|---|
| hip_yaw | 0.01018 | 40.179 | 40.179 | ✓ |
| hip_roll | 0.02510 | 99.098 | 99.098 | ✓ |
| ankle | 0.00722 | 28.501 | 28.501 | ✓ |
| shoulder | 0.00361 | 14.251 | 14.251 | ✓ |
| wrist | 0.00425 | 16.778 | 16.778 | ✓ |
Alternative Rule of Thumb (Kyle's Method)
Another common approach in RL locomotion:
Kp = τ_max / RoM (or τ_max / (RoM/2) for stiffer response)
Kd = Kp / 20
Where τ_max is max torque and RoM is range of motion. This gives similar ballpark values but doesn't account for joint inertia.
Kp (Position Gains)
| Joint Group | Kp Values | Armature |
|---|---|---|
| Hip pitch/yaw | 40.18 | 0.0102 |
| Hip roll, Knee | 99.10 | 0.0251 |
| Ankle | 28.50 | 0.0072 |
| Waist yaw | 40.18 | 0.0102 |
| Waist roll/pitch | 28.50 | 0.0072 |
| Shoulder/Elbow | 14.25 | 0.0036 |
| Wrist | 16.78 | 0.0043 |
Kd (Velocity Gains)
| Joint Group | Kd Values | Armature |
|---|---|---|
| Hip pitch/yaw | 2.56 | 0.0102 |
| Hip roll, Knee | 6.31 | 0.0251 |
| Ankle | 1.81 | 0.0072 |
| Waist yaw | 2.56 | 0.0102 |
| Waist roll/pitch | 1.81 | 0.0072 |
| Shoulder/Elbow | 0.91 | 0.0036 |
| Wrist | 1.07 | 0.0043 |
References
- Kyle's Rule of Thumb: Kp = τ_max / RoM
- Whole-Body Humanoid Robot Locomotion with Human Reference (Section III.C)
- BeyondMimic critical damping method
Comparison with GR00T
| Feature | Holosoma FastSAC | GR00T |
|---|---|---|
| Architecture | Single unified actor | Dual policy (Balance + Walk) |
| Observation | 100D (single frame) | 516D (6 frames × 86D) |
| Actions | 29D (full body) | 15D (legs + waist only) |
| Parameters | 221,727 | 470,578 |
| Policy Switching | None | Based on command magnitude |
| Temporal Context | Gait phase encoding | 6-frame history stack |
| Normalization | LayerNorm + obs normalizer | L2 norm on latent |
| Control Frequency | 50 Hz | 50 Hz |
Key Differences
Single vs Dual Policy: Holosoma handles balance and walking with one network; GR00T switches between dedicated balance/walk policies.
Temporal Handling: GR00T stacks 6 frames (516D) for temporal context; Holosoma uses gait phase signals (sin/cos) to encode periodicity.
Body Coverage: Holosoma controls all 29 DOF including arms; GR00T only controls 15 DOF (legs + waist), leaving arms for other tasks.
Architecture: GR00T uses an encoder-decoder (estimator β actor) design with latent normalization; Holosoma is a straightforward MLP with LayerNorm.
Usage
from huggingface_hub import hf_hub_download
import onnxruntime as ort
import numpy as np
# Load model
path = hf_hub_download(repo_id="nepyope/holosoma_locomotion",
                       filename="fastsac_g1_29dof.onnx")
policy = ort.InferenceSession(path)
# Build 100D observation
obs = np.zeros((1, 100), dtype=np.float32)
obs[0, 0:29] = last_action
obs[0, 29:32] = ang_vel * 0.25
obs[0, 32] = cmd_yaw
obs[0, 33:35] = cmd_lin
obs[0, 35:37] = np.cos(phase)
obs[0, 37:66] = (joint_pos - default_pos) * 1.0
obs[0, 66:95] = joint_vel * 0.05
obs[0, 95:98] = gravity
obs[0, 98:100] = np.sin(phase)
# Run inference
action = policy.run(None, {"actor_obs": obs})[0].squeeze()
# Apply to robot
target = default_pos + action * 0.25
References
- Holosoma Project: Whole-body locomotion for humanoid robots
- SAC Algorithm: "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning"
- Unitree G1: https://www.unitree.com/g1
Last Updated: 2025-12-17