Paper: GR00T N1: An Open Foundation Model for Generalist Humanoid Robots (arXiv:2503.14734)
This is a smoke-test fine-tune of nvidia/GR00T-N1.7-3B
on the cube_to_bowl_5 demo dataset bundled with Isaac-GR00T.
Validates the GR00T N1.7 post-training pipeline on a single Blackwell GPU (RTX PRO 6000) under CUDA 12.8. The 100-step run is a pipeline integrity check, not a converged model; production fine-tunes typically run 2,000+ steps.
| Parameter | Value |
|---|---|
| Base model | nvidia/GR00T-N1.7-3B |
| Dataset | demo_data/cube_to_bowl_5 (5 episodes, ~4150 frames, SO-101 follower arm) |
| Embodiment tag | NEW_EMBODIMENT |
| Steps | 100 (MAX_STEPS=100) |
| Global batch size | 8 |
| Learning rate | 1e-4 (cosine, 5% warmup) |
| Weight decay | 1e-5 |
| Train runtime | 155 s on a single RTX PRO 6000 Blackwell |
| Loss trajectory | 1.146 → 0.984 |
| GPU | NVIDIA RTX PRO 6000 Blackwell Server Edition (sm_120, 96 GB) |
| CUDA / driver | 12.8 / 580.126.09 |
| Trainable params | 1.62 B / 3.14 B (51.5%) |
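
The learning-rate entry above (1e-4, cosine, 5% warmup) can be illustrated with a small sketch. The exact scheduler used by the fine-tuning script is not stated here, so this assumes the common form: linear warmup from zero over the first 5% of steps, then cosine decay to zero. The function name `lr_at_step` is hypothetical.

```python
import math


def lr_at_step(step, total_steps=100, peak_lr=1e-4, warmup_ratio=0.05):
    """Learning rate at a given step, assuming linear warmup + cosine decay to zero.

    This is an illustrative assumption, not the confirmed Isaac-GR00T schedule.
    """
    warmup_steps = int(total_steps * warmup_ratio)  # 5 steps for a 100-step run
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from peak_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 1, 5, 50, 100):
        print(s, lr_at_step(s))
```

Under these assumptions a 100-step run spends only 5 steps in warmup, so almost the entire run sits on the decaying part of the curve.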
This repository contains inference-only artifacts. The training-only files optimizer.pt (~13 GB) and rng_state.pth
have been omitted to keep the repository small.
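
A minimal sketch of how such training-only files could be stripped from a local checkpoint directory before upload. The helper name `prune_training_state` and the directory layout are assumptions for illustration; only the two file names come from this card.

```python
from pathlib import Path

# Training-only files named in this card; not needed for inference.
TRAINING_ONLY = ("optimizer.pt", "rng_state.pth")


def prune_training_state(checkpoint_dir, dry_run=True):
    """Return the training-only files found; delete them unless dry_run is True."""
    removed = []
    for name in TRAINING_ONLY:
        path = Path(checkpoint_dir) / name
        if path.is_file():
            removed.append(path)
            if not dry_run:
                path.unlink()
    return removed
```

Defaulting to `dry_run=True` lets you inspect what would be deleted first, since removing the optimizer state makes it impossible to resume training from that checkpoint.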
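```python
from gr00t.policy.gr00t_policy import Gr00tPolicy
from gr00t.data.embodiment_tags import EmbodimentTag

# Load the fine-tuned checkpoint for inference.
policy = Gr00tPolicy(
    model_path="m3/groot-n1.7-cube-bowl-100steps",
    embodiment_tag=EmbodimentTag.NEW_EMBODIMENT,
    modality_config=...,     # placeholder: supply the modality config used for fine-tuning
    modality_transform=...,  # placeholder: supply the matching modality transform
    device="cuda:0",
)
```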
The cube_to_bowl_5 dataset has only 5 episodes; the model is heavily
underfit and will not generalize beyond its training distribution. For real fine-tunes, set MAX_STEPS=2000+ per
the Isaac-GR00T finetune guide. The architecture is described in the GR00T N1 white paper.
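
To put the step counts in perspective, the sketch below works through the arithmetic from the figures reported in this card (100 steps, global batch size 8, ~4150 frames, 155 s). It assumes one training sample per frame and that runtime scales linearly with step count; both are simplifications, and the measured 155 s may include fixed startup overhead.

```python
# Figures reported in this card.
STEPS = 100
GLOBAL_BATCH = 8
FRAMES = 4150          # approximate dataset size
RUNTIME_S = 155.0      # measured train runtime for the 100-step run


def epochs_seen(steps, batch=GLOBAL_BATCH, frames=FRAMES):
    """Approximate passes over the dataset, assuming one sample per frame."""
    return steps * batch / frames


def estimated_runtime_s(steps, ref_steps=STEPS, ref_runtime=RUNTIME_S):
    """Linear extrapolation of runtime from the measured 100-step run."""
    return ref_runtime / ref_steps * steps


if __name__ == "__main__":
    for n in (100, 2000):
        print(f"{n} steps: ~{epochs_seen(n):.2f} epochs, "
              f"~{estimated_runtime_s(n) / 60:.1f} min")
```

Under these assumptions the 100-step run sees roughly a fifth of the dataset once, while a 2,000-step run would cover it close to four times in under an hour on the same GPU.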