GR00T-N1.6-3B-PickOrange (self-trained, ckpt-6500)

An NVIDIA GR00T N1.6 policy (Eagle 2.5 VLM + cross-attention DiT action head, ~3B params), fine-tuned from nvidia/GR00T-N1.6-3B for the LeIsaac SO-101 PickOrange task.

🔗 Project repos

Highlights

ckpt-6500: 3/3 oranges placed, robot returned to rest pose, env reports success.

ckpt-3500 (earlier checkpoint, kept on the ckpt-3500 branch for reference): the policy is still finding the placement; orange dropped off the edge.

TL;DR

  • Task: SO-101 single-arm picks 3 oranges sequentially and places each in a plate (LeIsaac PickOrange).
  • Architecture: GR00T N1.6 — Eagle 2.5 VLM (frozen) + cross-attention DiT action head (trainable). chunk_size=50, n_action_steps=16, 4-step rectified-flow denoising.
  • Training: 6500 steps / batch=16 (micro-batch=2 × grad_accum=8) / adafactor / bf16 / gradient_checkpointing with use_reentrant=False.
  • Hardware: single RTX 4090 24GB (with DISABLE_ADDMM_CUDA_LT=1, watchdog auto-resume on intermittent CUDA assert).
  • 🏆 Benchmark-aligned eval (3 rounds × 120 s sim time, 180 s wall-clock cap per round) vs. the LeIsaac leaderboard:
    Model                          Strict rounds   Oranges placed
    hi-space N1.6 (public SOTA)    2/3             6/9
    ACT                            1/3             6/9
    X-VLA best                     0/3             4/9
    🏆 This ckpt-6500              2/3             8/9
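
As a minimal sketch of how the two table columns relate, the snippet below tallies per-round results into "strict rounds" and "oranges placed". It assumes a round counts as strict only when the env reports success, and that each round has 3 oranges. The `RoundResult` type and the per-round breakdown (3, 3, 2) are illustrative assumptions that happen to reproduce the 2/3 and 8/9 totals; they are not taken from the LeIsaac scoring code or from the actual eval logs.

```python
from dataclasses import dataclass

ORANGES_PER_ROUND = 3  # PickOrange places 3 oranges per round


@dataclass
class RoundResult:
    """Hypothetical per-round record; field names are illustrative."""
    oranges_placed: int  # oranges in the plate when the round ends
    env_success: bool    # True if the env reported task success


def summarize(rounds):
    """Return (strict_rounds, oranges_placed) as 'x/y' strings."""
    strict = sum(1 for r in rounds if r.env_success)
    placed = sum(r.oranges_placed for r in rounds)
    total = ORANGES_PER_ROUND * len(rounds)
    return f"{strict}/{len(rounds)}", f"{placed}/{total}"


# Illustrative breakdown only: two full successes and one partial round.
example_rounds = [
    RoundResult(3, True),
    RoundResult(3, True),
    RoundResult(2, False),
]
print(summarize(example_rounds))  # ('2/3', '8/9')
```

Under this reading, two models can tie on strict rounds while differing on oranges placed, which is how this checkpoint and hi-space N1.6 both show 2/3 but 8/9 versus 6/9.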

Architecture / training recipe

base_model              nvidia/GR00T-N1.6-3B
tune_llm                False
tune_visual             False
tune_projector          True
tune_diffusion_model    True
tune_top_llm_layers     4 (default, kept)
backbone_trainable_params_fp32   False     ← 4090 squeeze
optim                   adafactor          ← 4090 squeeze
gradient_checkpointing  True (use_reentrant=False, custom monkey-patch)
bf16                    True
DISABLE_ADDMM_CUDA_LT   1                  ← workaround torch 2.7.1 cublasLt bf16 bug
global_batch_size       16
gradient_accumulation_steps   8            ← per-step micro-batch = 2
max_steps               8000 (best ckpt at step 6500)
save_steps              100 (with custom keep-multiples-of-500 prune callback)
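
Two bookkeeping details in the recipe can be sketched in a few lines: the micro-batch arithmetic, and a keep-multiples-of-500 prune rule. This is a hedged illustration, not the callback used for this run. It assumes checkpoint directories are named `checkpoint-<step>` (the Hugging Face Trainer convention) and that the newest checkpoint is always kept so a resume stays possible; `checkpoints_to_delete` is a hypothetical helper name.

```python
import re

# Micro-batch arithmetic from the recipe above.
global_batch_size = 16
gradient_accumulation_steps = 8
micro_batch = global_batch_size // gradient_accumulation_steps  # 2 samples per forward/backward


def checkpoints_to_delete(names, keep_every=500):
    """Return checkpoint dir names that are neither a multiple of
    keep_every nor the most recent checkpoint (assumed naming:
    'checkpoint-<step>'). Names that do not match are ignored."""
    steps = {}
    for name in names:
        match = re.fullmatch(r"checkpoint-(\d+)", name)
        if match:
            steps[int(match.group(1))] = name
    if not steps:
        return []
    latest = max(steps)
    return [
        steps[s]
        for s in sorted(steps)
        if s % keep_every != 0 and s != latest
    ]


# With save_steps=100, only the 500-multiples and the newest survive.
saved = [f"checkpoint-{s}" for s in (400, 500, 600, 700)]
print(checkpoints_to_delete(saved))  # ['checkpoint-400', 'checkpoint-600']
```

Saving every 100 steps keeps the loss from a crash small, while pruning to multiples of 500 bounds the disk footprint of a ~3B-parameter model.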

Training notes / known issues

  • 4090 24GB is the hard limit: full-parameter fine-tuning of N1.6 on 24GB requires every memory hack stacked: bf16 + grad-ckpt with use_reentrant=False + adafactor + backbone_trainable_params_fp32=False + DISABLE_ADDMM_CUDA_LT=1. Dropping any one of these leads to either OOM or RuntimeError: d.is_cuda() INTERNAL ASSERT FAILED at CUDAGuardImpl.h:34.
  • A random CUDA assert still occurs every ~500-700 steps despite the patches. We wrap training in a watchdog that auto-resumes from the latest checkpoint after each crash; net throughput is ~70% of a crash-free run.
  • Score variance: per-checkpoint quality oscillates wildly (e.g. ckpt-5000 scored 16/18 in one 6-round eval, while ckpt-5500 scored 0/18 in the next). We attribute this to running the optimization at the absolute memory edge: gradients and optimizer states may quantize inconsistently. The 8/9 result here comes from a single benchmark-aligned 3-round run; expect ±20% noise on any individual run.
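
The watchdog idea can be sketched as a small restart loop. This is an assumption-laden illustration, not the script used for this run: `train_once` is a hypothetical callable that launches one training process and returns its exit code, checkpoints are assumed to be named `checkpoint-<step>`, and `max_restarts` is an arbitrary safety cap.

```python
import re
from pathlib import Path


def latest_checkpoint(output_dir):
    """Return the 'checkpoint-<step>' directory with the highest step,
    or None if there is none yet."""
    best_step, best_path = -1, None
    for path in Path(output_dir).glob("checkpoint-*"):
        match = re.fullmatch(r"checkpoint-(\d+)", path.name)
        if match and path.is_dir() and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path


def run_with_watchdog(train_once, output_dir, max_restarts=50):
    """Call train_once(resume_from) until it returns exit code 0.

    Returns the number of restarts that were needed. Raises if the
    run still fails after max_restarts restarts.
    """
    for attempt in range(max_restarts + 1):
        resume_from = latest_checkpoint(output_dir)  # None on a fresh run
        if train_once(resume_from) == 0:
            return attempt
    raise RuntimeError(f"training still failing after {max_restarts} restarts")
```

In practice `train_once` would likely wrap something like `subprocess.run([...]).returncode`, so that each attempt starts in a fresh process; a device-side CUDA assert generally leaves the CUDA context of the crashed process unusable, so retrying inside the same process is not an option.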

Inference

Use Isaac-GR00T's run_gr00t_server.py directly:

cd /path/to/Isaac-GR00T
uv run --extra=gpu python gr00t/eval/run_gr00t_server.py \
    --embodiment-tag NEW_EMBODIMENT \
    --model-path wsagi/GR00T-N1.6-PickOrange \
    --host 0.0.0.0 --port 5555

Then on the Isaac Sim eval side (LeIsaac):

POLICY_PORT=5555 \
ACTION_HORIZON=16 \
EVAL_ROUNDS=3 EPISODE_LENGTH=120 MAX_ROUND_WALL_S=180 \
PROMPT="Pick up the orange and put it in the plate" \
bash server/eval_gr00t.sh
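
For intuition about ACTION_HORIZON=16 against chunk_size=50, here is a minimal sketch of receding-horizon chunk execution: the policy predicts a 50-step chunk, the client executes only the first 16 actions, then queries again. Whether eval_gr00t.sh implements exactly this loop is an assumption; `get_chunk` and `step_env` are hypothetical stand-ins for the policy-server call and the simulator step.

```python
def rollout(get_chunk, step_env, total_steps, action_horizon=16):
    """Execute total_steps env steps, re-querying the policy every
    action_horizon steps. Returns the number of policy queries."""
    queries = 0
    t = 0
    while t < total_steps:
        chunk = get_chunk(t)  # expected length: chunk_size (50 here)
        queries += 1
        for action in chunk[:action_horizon]:
            if t >= total_steps:
                break
            step_env(action)
            t += 1
    return queries


# Toy usage: the "policy" returns 50 consecutive integers as actions.
executed = []
n_queries = rollout(
    get_chunk=lambda t: list(range(t, t + 50)),
    step_env=executed.append,
    total_steps=100,
)
print(n_queries)  # 7, i.e. ceil(100 / 16)
```

Executing fewer actions than the chunk contains trades more policy queries for more frequent re-planning from fresh observations.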

Branches

branch      step   benchmark (3-round)                  notes
main        6500   2/3 strict, 8/9 oranges, 115s avg    best
ckpt-3500   3500   0/3, 2/9, 180s                       first transition out of destruction phase
ckpt-5000   5000   0/3, 4/9, 180s                       strong 6-round (16/18) but volatile under 3-round
ckpt-7000   7000   1/3, 6/9, 146s                       secondary peak

License

Apache-2.0 / NVIDIA Open Model License (inherited from base nvidia/GR00T-N1.6-3B). See base model card.
