UnitreeG1_putawaytoolsV2_minmax_2000step — LingBot-VA G1 post-trained transformer

Fine-tuned transformer for LingBot-VA on Unitree G1 (Dex1), task XiaoweiLinXL/pi05-unitree-g1-put-away-tools-v2.1: "Put the battery on the shelf labeled 'battery' and put the screwdriver on the shelf labeled 'Philips'."

Same data and recipe as the rndchnk series. The only difference is that action normalization uses the dataset min/max instead of the q01/q99 quantiles. See "Why min/max" below.

Base: robbyant/lingbot-va-base
Post-training: 70 demos (43,851 frames), lr 1e-5, FDM v2 recipe — mutually-exclusive per-microstep regime (fdm_prob=0.5, lambda_fdm=1.0). Per-step randomized chunk_size ∈ {1..4} and window_size ∈ {4..64}.
4 GPUs × grad_accum=4 = effective batch 16. This is optimizer step 2000 of a 5000-step schedule, i.e. a mid-training checkpoint: the _500step checkpoint performed weakly in deployment, so this one was exported for the next deployment test.
Action normalization: dataset min/max, so every training target lies within [-1, +1]. (The codebase variable names are still q01/q99 because those are the only fields the loader supports; the values stored in them are the min/max, used as a drop-in replacement.)
This repo contains only transformer/ — vae/, text_encoder/, tokenizer/ are unchanged from robbyant/lingbot-va-base.
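
The randomization described above can be pictured with a minimal sketch. It is an illustration, not the training code: the function names, the "regular" label for the non-FDM regime, and uniform sampling over the stated ranges are all assumptions, and the batch arithmetic assumes a per-GPU micro-batch of 1.

```python
import random

# Values taken from the recipe description above.
FDM_PROB = 0.5
CHUNK_SIZES = range(1, 5)     # {1..4}
WINDOW_SIZES = range(4, 65)   # {4..64}


def sample_step_shape(rng):
    """Draw (chunk_size, window_size) for one step.

    Uniform sampling over the integer ranges is an assumption.
    """
    return rng.choice(CHUNK_SIZES), rng.choice(WINDOW_SIZES)


def sample_regime(rng):
    """Pick exactly one regime per microstep (mutually exclusive).

    "regular" is a hypothetical label for the non-FDM objective.
    """
    return "fdm" if rng.random() < FDM_PROB else "regular"


def effective_batch(num_gpus, grad_accum, per_gpu_batch=1):
    """Samples contributing to one optimizer step."""
    return num_gpus * grad_accum * per_gpu_batch


rng = random.Random(0)
chunk_size, window_size = sample_step_shape(rng)
regimes = [sample_regime(rng) for _ in range(4)]  # one per accumulation microstep
print(chunk_size, window_size, regimes)
print(effective_batch(num_gpus=4, grad_accum=4))  # 16
```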

Why min/max (the v21 quantile series underperformed)

In the earlier v21 5k training run, quantile normalization let right-arm joint targets overflow the ±1 range: the normalized absmax was 4.11 for R-wrist-roll, 3.55 for R-shoulder-roll, and 3.55 for R-wrist-yaw. The model's bounded prediction range (roughly [-1.5, +1.5]) cannot reach those targets, so during deployment it under-predicts at exactly the reach-extension moments, the arm under-extends, and it misses the shelves. Min/max normalization bounds every target to ±1 (verified: absmax = 1.0000 over all 43,851 training rows), which eliminates out-of-range targets with the aim of restoring deployment quality.
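
To make the overflow concrete, here is a small sketch comparing the two choices of bounds. It assumes the loader applies the common linear map 2·(x − lo)/(hi − lo) − 1, which is not confirmed by this card, and the sample values and quantile bounds are invented for illustration.

```python
def normalize(x, lo, hi):
    """Map [lo, hi] linearly onto [-1, +1].

    Values outside [lo, hi] land outside [-1, +1].
    Assumed form of the loader's normalization.
    """
    return 2.0 * (x - lo) / (hi - lo) - 1.0


# Hypothetical joint positions: mostly small, with one rare large reach.
samples = [-0.4, -0.1, 0.0, 0.2, 0.3, 0.5, 2.0]

# Hypothetical quantile bounds that ignore the rare extreme value.
q01, q99 = -0.5, 0.5
# Min/max bounds cover every sample by construction.
lo, hi = min(samples), max(samples)

quantile_absmax = max(abs(normalize(x, q01, q99)) for x in samples)
minmax_absmax = max(abs(normalize(x, lo, hi)) for x in samples)

print("quantile absmax:", quantile_absmax)  # exceeds 1: the rare reach overflows
print("min/max absmax:", minmax_absmax)     # exactly at the edge of [-1, +1]
```

The trade-off is that min/max bounds are set by the most extreme sample, so typical motions occupy a narrower slice of the normalized range.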

Assemble an eval-ready checkpoint

hf download robbyant/lingbot-va-base                              --local-dir lingbot-va-base
hf download EmbodyX/UnitreeG1_putawaytoolsV2_minmax_2000step       --local-dir g1_pat_v2_mm_2000_dl

mkdir -p g1_pat_v2_mm_2000
ln -sfn "$(realpath g1_pat_v2_mm_2000_dl/transformer)"  g1_pat_v2_mm_2000/transformer
ln -sfn "$(realpath lingbot-va-base/vae)"               g1_pat_v2_mm_2000/vae
ln -sfn "$(realpath lingbot-va-base/text_encoder)"      g1_pat_v2_mm_2000/text_encoder
ln -sfn "$(realpath lingbot-va-base/tokenizer)"         g1_pat_v2_mm_2000/tokenizer
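
After linking, it can help to confirm that all four components resolve before starting the server. The helper below is a hypothetical convenience, not part of the lingbot-va repo; it only checks that each expected subdirectory exists.

```python
from pathlib import Path

# The four subdirectories the assembled checkpoint is expected to contain.
COMPONENTS = ("transformer", "vae", "text_encoder", "tokenizer")


def missing_components(root):
    """Return the component names that do not resolve to a directory under root."""
    root = Path(root)
    # is_dir() follows symlinks, so a dangling link is reported as missing.
    return [name for name in COMPONENTS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_components("g1_pat_v2_mm_2000")
    print("missing:", missing or "none")
```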

Serve with CONFIG_NAME=g1_putawaytools_v21 MODEL_PATH=g1_pat_v2_mm_2000. transformer/config.json has attn_mode: torch (inference-ready).

IMPORTANT: the inference config must match training. Its norm_stat must contain the same min/max values used during training, not the original quantile values. The va_g1_putawaytools_v21_cfg.py in the lingbot-va repo has been updated in lockstep; pairing this checkpoint with the original quantile config would denormalize predicted actions incorrectly. Quick check: grep "1.178246855736" wan_va/configs/va_g1_putawaytools_v21_cfg.py should return a hit.
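
The effect of a mismatched config can be shown with a round trip. This sketch assumes a linear normalization and its inverse; the bounds are invented and are not the values in norm_stat.

```python
def normalize(x, lo, hi):
    """Assumed linear map from [lo, hi] onto [-1, +1]."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0


def denormalize(y, lo, hi):
    """Inverse of normalize for the same (lo, hi)."""
    return (y + 1.0) / 2.0 * (hi - lo) + lo


# Hypothetical bounds: min/max used in training vs. stale quantile bounds.
train_lo, train_hi = -2.0, 2.0
stale_lo, stale_hi = -0.5, 0.5

target = 1.0  # joint value the policy should command
predicted = normalize(target, train_lo, train_hi)  # a perfect prediction

matched = denormalize(predicted, train_lo, train_hi)     # recovers the target
mismatched = denormalize(predicted, stale_lo, stale_hi)  # shrunk command

print(matched, mismatched)
```

With narrower stale bounds, even a perfect prediction is mapped to a smaller joint command, so the arm falls short of the commanded position.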
