UnitreeG1_ethernetCable_2000step — LingBot-VA G1 post-trained transformer

Fine-tuned transformer for LingBot-VA on Unitree G1 (Dex1), task XiaoweiLinXL/unitree_insert_the_ethernet_cable_to_the_tv_box: "Insert the ethernet cable into the tv box."

Base: robbyant/lingbot-va-base
Post-training: 69 demos, single-task (cable insertion), lr 1e-5, FDM v2 recipe. Each microstep runs exactly one of two mutually exclusive regimes, chosen by a rank-synced coin flip (fdm_prob=0.5): either FDM video-only (L_fdm, Eq. 13, lambda_fdm=1.0) or standard IDM (L_dyn + L_inv), with one forward and one backward pass either way. chunk_size ∈ {1,2,3,4} and window_size ∈ {4..64} are randomized per step.
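The regime selection described above can be sketched as follows. This is a minimal illustration, not the training code: the function name `sample_microstep_regime`, the seeding scheme, and drawing chunk_size and window_size from the same shared stream are all assumptions made for the example. The only point it shows is that seeding from a value every rank already knows gives identical draws on all ranks without any communication.

```python
import random

def sample_microstep_regime(seed, microstep, fdm_prob=0.5):
    """Hypothetical helper: draw the regime and sizes for one microstep.

    Every rank calls this with the same (seed, microstep), so every rank
    gets the same result and the coin is "rank-synced" by construction.
    """
    # Assumed seeding scheme: mix the run seed with the microstep index.
    rng = random.Random(seed * 1_000_003 + microstep)
    # Mutually exclusive: the microstep is either FDM or IDM, never both.
    regime = "fdm" if rng.random() < fdm_prob else "idm"
    chunk_size = rng.choice([1, 2, 3, 4])   # chunk_size in {1,2,3,4}
    window_size = rng.randint(4, 64)        # window_size in {4..64}, inclusive
    return regime, chunk_size, window_size

if __name__ == "__main__":
    for step in range(4):
        print(step, sample_microstep_regime(seed=0, microstep=step))
```

In a real distributed run the same effect could also be achieved by broadcasting rank 0's draw; the shared-seed variant is shown only because it needs no process group to run.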
4 GPUs × grad_accum=4 = effective batch 16, optimizer step 2000 (final of a 2000-step schedule).
Final losses: video=0.088, action=0.0016, fdm=0.085, grad_norm=0.036. These are healthier loss levels than in the put_away_tools v21 5k run, whose suspiciously low video=0.0075 indicated overfitting to a compressed distribution.
This repo contains only transformer/ — vae/, text_encoder/, tokenizer/ are unchanged from robbyant/lingbot-va-base.

⚠️ Quantile normalization warning

This checkpoint was trained under quantile (q01/q99) normalization. Smoke testing at encode time showed normalized action absmax = 2.77 for ep0, well above the model's bounded prediction range. The same failure mode hurt put_away_tools v21 deployment — predictions under-shoot the precise final-approach moments. For an insertion task this is especially risky.
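The smoke test mentioned above can be approximated with a few lines of NumPy. This is a sketch under a stated assumption: it uses the common convention that maps q01 to -1 and q99 to +1, which may differ from the formula the training pipeline actually applies. The function names are hypothetical.

```python
import numpy as np

def quantile_normalize(actions, q01, q99, eps=1e-8):
    """Assumed convention: q01 maps to -1 and q99 maps to +1 per dimension."""
    return 2.0 * (actions - q01) / (q99 - q01 + eps) - 1.0

def normalized_absmax(actions, q01, q99):
    """Largest absolute normalized value over all timesteps and dimensions."""
    return float(np.max(np.abs(quantile_normalize(actions, q01, q99))))

if __name__ == "__main__":
    # Toy episode: (T, D) array of raw actions, not real G1 data.
    episode = np.array([[0.0, 0.1], [0.5, 0.2], [3.0, 0.3]])
    q01 = np.array([0.0, 0.1])
    q99 = np.array([1.0, 0.3])
    # Values far above 1.0 mean the episode leaves the q01..q99 band.
    print(normalized_absmax(episode, q01, q99))
```

Any sample outside the q01..q99 band lands outside [-1, 1], so an absmax well above 1 signals targets that a bounded output head cannot reach.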

If deployment performance is weak: re-encode the norm_stat with **min/max zero-inclusion** (see scripts/compute_g1_norm_stats.py extended with the zero-inclusion logic from compute_ur3_bimanual_norm_stats.py) and retrain. The fix took ~36 h on 8 GPUs for put_away_tools v21.
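As a rough illustration of what min/max statistics with zero-inclusion might look like, the sketch below widens each dimension's min/max range so that it always contains 0, then maps that range to [-1, 1]. This interpretation of "zero-inclusion" is an assumption; the referenced scripts are the authoritative version, and the function names here are hypothetical.

```python
import numpy as np

def minmax_zero_inclusive_stats(actions):
    """Per-dimension (lo, hi) over a (T, D) array, widened to contain 0.

    Assumed meaning of "zero-inclusion": lo <= 0 <= hi for every dimension.
    """
    lo = np.minimum(actions.min(axis=0), 0.0)
    hi = np.maximum(actions.max(axis=0), 0.0)
    return lo, hi

def minmax_normalize(actions, lo, hi, eps=1e-8):
    """Map [lo, hi] to [-1, 1]; data used to compute the stats stays in range."""
    return 2.0 * (actions - lo) / (hi - lo + eps) - 1.0

if __name__ == "__main__":
    # Toy data, not real G1 actions.
    actions = np.array([[0.5, -2.0], [1.5, -1.0]])
    lo, hi = minmax_zero_inclusive_stats(actions)
    print(lo, hi)
    print(np.abs(minmax_normalize(actions, lo, hi)).max())
```

Unlike quantile statistics, min/max statistics bound every training sample inside [-1, 1] by construction, which is the property the warning above is concerned with.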

Assemble an eval-ready checkpoint

hf download robbyant/lingbot-va-base                          --local-dir lingbot-va-base
hf download EmbodyX/UnitreeG1_ethernetCable_2000step           --local-dir g1_eth_2000_dl

mkdir -p g1_eth_2000
ln -sf $(realpath g1_eth_2000_dl/transformer)   g1_eth_2000/transformer
ln -sf $(realpath lingbot-va-base/vae)          g1_eth_2000/vae
ln -sf $(realpath lingbot-va-base/text_encoder) g1_eth_2000/text_encoder
ln -sf $(realpath lingbot-va-base/tokenizer)    g1_eth_2000/tokenizer
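Before serving, it can help to confirm that all four symlinks resolve. The check below is a small sketch using only the standard library; the helper name `missing_components` is hypothetical, and the directory names are taken from the commands above.

```python
from pathlib import Path

# Sub-directories the assembled checkpoint is expected to contain.
REQUIRED = ("transformer", "vae", "text_encoder", "tokenizer")

def missing_components(model_path):
    """Return the required sub-directories that are absent or dangling.

    Path.is_dir() follows symlinks, so a broken link counts as missing.
    """
    root = Path(model_path)
    return [name for name in REQUIRED if not (root / name).is_dir()]

if __name__ == "__main__":
    missing = missing_components("g1_eth_2000")
    if missing:
        print("missing or broken:", ", ".join(missing))
    else:
        print("all components present")
```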

Serve with CONFIG_NAME=g1_ethernet_cable MODEL_PATH=g1_eth_2000. transformer/config.json has attn_mode: torch (inference-ready).
