	---
	license: apache-2.0
	tags:
	- robotics
	- lingbot-va
	- unitree-g1
	- world-model
	---

	# UnitreeG1_ethernetCable_2000step — LingBot-VA G1 post-trained transformer

	Fine-tuned `transformer` for LingBot-VA on Unitree G1 (Dex1), task
	`XiaoweiLinXL/unitree_insert_the_ethernet_cable_to_the_tv_box`:
	"Insert the ethernet cable into the tv box."

	- Base: `robbyant/lingbot-va-base`
	- Post-training: 69 demos, single-task (cable insertion), lr 1e-5,
	FDM v2 recipe. Each micro-step runs exactly one of two mutually
	exclusive regimes, chosen by a coin flip synchronized across ranks
	(`fdm_prob=0.5`): either the video-only FDM loss L_fdm (Eq. 13,
	`lambda_fdm=1.0`) or the standard IDM losses L_dyn + L_inv. Either
	way there is one forward and one backward pass per micro-step.
	`chunk_size` ∈ {1,2,3,4} and `window_size` ∈ {4..64} are re-sampled
	at every step.
	- 4 GPUs × `grad_accum=4` = effective batch 16, optimizer step 2000
	(final of a 2000-step schedule).
	- Final losses: video=0.088, action=0.0016, fdm=0.085, grad_norm=0.036.
	These look healthier than the put_away_tools v21 5k run, whose
	suspiciously low video loss (0.0075) pointed to overfitting on a
	compressed distribution.
	- This repo contains only `transformer/` — `vae/`, `text_encoder/`,
	`tokenizer/` are unchanged from `robbyant/lingbot-va-base`.
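
The regime selection described in the recipe can be sketched as follows. This is a minimal illustration, not the training code: the function names, the string-based seeding scheme, and the reading of `{4..64}` as an inclusive integer range are all assumptions. The idea is that seeding the RNG from the step and micro-step (never from the rank) gives every rank the same coin flip without any communication.

```python
import random

FDM_PROB = 0.5                    # from the recipe: fdm_prob=0.5
CHUNK_SIZES = (1, 2, 3, 4)        # chunk_size in {1,2,3,4}
WINDOW_MIN, WINDOW_MAX = 4, 64    # window_size in {4..64}; inclusive integers assumed


def sample_microstep(seed: int, step: int, micro: int) -> dict:
    """Draw the regime and sequence shape for one micro-step.

    Hypothetical helper: the RNG depends only on (seed, step, micro), so all
    ranks that call it with the same arguments get the same result.
    """
    rng = random.Random(f"{seed}:{step}:{micro}")
    regime = "fdm" if rng.random() < FDM_PROB else "idm"
    return {
        "regime": regime,
        "chunk_size": rng.choice(CHUNK_SIZES),
        "window_size": rng.randint(WINDOW_MIN, WINDOW_MAX),
    }


def microstep_loss(regime: str, l_fdm: float, l_dyn: float, l_inv: float,
                   lambda_fdm: float = 1.0) -> float:
    """Exactly one branch contributes per micro-step (mutually exclusive)."""
    if regime == "fdm":
        return lambda_fdm * l_fdm      # video-only FDM loss
    return l_dyn + l_inv               # standard IDM losses
```

Calling `sample_microstep(seed, step, micro)` on each of the 4 ranks with identical arguments returns identical dictionaries, which is what keeps the ranks in the same regime for that micro-step.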

	## ⚠️ Quantile normalization warning

	This checkpoint was trained under quantile (q01/q99) normalization.
	A smoke test at encode time showed `normalized action absmax = 2.77` for
	ep0, well above the model's bounded prediction range, so the model
	cannot reproduce the largest actions in the data. The same failure
	mode hurt `put_away_tools v21` deployment: predictions undershoot
	during the precise final-approach moments. For an insertion task this
	is especially risky.
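
To illustrate why quantile statistics can push normalized actions out of range, here is a small standard-library sketch. It assumes the common convention of mapping q01 to -1 and q99 to +1 (the actual encoder's formula is not shown in this card), and the trajectory is synthetic, not taken from the dataset.

```python
def quantile(sorted_vals, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def quantile_normalize(values, q_lo=0.01, q_hi=0.99):
    """Map q01 -> -1 and q99 -> +1; samples outside that band land outside [-1, 1]."""
    s = sorted(values)
    lo, hi = quantile(s, q_lo), quantile(s, q_hi)
    if hi == lo:
        raise ValueError("degenerate quantile range")
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


# Synthetic single-channel trajectory: many small motions plus one large
# final-approach command (made-up numbers for illustration only).
actions = [0.01 * (i % 20 - 10) for i in range(500)] + [0.35]
normalized = quantile_normalize(actions)
absmax = max(abs(v) for v in normalized)
print(f"normalized action absmax = {absmax:.2f}")
```

The rare large command falls outside the q01/q99 band, so its normalized value exceeds 1 in magnitude. A model whose outputs are bounded cannot hit such a target, which matches the undershoot described above.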

	If deployment performance is weak, re-encode with the norm stats
	recomputed as **min/max + zero-inclusion** (see
	`scripts/compute_g1_norm_stats.py`, extended with the zero-inclusion
	logic from `compute_ur3_bimanual_norm_stats.py`) and retrain. For
	put_away_tools v21 the fix took ~36 h on 8 GPUs.
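
A rough sketch of what min/max statistics with zero-inclusion could look like is given below. It is an interpretation, not a copy of either script: "zero-inclusion" is read here as widening the range so that 0.0 always lies inside it, the target interval [-1, 1] is assumed, and both function names are hypothetical.

```python
def minmax_zero_inclusive_stats(values):
    """Min/max of one action channel, widened so 0.0 always lies inside the range."""
    return min(min(values), 0.0), max(max(values), 0.0)


def minmax_normalize(values, lo, hi):
    """Map [lo, hi] -> [-1, 1]; every sample used for the stats stays in bounds."""
    if hi == lo:
        raise ValueError("degenerate range: channel is identically zero")
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


# Made-up one-sided channel (all values positive), for illustration only.
actions = [0.02, 0.05, 0.11, 0.35]
lo, hi = minmax_zero_inclusive_stats(actions)   # (0.0, 0.35) rather than (0.02, 0.35)
normalized = minmax_normalize(actions, lo, hi)
zero_cmd = minmax_normalize([0.0], lo, hi)[0]   # a zero command also stays in range
```

With these statistics no training sample can exceed the bounded range, at the cost of lower resolution for typical small motions when a channel has rare extreme values.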

	## Assemble an eval-ready checkpoint

	```bash
	# Fetch the base model (vae, text_encoder, tokenizer) and this repo (transformer).
	hf download robbyant/lingbot-va-base --local-dir lingbot-va-base
	hf download EmbodyX/UnitreeG1_ethernetCable_2000step --local-dir g1_eth_2000_dl

	# Link all four components into one directory; -n replaces an existing link on re-runs.
	mkdir -p g1_eth_2000
	ln -sfn "$(realpath g1_eth_2000_dl/transformer)" g1_eth_2000/transformer
	ln -sfn "$(realpath lingbot-va-base/vae)" g1_eth_2000/vae
	ln -sfn "$(realpath lingbot-va-base/text_encoder)" g1_eth_2000/text_encoder
	ln -sfn "$(realpath lingbot-va-base/tokenizer)" g1_eth_2000/tokenizer
	```
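
Before serving, it can help to confirm that the assembled directory is complete. The check below is an optional sketch (the helper name is hypothetical); it only tests that the four sub-directories named above exist and that their symlinks resolve.

```python
from pathlib import Path

REQUIRED = ("transformer", "vae", "text_encoder", "tokenizer")


def missing_components(model_path):
    """Return the required sub-directories that are absent or dangling symlinks."""
    root = Path(model_path)
    # Path.is_dir() follows symlinks, so a link to a deleted download counts as missing.
    return [name for name in REQUIRED if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_components("g1_eth_2000")
    print("missing:", missing if missing else "none")
```

An empty list means all four components are in place.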

	Serve with `CONFIG_NAME=g1_ethernet_cable MODEL_PATH=g1_eth_2000`.
	`transformer/config.json` has `attn_mode: torch` (inference-ready).