---
license: apache-2.0
tags:
- robotics
- lingbot-va
- unitree-g1
- world-model
---
# UnitreeG1_ethernetCable_2000step: LingBot-VA G1 post-trained transformer

Fine-tuned `transformer` for LingBot-VA on Unitree G1 (Dex1), task
`XiaoweiLinXL/unitree_insert_the_ethernet_cable_to_the_tv_box`:
*"Insert the ethernet cable into the tv box."*

- Base: `robbyant/lingbot-va-base`
- Post-training: 69 demos, single-task (cable insertion), lr 1e-5,
  **FDM v2 recipe**: mutually exclusive per-microstep regime (rank-synced
  coin `fdm_prob=0.5`: FDM video-only L_fdm Eq.13 `lambda_fdm=1.0` OR
  standard IDM L_dyn+L_inv; one forward, one backward). Per-step
  **randomized chunk_size ∈ {1,2,3,4}** and **window_size ∈ {4..64}**.
- 4 GPUs × `grad_accum=4` = effective batch 16, optimizer **step 2000**
  (final of a 2000-step schedule).
- Final losses: video=0.088, action=0.0016, fdm=0.085, grad_norm=0.036.
  This is a healthier loss level than the put_away_tools v21 5k run (which
  had suspiciously low video=0.0075, indicating overfit on a compressed
  distribution).
- This repo contains **only `transformer/`**; `vae/`, `text_encoder/`, and
  `tokenizer/` are unchanged from `robbyant/lingbot-va-base`.

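The regime selection in the recipe above can be sketched in a few lines. This is an illustrative sketch only, not the training code: the function names, the `base_seed` parameter, and the loss-dictionary keys are hypothetical, and deriving the draw from a shared seed is just one way to keep ranks in sync (a broadcast from rank 0 would work as well). The effective-batch line assumes a per-GPU micro-batch of 1, which the stated 4 × 4 = 16 implies.

```python
import random

FDM_PROB = 0.5              # `fdm_prob` from the recipe
LAMBDA_FDM = 1.0            # `lambda_fdm` from the recipe
CHUNK_SIZES = (1, 2, 3, 4)  # randomized chunk_size choices
WINDOW_RANGE = (4, 64)      # inclusive window_size range
NUM_GPUS, GRAD_ACCUM = 4, 4
EFFECTIVE_BATCH = NUM_GPUS * GRAD_ACCUM  # assumes micro-batch 1 per GPU


def sample_regime(base_seed, step, microstep):
    """Flip the FDM/IDM coin for one microstep.

    The seed depends only on (base_seed, step, microstep), never on the
    rank, so every rank reaches the same decision without communication.
    """
    rng = random.Random(f"regime:{base_seed}:{step}:{microstep}")
    return "fdm" if rng.random() < FDM_PROB else "idm"


def sample_step_shapes(base_seed, step):
    """Draw chunk_size and window_size once per optimizer step."""
    rng = random.Random(f"shapes:{base_seed}:{step}")
    return {
        "chunk_size": rng.choice(CHUNK_SIZES),
        "window_size": rng.randint(*WINDOW_RANGE),  # both ends inclusive
    }


def microstep_loss(regime, losses):
    """Combine per-term losses for the chosen regime (mutually exclusive)."""
    if regime == "fdm":
        return LAMBDA_FDM * losses["fdm"]       # video-only FDM term
    return losses["dyn"] + losses["inv"]        # standard IDM terms


if __name__ == "__main__":
    shapes = sample_step_shapes(base_seed=0, step=1)
    for micro in range(GRAD_ACCUM):
        print(micro, sample_regime(0, 1, micro), shapes)
```

Because each microstep uses exactly one regime, a single forward and backward pass suffices, which matches the "one forward, one backward" note in the recipe.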
## ⚠️ Quantile normalization warning

This checkpoint was trained under **quantile (q01/q99) normalization**.
Smoke testing at encode time showed `normalized action absmax = 2.77` for
ep0, well above the model's bounded prediction range. The same failure
mode hurt `put_away_tools v21` deployment: predictions under-shoot the
precise final-approach moments. For an insertion task this is especially
risky.

If deployment performance is weak: re-encode the norm_stat with **min/max
+ zero-inclusion** (see `scripts/compute_g1_norm_stats.py` extended with
the zero-inclusion logic from `compute_ur3_bimanual_norm_stats.py`) and
retrain. The fix took ~36 h on 8 GPUs for put_away_tools v21.

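To illustrate why quantile bounds let normalized actions escape the bounded range while min/max bounds do not, here is a small self-contained sketch on synthetic data. It does not reproduce `scripts/compute_g1_norm_stats.py`: the affine map to [-1, 1], the reading of "zero-inclusion" as widening the interval so it contains 0, and all function names are assumptions made for illustration.

```python
import random
import statistics


def quantile_bounds(values):
    """Return (q01, q99) for one action dimension."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return cuts[0], cuts[-1]


def minmax_zero_bounds(values):
    """Return (min, max), widened so the interval always contains 0."""
    return min(min(values), 0.0), max(max(values), 0.0)


def normalize(x, lo, hi):
    """Affine map sending [lo, hi] to [-1, 1]; values outside overshoot."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0


def absmax(values, bounds):
    """Largest absolute normalized value under the given bounds."""
    lo, hi = bounds
    return max(abs(normalize(v, lo, hi)) for v in values)


# Synthetic 1-D action trace: mostly small motions plus a few large
# excursions, standing in for rare final-approach moments.
rng = random.Random(0)
actions = [rng.gauss(0.0, 0.05) for _ in range(1000)] + [0.6, 0.7, -0.5]

q_absmax = absmax(actions, quantile_bounds(actions))
m_absmax = absmax(actions, minmax_zero_bounds(actions))
print(f"quantile (q01/q99) absmax: {q_absmax:.2f}")
print(f"min/max + zero absmax:     {m_absmax:.2f}")
```

Under quantile bounds the rare large actions map far outside [-1, 1], so a model with a bounded output cannot reproduce them; under min/max bounds every training action stays inside the range by construction.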
## Assemble an eval-ready checkpoint

```bash
# Download the base model and this repo's fine-tuned transformer.
hf download robbyant/lingbot-va-base --local-dir lingbot-va-base
hf download EmbodyX/UnitreeG1_ethernetCable_2000step --local-dir g1_eth_2000_dl

# Combine them via symlinks; -n replaces an existing link instead of
# descending into the directory it points to when the commands are re-run.
mkdir -p g1_eth_2000
ln -sfn "$(realpath g1_eth_2000_dl/transformer)" g1_eth_2000/transformer
ln -sfn "$(realpath lingbot-va-base/vae)" g1_eth_2000/vae
ln -sfn "$(realpath lingbot-va-base/text_encoder)" g1_eth_2000/text_encoder
ln -sfn "$(realpath lingbot-va-base/tokenizer)" g1_eth_2000/tokenizer
```

Serve with `CONFIG_NAME=g1_ethernet_cable MODEL_PATH=g1_eth_2000`.
`transformer/config.json` has `attn_mode: torch` (inference-ready).
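
Before serving, it can help to confirm that the assembled directory resolves correctly. The following is a minimal sketch of such a check, not part of the LingBot-VA tooling: the helper names are hypothetical, and it assumes `attn_mode` is a top-level key in `transformer/config.json`.

```python
import json
from pathlib import Path

# Sub-directories that the assembly step links into the checkpoint directory.
COMPONENTS = ("transformer", "vae", "text_encoder", "tokenizer")


def missing_components(model_path):
    """Return component directories that do not resolve under model_path."""
    root = Path(model_path)
    # is_dir() follows symlinks, so a dangling link counts as missing.
    return [name for name in COMPONENTS if not (root / name).is_dir()]


def read_attn_mode(model_path):
    """Return `attn_mode` from transformer/config.json, or None if absent."""
    config_file = Path(model_path) / "transformer" / "config.json"
    if not config_file.is_file():
        return None
    with config_file.open() as fh:
        return json.load(fh).get("attn_mode")


if __name__ == "__main__":
    path = "g1_eth_2000"
    print("missing:", missing_components(path) or "none")
    print("attn_mode:", read_attn_mode(path))
```

An empty "missing" list and an `attn_mode` of `torch` indicate the layout matches what the steps above produce.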