UnitreeG1_putawaytoolsV2_minmax_2000step: LingBot-VA G1 post-trained transformer
Fine-tuned transformer for LingBot-VA on Unitree G1 (Dex1), task
XiaoweiLinXL/pi05-unitree-g1-put-away-tools-v2.1:
"Put the battery on the shelf labeled 'battery' and put the screwdriver
on the shelf labeled 'Philips'."
Same data and same recipe as the rndchnk series; the only difference is that
action normalization is MIN/MAX (not q01/q99 quantile). See "Why min/max" below.
- Base: robbyant/lingbot-va-base
- Post-training: 70 demos (43,851 frames), lr 1e-5, FDM v2 recipe with a
  mutually-exclusive per-microstep regime (fdm_prob=0.5, lambda_fdm=1.0).
  Per-step randomized chunk_size ∈ {1..4} and window_size ∈ {4..64}.
- 4 GPUs × grad_accum=4 = effective batch 16, optimizer step 2000 of a
  5000-step schedule (mid-training; the _500step checkpoint performed weakly
  in deployment, so this checkpoint exists for the next deployment test).
- Action normalization: dataset min/max, so every training target lies
  within [-1, +1]. (Codebase variable names are still q01/q99 because that's
  all the loader supports; the values stored there are min/max, a drop-in
  replacement.)
- This repo contains only transformer/; vae/, text_encoder/, and tokenizer/
  are unchanged from robbyant/lingbot-va-base.
Why min/max (the v21 quantile series underperformed)
The earlier v21 5k training under quantile normalization had its right-arm
joint targets overflow the normalized range: R-wrist-roll absmax was 4.11,
R-shoulder-roll 3.55, R-wrist-yaw 3.55. The model's bounded prediction range
([~-1.5, ~+1.5]) cannot match those targets, so during deployment the model
under-predicts at the precise reach-extension moments: the arm under-extends
and misses the shelves. Min/max normalization bounds every target to ±1
(verified absmax = 1.0000 over all 43,851 training rows), eliminating
out-of-range targets and restoring deployment quality.
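The overflow can be reproduced on synthetic data. The sketch below is illustrative only: it assumes the common affine mapping of [lo, hi] onto [-1, +1] (the exact formula in the lingbot-va loader is not shown in this card), the trajectory is made up, and the helper names normalize and quantile are hypothetical.

```python
import random


def normalize(x, lo, hi):
    """Affine map of [lo, hi] onto [-1, +1] (assumed form, not taken from the repo)."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0


def quantile(sorted_vals, q):
    """Nearest-rank quantile of an already sorted list."""
    return sorted_vals[round(q * (len(sorted_vals) - 1))]


random.seed(0)
# Synthetic joint angle: mostly near rest, plus a few rare reach extensions.
angles = [random.gauss(0.0, 0.1) for _ in range(9980)]
angles += [random.uniform(1.0, 1.5) for _ in range(20)]  # 0.2% of frames

ordered = sorted(angles)
q01, q99 = quantile(ordered, 0.01), quantile(ordered, 0.99)
lo, hi = ordered[0], ordered[-1]

# Largest normalized magnitude under each set of statistics.
quantile_absmax = max(abs(normalize(a, q01, q99)) for a in angles)
minmax_absmax = max(abs(normalize(a, lo, hi)) for a in angles)

print(f"quantile absmax: {quantile_absmax:.2f}")
print(f"min/max absmax:  {minmax_absmax:.4f}")
```

Because the rare extension frames fall outside the q01/q99 band, their quantile-normalized values land well outside [-1, +1], while min/max statistics keep every value inside it by construction.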
Assemble an eval-ready checkpoint
hf download robbyant/lingbot-va-base --local-dir lingbot-va-base
hf download EmbodyX/UnitreeG1_putawaytoolsV2_minmax_2000step --local-dir g1_pat_v2_mm_2000_dl
mkdir -p g1_pat_v2_mm_2000
ln -sf $(realpath g1_pat_v2_mm_2000_dl/transformer) g1_pat_v2_mm_2000/transformer
ln -sf $(realpath lingbot-va-base/vae) g1_pat_v2_mm_2000/vae
ln -sf $(realpath lingbot-va-base/text_encoder) g1_pat_v2_mm_2000/text_encoder
ln -sf $(realpath lingbot-va-base/tokenizer) g1_pat_v2_mm_2000/tokenizer
Serve with CONFIG_NAME=g1_putawaytools_v21 MODEL_PATH=g1_pat_v2_mm_2000.
transformer/config.json has attn_mode: torch (inference-ready).
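Before serving, it can help to confirm that the assembled directory is complete. The following is a minimal sketch, not part of the lingbot-va repo; the function names missing_components and read_attn_mode are hypothetical, and the only facts taken from this card are the four subfolder names and the attn_mode key in transformer/config.json.

```python
import json
from pathlib import Path

# Subfolders an eval-ready checkpoint is expected to contain, per this card.
EXPECTED_SUBDIRS = ("transformer", "vae", "text_encoder", "tokenizer")


def missing_components(model_path):
    """Return expected subfolders that are absent or are dangling symlinks."""
    root = Path(model_path)
    # Path.is_dir() follows symlinks, so a broken link counts as missing.
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


def read_attn_mode(model_path):
    """Return the attn_mode value from transformer/config.json, or None if unset."""
    config_file = Path(model_path) / "transformer" / "config.json"
    return json.loads(config_file.read_text()).get("attn_mode")
```

For example, missing_components("g1_pat_v2_mm_2000") should return an empty list once all four symlinks resolve, and read_attn_mode("g1_pat_v2_mm_2000") should then return "torch".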
IMPORTANT: config must match training. The inference config's
norm_stat must contain the same MIN/MAX values used during training
(NOT the original quantile values). The va_g1_putawaytools_v21_cfg.py
in the lingbot-va repo has been updated in lockstep; using the original
quantile config at inference with this checkpoint would denormalize
actions incorrectly.
Quick check: grep "1.178246855736" wan_va/configs/va_g1_putawaytools_v21_cfg.py
should return a hit.
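A short sketch of what a mismatch does, assuming the usual affine normalization onto [-1, +1] and its inverse. The statistics below are made-up single-joint values, not numbers from the real config, and the dict layout with q01/q99 keys is a simplification of whatever norm_stat actually holds.

```python
def normalize(x, stats):
    """Map [q01, q99] onto [-1, +1] (assumed form)."""
    lo, hi = stats["q01"], stats["q99"]
    return 2.0 * (x - lo) / (hi - lo) - 1.0


def denormalize(y, stats):
    """Inverse of normalize: map [-1, +1] back onto [q01, q99]."""
    lo, hi = stats["q01"], stats["q99"]
    return (y + 1.0) / 2.0 * (hi - lo) + lo


# Hypothetical values for one joint, in radians.
train_stats = {"q01": -1.2, "q99": 1.2}  # min/max stored under the quantile names
stale_stats = {"q01": -0.4, "q99": 0.4}  # narrower band, as quantile stats would be

target = 1.0                                  # desired joint angle
prediction = normalize(target, train_stats)   # what a perfect model would output

recovered = denormalize(prediction, train_stats)  # matching config
wrong = denormalize(prediction, stale_stats)      # stale quantile config

print(f"matching config: {recovered:.3f} rad")
print(f"stale config:    {wrong:.3f} rad")
```

With the stale statistics the same network output maps to a much smaller joint angle, which is the under-extension failure this card is trying to avoid.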