RDT-1B fine-tuned on genesis-hr-bench (step 200,000)

DeepSpeed-ZeRO fine-tune of robotics-diffusion-transformer/rdt-1b on the zhouqh/hrbench genesis-hr-bench dataset, converted to RDT's HDF5 schema (single Franka, 8-D state/action, 2 cameras).

Training

Setting	Value
Base model	`robotics-diffusion-transformer/rdt-1b` (RDT-1B, ~1.2B params)
Dataset	`zhouqh/hrbench` → RDT-HDF5 (8-D state, `cam_high` + `cam_right_wrist`, `instruction.json`)
Internal step (rdt)	200,000 (final, on-disk `checkpoint-200000/`)
Optimizer step (tqdm/wandb)	89,105 (cross-reference for the wandb loss curve)
Per-GPU batch	16
GPUs	8 × H200
Effective batch	128 (no grad accumulation)
Optimizer	AdamW via DeepSpeed ZeRO-2, lr 1e-4
Precision	bfloat16
EMA	enabled (max_value=0.9999, power=0.75)
Hardware	1 node, FAIR Cloud (h200 partition), slurm job `1371522`
Wandb run	`warm-puddle-8` / `1v8j3fur`
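
The EMA row gives only `max_value` and `power`. As a rough illustration of what those two knobs control, the sketch below uses the power-law warmup schedule common to diffusion-policy-style EMA helpers. Whether rdt's EMA follows this exact formula is an assumption, as are the `inv_gamma`, `min_value` and `update_after_step` defaults and the function name `ema_decay`.

```python
def ema_decay(step, max_value=0.9999, power=0.75,
              inv_gamma=1.0, min_value=0.0, update_after_step=0):
    """Power-law EMA warmup (assumed schedule, not confirmed against rdt).

    The decay starts at 0, rises as 1 - (1 + t / inv_gamma) ** -power,
    and is clamped to [min_value, max_value].
    """
    t = max(0, step - update_after_step - 1)
    if t <= 0:
        return 0.0
    value = 1.0 - (1.0 + t / inv_gamma) ** (-power)
    return max(min_value, min(value, max_value))


# Early steps track the raw weights closely; late steps average over a long window.
early = ema_decay(2)
late = ema_decay(10**9)
```

Under this assumed schedule, `power` sets how quickly the decay approaches its ceiling and `max_value` is the ceiling itself.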

Note on step numbers: rdt has two counters — the internal one used in the on-disk dir name (checkpoint-200000) and the tqdm/optimizer step shown on the wandb x-axis (89,105). They differ because rdt's training loop counts data-iterator iterations rather than optimizer updates. The HF repo is named by the on-disk counter so it matches the artifact you'd see if you replicated locally.
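
The arithmetic behind the table and the two counters can be written out as a small sketch. The values come from the table above; the `samples_seen` figure assumes every optimizer update consumes exactly one full effective batch, which the card does not state.

```python
# Values copied from the training table.
per_gpu_batch = 16
num_gpus = 8
grad_accum = 1  # "no grad accumulation"

effective_batch = per_gpu_batch * num_gpus * grad_accum

internal_step = 200_000   # on-disk counter (checkpoint-200000)
optimizer_step = 89_105   # tqdm/wandb counter

# How many internal ticks correspond to one optimizer update, on average.
ratio = internal_step / optimizer_step

# Assumption: one full effective batch per optimizer update.
samples_seen = optimizer_step * effective_batch

print(f"effective batch: {effective_batch}")
print(f"internal/optimizer ratio: {ratio:.3f}")
print(f"samples seen (assumed): {samples_seen:,}")
```

When reading the wandb loss curve, look up a checkpoint by its optimizer step, not by the number in its directory name.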

The full RDT policy config is in config.json (architecture: 28-layer transformer, hidden_size=2048, action_dim=128, pred_horizon=64).
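
A quick way to confirm these values after downloading `config.json` is to search the parsed JSON for each key. The sketch below makes no assumption about nesting, since the card does not describe the file's exact layout. The `sample` dict is a hypothetical stand-in for the real file, and the helper names are illustrative.

```python
import json


def find_key(node, key):
    """Return every value stored under `key` anywhere in a nested dict/list."""
    hits = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                hits.append(v)
            hits.extend(find_key(v, key))
    elif isinstance(node, list):
        for item in node:
            hits.extend(find_key(item, key))
    return hits


# Architecture values quoted in this card.
EXPECTED = {"depth": 28, "hidden_size": 2048, "action_dim": 128, "pred_horizon": 64}


def check_config(cfg):
    """Map each expected key to True if the expected value appears somewhere in cfg."""
    return {key: want in find_key(cfg, key) for key, want in EXPECTED.items()}


# Hypothetical stand-in; replace with json.load(open("config.json")).
sample = json.loads(
    '{"action_dim": 128, "pred_horizon": 64,'
    ' "config": {"rdt": {"hidden_size": 2048, "depth": 28}}}'
)
result = check_config(sample)
```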

Files

File	Purpose
`ema/model.safetensors`	EMA weights (~2.3 GB) — primary inference artifact, what `RDTRunner.from_pretrained()` uses by default.
`pytorch_model.bin`	Non-EMA weights (~2.3 GB) — for ablation against EMA.
`config.json`	RDT architecture config (depth, hidden_size, action_dim, noise scheduler, etc.).

The DeepSpeed ZeRO optimizer shards (~14 GB) and resume scaffolding (random_states_*.pkl, scheduler.bin, latest, zero_to_fp32.py) were not uploaded — this repo is inference-only.
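
The sizes above are roughly what a back-of-the-envelope estimate predicts. The sketch assumes 2 bytes per parameter for the bfloat16 weight files and, for the optimizer state, fp32 master weights plus two fp32 AdamW moments per parameter. That optimizer layout is a common mixed-precision arrangement and is assumed here, not confirmed by the card.

```python
PARAMS = 1.2e9     # "~1.2B params" from the training table
BYTES_BF16 = 2
BYTES_FP32 = 4

# One bf16 copy of the weights (EMA or non-EMA file).
weights_gb = PARAMS * BYTES_BF16 / 1e9

# Assumed optimizer state: fp32 master weights + AdamW first and second moments.
optimizer_gb = PARAMS * 3 * BYTES_FP32 / 1e9

print(f"weights: ~{weights_gb:.1f} GB, optimizer state: ~{optimizer_gb:.1f} GB")
```

Both estimates land close to the listed ~2.3 GB and ~14 GB. The small gaps are expected because the parameter count is itself approximate.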

Usage

# Standard rdt eval path (EMA weights)
import torch

from scripts.agilex_model import create_model  # in baseline/rdt/

model = create_model(
    args=...,
    dtype=torch.bfloat16,
    pretrained="zimplex/rdt-1b-genesis-hr-bench-step200000",
    pretrained_text_encoder_name_or_path="google/t5-v1_1-xxl",
    pretrained_vision_encoder_name_or_path="google/siglip-so400m-patch14-384",
)

The genesis-hr-bench-specific dataloader (baseline/rdt/data/hdf5_vla_dataset.py) expects 8-D state, cam_high + cam_right_wrist, and instruction.json. See baseline/rdt_overrides/ for the overlay applied to the upstream submodule.
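
Before pointing the dataloader at converted episodes, it can help to check them against these expectations. The sketch below works on shapes and names already read from an episode, so it does not depend on the HDF5 key layout, which this card does not specify. The function name and argument layout are illustrative and not part of the rdt codebase.

```python
REQUIRED_CAMERAS = ("cam_high", "cam_right_wrist")
STATE_DIM = 8  # single Franka, 8-D state/action


def check_episode(state_shape, action_shape, cameras, has_instruction):
    """Return a list of problems; an empty list means the episode looks fine."""
    problems = []
    for name, shape in (("state", state_shape), ("action", action_shape)):
        # Expect a (timesteps, 8) array for both state and action.
        if len(shape) != 2 or shape[1] != STATE_DIM:
            problems.append(f"{name} must be (T, {STATE_DIM}), got {tuple(shape)}")
    for cam in REQUIRED_CAMERAS:
        if cam not in cameras:
            problems.append(f"missing camera stream: {cam}")
    if not has_instruction:
        problems.append("missing instruction.json")
    return problems


ok = check_episode((250, 8), (250, 8), ["cam_high", "cam_right_wrist"], True)
bad = check_episode((250, 7), (250, 8), ["cam_high"], False)
```

The shapes and camera names would normally be collected from each episode file, for example with h5py, before calling the check.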

Provenance

Wandb run: https://wandb.ai/multi-agent-world-model/roboticDiffusionTransformer/runs/1v8j3fur
Slurm job: 1371522 (h200_mrs_2)
Local exp: runs/rdt/finetune/rdt_20260521/checkpoint-200000/
Training code: scripts/finetune_rdt.sh

Completes the 7-checkpoint genesis-hr-bench finetune sweep — see CHECKPOINT_SUMMARY.md.
