	# dit_block_tower_norm_fix

	Diffusion policy checkpoint for the build_block_tower task (with DAgger rounds 1.0.0–1.4.0), trained with per-timestep (H,D) RAMEN action normalization and corrected action chunk semantics (slot 0 = current action).
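
To illustrate what per-timestep (H,D) statistics mean for an action chunk, here is a minimal sketch. It is not the RAMEN transform itself, whose exact definition and `ramen_stats.json` layout are not described in this card; plain mean/std z-scoring is used as a stand-in, and the function names are hypothetical. The point it shows is that each slot of the chunk (slot 0 = current action) gets its own statistics rather than sharing one vector of shape (D,).

```python
import numpy as np

# Shapes taken from the Config section: horizon H=32, action dim D=17.
H, D = 32, 17


def fit_per_timestep_stats(chunks):
    """Compute one mean/std per (slot, action dim) from chunks of shape (N, H, D)."""
    return chunks.mean(axis=0), chunks.std(axis=0)


def normalize_chunk(actions, mean, std, eps=1e-8):
    """Z-score actions of shape (..., H, D) with stats of shape (H, D).

    ASSUMPTION: z-scoring stands in for the real RAMEN normalization.
    """
    return (actions - mean) / (std + eps)


def unnormalize_chunk(normed, mean, std, eps=1e-8):
    """Invert normalize_chunk, e.g. on the policy's predicted chunk."""
    return normed * (std + eps) + mean
```

With per-timestep statistics, slot 0 (the current action) and slot 31 (the furthest prediction) can have different scales, which a single shared (D,) statistic would average away.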

	## Checkpoint

	| Step | train_loss | Status | Hash |
	|-------|-----------|--------|------|
	| 29000 | 0.0047 | partial (29k/50k) | `8843965c8dcf0fc68b71784fe0875b7e43eb25aa4e05b59bf796b2280b50ea96` |

	Training was interrupted by the 1-day walltime limit at step ~29588 of 50000. The loss was still decreasing at that point, so training will be resumed from this checkpoint.

	## Config

	- dataset: `villekuosmanen/build_block_tower` + DAgger rounds 1.0.0–1.4.0
	- batch_size: 80 per GPU (320 global, 4x GPUs)
	- optimizer_lr: 3e-4
	- lr_scheduler: cosine (warmup 500 steps, min_lr_scale 0.1)
	- horizon: 32, n_action_steps: 32
	- noise_scheduler: DDIM, 100 train timesteps, 20 inference steps
	- observation_encoder: CLIP ViT-B/16 (vision + text)
	- action normalization: RAMEN with per-timestep stats (H=32, D=17)
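
The learning-rate settings above can be sketched as a function of the training step. This is an approximation under stated assumptions, not the repository's scheduler: it assumes linear warmup from 0, cosine decay over the remaining steps down to `min_lr_scale * optimizer_lr`, and a total of 50000 steps (the planned run length from the Checkpoint section). The function name `lr_at_step` is hypothetical.

```python
import math


def lr_at_step(step, base_lr=3e-4, warmup_steps=500,
               total_steps=50_000, min_lr_scale=0.1):
    """Cosine schedule with linear warmup (assumed form, see lead-in)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup steps.
        return base_lr * step / warmup_steps
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup_steps) / (total_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    # Decay from 1.0 down to min_lr_scale, then scale by base_lr.
    return base_lr * (min_lr_scale + (1.0 - min_lr_scale) * cosine)
```

Under these assumptions, calling `lr_at_step(29_000)` gives the learning rate that would have been in effect at the saved checkpoint, which is useful to check against the W&B curve when resuming.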

	## Files

	```
	checkpoints/29000/params/model.safetensors # model weights
	checkpoints/29000/params/config.json # model config
	assets/ramen_stats.json # action normalization stats
	TRAINING_LOG.md # sanitized training log
	```

	## Verify integrity

	```bash
	cd checkpoints/29000/params
	find config.json model.safetensors -type f | sort | xargs sha256sum | sha256sum
	# expected: 8843965c8dcf0fc68b71784fe0875b7e43eb25aa4e05b59bf796b2280b50ea96
	```

	Note: `ramen_stats.json` is in `assets/`, not in the params directory, so the hash above covers only the two params files. To compute a combined hash that also covers `ramen_stats.json`, download all three files into one directory and run the same command over all three.
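
Where `sha256sum` is unavailable, the same check can be approximated in Python. This sketch assumes the GNU `sha256sum` text-mode output format (`<hex digest>`, two spaces, file name, newline) and plain file names without special characters; the helper names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Hex SHA-256 of one file, read in chunks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def combined_sha256(directory, names):
    """Hash the sorted per-file `sha256sum` lines, like `... | sha256sum`.

    ASSUMPTION: each line is "<digest>  <name>\n", as GNU sha256sum prints it.
    """
    directory = Path(directory)
    lines = [f"{file_sha256(directory / name)}  {name}\n"
             for name in sorted(names)]
    return hashlib.sha256("".join(lines).encode()).hexdigest()
```

Calling `combined_sha256("checkpoints/29000/params", ["config.json", "model.safetensors"])` should reproduce the hash in the Checkpoint table if the files are intact and the assumed output format matches.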

	## W&B

	- [Training dashboard](https://wandb.ai/pravsels/dit_block_tower_norm_fix/runs/ksuxe451)

	## Source

	- repo: [pravsels/multitask_dit_policy](https://github.com/pravsels/multitask_dit_policy) (branch `stage1-multimodal-abstraction`)