	---
	license: mit
	task_categories:
	- robotics
	tags:
	- robotics
	- diffusion-policy
	- imitation-learning
	- kinder
	- mujoco
	library_name: kinder-diffusion-policy
	---

	# KinDER — Diffusion Policy + Environment States (DPES) Checkpoints

	Trained Diffusion Policy (DP) + Environment States (DPES) checkpoints for the
	[KinDER](https://prpl-group.com/kinder-site/)
	physical-reasoning benchmark (RSS 2026).

	Each checkpoint is an imitation learning policy trained from **~100 human
	demonstrations** per environment.
	The training code lives in
	[kinder-diffusion-policy](https://github.com/Princeton-Robot-Planning-and-Learning/kinder-diffusion-policy)
	(a fork of [diffusion_policy](https://github.com/real-stanford/diffusion_policy)).
	Demonstrations are available at
	[kinder-bench/kinder-datasets](https://huggingface.co/datasets/kinder-bench/kinder-datasets).

	---

	## Checkpoints

	| Path | KinDER environment | Trained epochs | Final train loss |
	|------|--------------------|---------------|-----------------|
	| `motion2d/epoch=1000-train_loss=0.000.ckpt` | `Motion2D-p0` | 1,000 | 0.000 |
	| `stickbutton2d/epoch=2000-train_loss=0.001.ckpt` | `StickButton2D-b1` | 2,000 | 0.001 |
	| `dynobstruction2d/epoch=2000-train_loss=0.000.ckpt` | `DynObstruction2D-o1` | 2,000 | 0.000 |
	| `dynpushpullhook2d/epoch=0900-train_loss=0.001.ckpt` | `DynPushPullHook2D-o5` | 900 | 0.001 |
	| `basemotion3d/epoch=2000-train_loss=0.000.ckpt` | `BaseMotion3D` | 2,000 | 0.000 |
	| `shelf3d/epoch=0300-train_loss=0.000.ckpt` | `Shelf3D` | 300 | 0.000 |
	| `sweep3d/epoch=0300-train_loss=0.001.ckpt` | `SweepIntoDrawer3D` | 300 | 0.001 |
	| `transport3d/epoch=0100-train_loss=0.000.ckpt` | `Transport3D-o2` | 100 | 0.000 |
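The checkpoint paths above all follow the pattern `<task>/epoch=<N>-train_loss=<loss>.ckpt`, so the epoch and loss can be recovered from the file name alone. The following is a minimal sketch of such a parser; `CheckpointInfo` and `parse_checkpoint_path` are hypothetical helper names introduced here for illustration and are not part of the released code.

```python
import re
from typing import NamedTuple


class CheckpointInfo(NamedTuple):
    """Fields recovered from a checkpoint path (hypothetical helper type)."""
    task_dir: str
    epoch: int
    train_loss: float


# Matches e.g. "sweep3d/epoch=0300-train_loss=0.001.ckpt"
_CKPT_RE = re.compile(
    r"^(?P<task>[^/]+)/epoch=(?P<epoch>\d+)-train_loss=(?P<loss>\d+\.\d+)\.ckpt$"
)


def parse_checkpoint_path(path: str) -> CheckpointInfo:
    """Split a relative checkpoint path into task directory, epoch, and loss."""
    match = _CKPT_RE.match(path)
    if match is None:
        raise ValueError(f"unrecognized checkpoint path: {path!r}")
    return CheckpointInfo(
        task_dir=match["task"],
        epoch=int(match["epoch"]),  # int() drops the zero padding
        train_loss=float(match["loss"]),
    )


info = parse_checkpoint_path("dynpushpullhook2d/epoch=0900-train_loss=0.001.ckpt")
print(info)
```

Note that the loss in the file name is rounded to three decimals, so `0.000` only indicates a final train loss below 0.0005.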

	---

	## Method

	DP + Environment States (DPES) extends standard Diffusion Policy by
	incorporating additional low-level environment state vectors as input alongside
	RGB images. The environment states are encoded with MLPs before being fused
	with the image features and passed to the diffusion model.

	### Comparison with Diffusion Policy (DP)

	| Component | DP | DPES |
	|-----------|----|------|
	| RGB image input | ✓ | ✓ |
	| Environment state input | ✗ | ✓ (MLP-encoded) |
	| Robot state input | ✗ | ✓ (MLP-encoded) |
	| Action output | diffusion | diffusion |

	### Inputs

	2D environments: a single overhead RGB image (224 × 224) plus two flat state vectors:

	| State vector | Content |
	|--------------|---------|
	| `robot_state` | Robot proprioception (joint positions, velocities) |
	| `env_state` | Object / scene state (positions, orientations of all relevant entities) |

	3D TidyBot environments: three RGB cameras (base, wrist, and overview, each
	224 × 224) plus the same two flat state vectors.

	The environment and robot state vectors are each passed through a small MLP
	encoder before concatenation with the visual features, giving the policy direct
	access to precise geometric information that may be hard to extract from pixels
	alone.
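The fusion step described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the repository's implementation: all dimensions (`ROBOT_DIM`, `ENV_DIM`, `IMG_FEAT_DIM`, `STATE_FEAT_DIM`), the two-layer MLP shape, and the helper names are hypothetical, and the image features are stand-ins for the output of a visual encoder.

```python
import numpy as np

rng = np.random.default_rng(0)


def init_mlp(sizes, rng):
    """Create (weight, bias) pairs for a small fully connected network."""
    return [
        (rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def mlp(params, x):
    """Apply the MLP with ReLU between layers and a linear output layer."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x


# Hypothetical sizes, chosen only for illustration.
ROBOT_DIM, ENV_DIM = 9, 24
IMG_FEAT_DIM, STATE_FEAT_DIM = 512, 64

robot_encoder = init_mlp([ROBOT_DIM, 128, STATE_FEAT_DIM], rng)
env_encoder = init_mlp([ENV_DIM, 128, STATE_FEAT_DIM], rng)


def fuse(image_features, robot_state, env_state):
    """Encode each state vector separately, then concatenate with image features."""
    return np.concatenate(
        [image_features, mlp(robot_encoder, robot_state), mlp(env_encoder, env_state)],
        axis=-1,
    )


batch = 4
cond = fuse(
    rng.standard_normal((batch, IMG_FEAT_DIM)),  # stand-in for visual features
    rng.standard_normal((batch, ROBOT_DIM)),
    rng.standard_normal((batch, ENV_DIM)),
)
print(cond.shape)  # (4, 640): 512 image + 64 robot + 64 env features
```

The concatenated vector plays the role of the conditioning input to the diffusion model; keeping a separate encoder per state vector lets each one have its own input size.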

	### Output

	An action chunk (a short sequence of future actions) predicted by iterative
	DDPM denoising, as in DP.
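To make the denoising loop concrete, here is a minimal NumPy sketch of DDPM ancestral sampling over an action chunk. It is an assumption-laden illustration: the linear beta schedule, the number of steps, the chunk shape, and the names `make_schedule`, `sample_action_chunk`, and `dummy_predict_noise` are all hypothetical, and the placeholder noise predictor stands in for the trained network.

```python
import numpy as np


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumption; other schedules are common)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def sample_action_chunk(predict_noise, cond, horizon, action_dim,
                        num_steps=100, seed=0):
    """Start from Gaussian noise and denoise step by step into an action chunk."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = rng.standard_normal((horizon, action_dim))
    for t in reversed(range(num_steps)):
        eps = predict_noise(x, t, cond)  # network's estimate of the added noise
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # No noise is added at the final step (t == 0).
        noise = rng.standard_normal(x.shape) if t > 0 else np.zeros_like(x)
        x = mean + np.sqrt(betas[t]) * noise
    return x


def dummy_predict_noise(x, t, cond):
    """Placeholder for the trained noise-prediction network."""
    return np.zeros_like(x)


chunk = sample_action_chunk(dummy_predict_noise, cond=None, horizon=16, action_dim=7)
print(chunk.shape)  # (16, 7)
```

In a real policy, `cond` would carry the fused observation features and `predict_noise` would be the learned model; the loop structure is the part this sketch is meant to show.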

	---

	## Usage

	### 1. Install dependencies

	```bash
	# Clone and set up kinder-diffusion-policy
	git clone https://github.com/Princeton-Robot-Planning-and-Learning/kinder-diffusion-policy.git
	cd kinder-diffusion-policy
	# Follow the environment setup instructions in the repo README
	mamba activate robodiff
	```

	```bash
	# Install the kinder-imitation-learning inference utilities
	cd kinder-baselines/kinder-imitation-learning
	uv pip install -r prpl_requirements.txt
	uv pip install -e ".[develop]"
	```

	### 2. Launch the policy server

	```bash
	cd ~/kinder-diffusion-policy
	mamba activate robodiff
	python policy_server.py --ckpt-path /path/to/sweep3d/epoch=0300-train_loss=0.001.ckpt
	```

	### 3. Run evaluation

	```bash
	cd kinder-baselines/kinder-models/scripts
	python inference.py \
	    --env-name kinder/SweepIntoDrawer3D-o5-v0 \
	    --save-videos \
	    --num-seeds 1 \
	    --num-episodes 5 \
	    --max-steps 200
	```

	To evaluate a different task, change the `--ckpt-path` value in step 2 and the
	`--env-name` value in step 3 so that both refer to the same environment.
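When switching between tasks often, it can help to build both command lines from one place. The sketch below only assembles the commands shown in steps 2 and 3, using the same flags; the function names are hypothetical, and the commands still need to be run from the directories given above.

```python
import shlex


def policy_server_cmd(ckpt_path):
    """Command for step 2 (run from the kinder-diffusion-policy checkout)."""
    return ["python", "policy_server.py", "--ckpt-path", ckpt_path]


def inference_cmd(env_name, num_seeds=1, num_episodes=5, max_steps=200,
                  save_videos=True):
    """Command for step 3 (run from kinder-baselines/kinder-models/scripts)."""
    cmd = ["python", "inference.py", "--env-name", env_name]
    if save_videos:
        cmd.append("--save-videos")
    cmd += [
        "--num-seeds", str(num_seeds),
        "--num-episodes", str(num_episodes),
        "--max-steps", str(max_steps),
    ]
    return cmd


server = policy_server_cmd("/path/to/sweep3d/epoch=0300-train_loss=0.001.ckpt")
evaluate = inference_cmd("kinder/SweepIntoDrawer3D-o5-v0")

# shlex.join quotes arguments where needed, so the output can be pasted into a shell.
print(shlex.join(server))
print(shlex.join(evaluate))
```

The returned lists can also be passed directly to `subprocess.run`, which avoids shell quoting issues for checkpoint paths that contain spaces.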

	---

	## Training from scratch

	```bash
	# Convert raw teleoperation recordings to HDF5
	cd kinder-baselines/kinder-models/scripts
	python demos_to_hdf5.py \
	    --teleop_data_dir "$YOUR_DATA_DIR" \
	    --output_path "$OUTPUT_HDF5_PATH" \
	    --render_images

	# Train with the DPES config (includes state inputs)
	cd ~/kinder-diffusion-policy
	mamba activate robodiff
	python train.py --config-name=train_sweep3d_image_state
	```

	---

	## Related resources

	| Resource | Link |
	|----------|------|
	| KinDER benchmark | [kindergarden](https://github.com/Princeton-Robot-Planning-and-Learning/kindergarden) |
	| Training code | [kinder-diffusion-policy](https://github.com/Princeton-Robot-Planning-and-Learning/kinder-diffusion-policy) |
	| Demonstration datasets | [kinder-bench/kinder-datasets](https://huggingface.co/datasets/kinder-bench/kinder-datasets) |
	| DP (image-only) checkpoints | kinder-bench/kinder-DP-checkpoints |
	| Finetuned π0.5 VLA checkpoints | [kinder-openpi](https://github.com/Princeton-Robot-Planning-and-Learning/kinder-openpi) |

	---

	## Citation

	If you use these checkpoints, please cite the paper [KinDER: A Physical Reasoning Benchmark for Robot Learning and Planning](https://huggingface.co/papers/2604.25788):

	```bibtex
	@inproceedings{huang2026kinder,
	  title = {KinDER: A Physical Reasoning Benchmark for Robot Learning and Planning},
	  author = {Huang, Yixuan and Li, Bowen and Saxena, Vaibhav and Liang, Yichao and Mishra, Utkarsh and Ji, Liang and Zha, Lihan and Wu, Jimmy and Kumar, Nishanth and Scherer, Sebastian and Xu, Danfei and Silver, Tom},
	  booktitle = {Robotics: Science and Systems (RSS)},
	  year = {2026}
	}
	```