	---
	pretty_name: "EXOKERN Skill v0.1.1 - Robust Peg Insertion Under Domain Randomization"
	license: cc-by-nc-4.0
	pipeline_tag: robotics
	library_name: pytorch
	tags:
	- robotics
	- diffusion-policy
	- force-torque
	- contact-rich
	- manipulation
	- insertion
	- domain-randomization
	- sim-to-real
	- isaac-lab
	- franka
	- physical-ai
	- lerobot
	datasets:
	- EXOKERN/contactbench-forge-peginsert-v0.1.1
	metrics:
	- success_rate
	- avg_contact_force_n
	- peak_contact_force_n
	model-index:
	- name: EXOKERN Skill v0.1.1 - Peg Insertion (full_ft)
	  results:
	  - task:
	      type: robotics
	      name: Peg insertion
	    dataset:
	      name: EXOKERN ContactBench v0.1.1
	      type: EXOKERN/contactbench-forge-peginsert-v0.1.1
	    metrics:
	    - type: success_rate
	      value: 100.0
	      name: Success Rate (%)
	    - type: avg_contact_force_n
	      value: 3.67
	      name: Average Contact Force (N)
	    - type: peak_contact_force_n
	      value: 10.64
	      name: Peak Contact Force (N)
	---

	# EXOKERN Skill v0.1.1 - Robust Peg Insertion Under Domain Randomization

	`skill-forge-peginsert-v0.1.1` is the domain-randomized reference model release in the EXOKERN catalog. It is trained on [EXOKERN ContactBench v0.1.1](https://huggingface.co/datasets/EXOKERN/contactbench-forge-peginsert-v0.1.1) and ships the same paired comparison structure as v0:

	- `full_ft_best_model.pt`: primary checkpoint with 22D observations, including force/torque input
	- `no_ft_best_model.pt`: ablation checkpoint with the same architecture and 16D state-only observations

	This release should be read as a robustness benchmark first. Both policies reach a 100% success rate on all three evaluation seeds under domain randomization, and the repo reports the mixed force-reduction result explicitly: on this task, `full_ft` does not lower average contact force relative to `no_ft`.

	## Quick Facts

	| Item | Value |
	| --- | --- |
	| Task | Peg insertion in simulation under domain randomization |
	| Dataset | [EXOKERN/contactbench-forge-peginsert-v0.1.1](https://huggingface.co/datasets/EXOKERN/contactbench-forge-peginsert-v0.1.1) |
	| Simulator | NVIDIA Isaac Lab (Isaac Sim 4.5) |
	| Robot | Franka FR3 |
	| Architecture | TemporalUNet1D diffusion policy |
	| Parameters | 71.3M |
	| Observation horizon | 10 frames |
	| Prediction / execution horizon | 16 / 8 actions |
	| Seeds evaluated | 42, 123, 7 |
	| Total rollouts reported | 600 |

	## Benchmark Summary

	The Hub metadata for this repo tracks the primary `full_ft` checkpoint. The full repo includes the paired `no_ft` ablation for comparison.

	| Checkpoint | Success Rate (%) | Avg Contact Force (N) | Peak Contact Force (N) | Avg Episode Time (s) |
	| --- | ---: | ---: | ---: | ---: |
	| `full_ft` | 100.0 | 3.67 +/- 0.45 | 10.63 | 25.63 |
	| `no_ft` | 100.0 | 3.37 +/- 0.06 | 10.33 | 25.73 |

	![EXOKERN skill v0.1.1 benchmark summary](https://huggingface.co/EXOKERN/skill-forge-peginsert-v0.1.1/resolve/main/figures/benchmark_summary.png)

	Figure: multi-seed benchmark summary built from the published `eval_seed42.json`, `eval_seed123.json`, and `eval_seed7.json` artifacts.

	Per-seed results:

	| Seed | Condition | Success Rate (%) | Avg Force (N) | Peak Force (N) | Avg Time (s) |
	| --- | --- | ---: | ---: | ---: | ---: |
	| 42 | `full_ft` | 100.0 | 3.24 | 10.44 | 25.61 |
	| 42 | `no_ft` | 100.0 | 3.38 | 10.38 | 25.73 |
	| 123 | `full_ft` | 100.0 | 4.12 | 10.57 | 25.74 |
	| 123 | `no_ft` | 100.0 | 3.34 | 10.32 | 25.79 |
	| 7 | `full_ft` | 100.0 | 3.69 | 10.93 | 25.54 |
	| 7 | `no_ft` | 100.0 | 3.37 | 10.31 | 25.68 |

	Interpretation:

	- This release demonstrates robust task completion under a much harder collection regime than v0.
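	- On this peg-in-hole setup, the `full_ft` force advantage reported for v0 does not carry over under domain randomization: average contact force is 3.67 N for `full_ft` versus 3.37 N for `no_ft`, and `full_ft` is lower only on seed 42.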
	- That does not prove force/torque is unnecessary in general. It shows that this release is best used as a robust benchmark and an honest reference point for harder future tasks.
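
The mixed result is easiest to see seed by seed. The following minimal sketch hard-codes the average contact forces from the per-seed table above and prints the `full_ft` minus `no_ft` difference for each seed. It does not parse the `eval_seed*.json` artifacts, whose schema is not described in this card, and the variable names are illustrative.

```python
# Average contact force (N) per seed, copied from the per-seed table above.
avg_force_n = {
    42: {"full_ft": 3.24, "no_ft": 3.38},
    123: {"full_ft": 4.12, "no_ft": 3.34},
    7: {"full_ft": 3.69, "no_ft": 3.37},
}

# Positive delta: full_ft pressed harder on average than no_ft for that seed.
force_delta_n = {
    seed: round(forces["full_ft"] - forces["no_ft"], 2)
    for seed, forces in avg_force_n.items()
}

seeds_where_ft_is_lower = [
    seed for seed, delta in force_delta_n.items() if delta < 0
]

for seed, delta in force_delta_n.items():
    print(f"seed {seed}: full_ft - no_ft = {delta:+.2f} N")
print("seeds where full_ft has lower average force:", seeds_where_ft_is_lower)
```

Only seed 42 shows lower average force with force/torque input, which is why the force comparison is treated as inconclusive on this task.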

	## What Changed Compared To v0

	| Topic | v0 | v0.1.1 |
	| --- | --- | --- |
	| Dataset regime | Mostly fixed conditions | Multi-layer domain randomization |
	| Dataset size | 2,221 episodes / 330,929 frames | 5,000 episodes / 745,000 frames |
	| Robot | Franka Emika Panda | Franka FR3 |
	| Force reduction takeaway | Clear F/T advantage | Inconclusive on this task |
	| Best use | Clean baseline | Robustness benchmark |

	## Architecture

	This release uses the same 1D Temporal U-Net diffusion policy family as v0.

	![Architecture](https://huggingface.co/EXOKERN/skill-forge-peginsert-v0.1.1/resolve/main/architecture.png)

	| Component | Value |
	| --- | --- |
	| Action dimension | 7 |
	| Observation dimensions | 22 (`full_ft`) / 16 (`no_ft`) |
	| Diffusion training steps | 100 |
	| DDIM inference steps | 16 |
	| Base channels | 256 |
	| Channel multipliers | (1, 2, 4) |
	| Normalization | Min-max to `[-1, 1]` |
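
The min-max normalization listed in the table can be sketched as follows. This is a generic illustration, not the code in `inference.py`: the function names, the `eps` guard for constant dimensions, and the toy data are all assumptions, and the card does not say where the per-dimension statistics are stored.

```python
def fit_min_max(rows):
    """Per-dimension (min, max) over a list of equal-length vectors."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def normalize(vector, low, high, eps=1e-8):
    """Map each dimension from [low, high] to [-1, 1]."""
    return [
        2.0 * (x - lo) / max(hi - lo, eps) - 1.0
        for x, lo, hi in zip(vector, low, high)
    ]


def denormalize(vector, low, high, eps=1e-8):
    """Inverse of normalize: map [-1, 1] back to [low, high]."""
    return [
        (x + 1.0) / 2.0 * max(hi - lo, eps) + lo
        for x, lo, hi in zip(vector, low, high)
    ]


# Toy 3D data standing in for real observation or action statistics.
data = [[0.0, -2.0, 10.0], [1.0, 2.0, 30.0], [0.5, 0.0, 20.0]]
low, high = fit_min_max(data)
scaled = normalize([0.5, 2.0, 10.0], low, high)
restored = denormalize(scaled, low, high)
print(scaled)
print(restored)
```

In a diffusion policy, observations are typically normalized before inference and predicted actions are denormalized afterwards, so both directions are shown.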

	## Repository Contents

	| File | Description |
	| --- | --- |
	| `full_ft_best_model.pt` | Best checkpoint with force/torque input |
	| `no_ft_best_model.pt` | Ablation checkpoint without force/torque input |
	| `inference.py` | Self-contained inference helper and model definition |
	| `config.yaml` | Training, dataset, and environment configuration |
	| `eval_seed42.json` | Seed 42 evaluation artifact |
	| `eval_seed123.json` | Seed 123 evaluation artifact |
	| `eval_seed7.json` | Seed 7 evaluation artifact |
	| `training_curve_full_ft_seed42.png` | Training curve for `full_ft`, seed 42 |
	| `training_curve_full_ft_seed123.png` | Training curve for `full_ft`, seed 123 |
	| `training_curve_full_ft_seed7.png` | Training curve for `full_ft`, seed 7 |
	| `training_curve_no_ft_seed42.png` | Training curve for `no_ft`, seed 42 |
	| `training_curve_no_ft_seed123.png` | Training curve for `no_ft`, seed 123 |
	| `training_curve_no_ft_seed7.png` | Training curve for `no_ft`, seed 7 |

	## Usage

	### Reproduce evaluation with `exokern-eval`

	```bash
	pip install exokern-eval

	wget https://huggingface.co/EXOKERN/skill-forge-peginsert-v0.1.1/resolve/main/full_ft_best_model.pt

	exokern-eval \
	  --policy full_ft_best_model.pt \
	  --env Isaac-Forge-PegInsert-Direct-v0 \
	  --episodes 100
	```

	### Load the repo helper locally

	```python
	import os
	import sys

	from huggingface_hub import snapshot_download

	# Download only the checkpoints and the inference helper.
	repo_dir = snapshot_download(
	    repo_id="EXOKERN/skill-forge-peginsert-v0.1.1",
	    allow_patterns=["*.pt", "inference.py"],
	)
	# Make the downloaded inference.py importable.
	sys.path.insert(0, repo_dir)

	from inference import DiffusionPolicyInference

	policy = DiffusionPolicyInference(
	    os.path.join(repo_dir, "full_ft_best_model.pt"),
	    device="cpu",
	)

	# full_ft expects 22D observations; zeros are only a placeholder input.
	policy.add_observation([0.0] * 22)
	actions = policy.get_actions()
	print(len(actions))
	```
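
The horizons in Quick Facts (10 observation frames, 16 predicted actions, 8 executed actions) describe a receding-horizon control loop. The sketch below shows that loop shape with a stub policy. `StubPolicy`, `run_episode`, and the two callbacks are hypothetical names introduced for illustration. The stub only mirrors the `add_observation` / `get_actions` method names used above, and whether the real helper returns 16 or 8 actions per call is not stated in this card.

```python
from collections import deque

OBS_HORIZON = 10   # observation frames kept as context (Quick Facts)
PRED_HORIZON = 16  # actions predicted per inference call (Quick Facts)
EXEC_HORIZON = 8   # actions executed before re-planning (Quick Facts)
ACTION_DIM = 7


class StubPolicy:
    """Hypothetical stand-in for a diffusion policy; returns zero actions."""

    def __init__(self):
        self.history = deque(maxlen=OBS_HORIZON)

    def add_observation(self, obs):
        self.history.append(list(obs))

    def get_actions(self):
        return [[0.0] * ACTION_DIM for _ in range(PRED_HORIZON)]


def run_episode(policy, read_observation, apply_action, max_steps):
    """Receding-horizon loop: predict a plan, execute its first part, re-plan."""
    steps = 0
    inference_calls = 0
    policy.add_observation(read_observation())
    while steps < max_steps:
        plan = policy.get_actions()
        inference_calls += 1
        for action in plan[:EXEC_HORIZON]:
            if steps >= max_steps:
                break
            apply_action(action)
            policy.add_observation(read_observation())
            steps += 1
    return steps, inference_calls


executed = []
steps, calls = run_episode(
    StubPolicy(),
    read_observation=lambda: [0.0] * 22,  # placeholder 22D observation
    apply_action=executed.append,
    max_steps=20,
)
print(steps, calls)
```

Executing only the first 8 of 16 predicted actions lets the policy re-plan from fresh observations while keeping the number of diffusion inference calls low.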

	## Training And Evaluation Setup

	| Item | Value |
	| --- | --- |
	| Train / val split | 85% / 15% by episode |
	| Epochs | 300 |
	| Batch size | 256 |
	| Optimizer | AdamW, `lr=1e-4`, `weight_decay=1e-4` |
	| LR schedule | Cosine annealing to `1e-6` |
	| EMA decay | 0.995 |
	| Physics rate | 120 Hz |
	| Control rate | 15 Hz |
	| Domain randomization | Enabled in the training dataset |
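
As a rough illustration of the schedule in the table, the sketch below evaluates the standard closed-form cosine annealing curve from `1e-4` down to `1e-6`. It assumes the annealing period spans all 300 epochs and is stepped once per epoch; the card does not state either detail, and `config.yaml` is the authoritative source. It also derives the number of physics steps per control step from the two rates.

```python
import math

BASE_LR = 1e-4  # initial learning rate (table above)
MIN_LR = 1e-6   # final learning rate (table above)
EPOCHS = 300    # assumed to be the full annealing period


def cosine_lr(epoch, base_lr=BASE_LR, min_lr=MIN_LR, total=EPOCHS):
    """Closed-form cosine annealing from base_lr (epoch 0) to min_lr (epoch total)."""
    progress = min(max(epoch / total, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Physics steps simulated per control step: 120 Hz / 15 Hz.
PHYSICS_STEPS_PER_ACTION = 120 // 15

for epoch in (0, 150, 300):
    print(epoch, f"{cosine_lr(epoch):.3e}")
print("physics steps per action:", PHYSICS_STEPS_PER_ACTION)
```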

	## Related Work

	- FORGE: [Force-Guided Exploration for Robust Contact-Rich Manipulation under Uncertainty](https://arxiv.org/abs/2408.04587)
	- Diffusion Policy: [Visuomotor Policy Learning via Action Diffusion](https://arxiv.org/abs/2303.04137)
	- Factory: [Fast Contact for Robotic Assembly](https://arxiv.org/abs/2205.03532)

	## Citation

	```bibtex
	@misc{exokern_skill_peginsert_v011_2026,
	  title = {EXOKERN Skill v0.1.1: Robust Peg Insertion Under Domain Randomization},
	  author = {{EXOKERN}},
	  year = {2026},
	  howpublished = {\url{https://huggingface.co/EXOKERN/skill-forge-peginsert-v0.1.1}},
	  note = {Paired full_ft and no_ft diffusion-policy checkpoints}
	}
	```

	## Security Note

	The checkpoints in this repo are PyTorch pickles. Load them only in a trusted or isolated environment after reviewing the repository contents.
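
One way to review a checkpoint before loading it is to list the classes and functions its pickle stream references, without executing it. The sketch below uses only the standard library and assumes the checkpoint is in PyTorch's zip-based format with a `data.pkl` member pickled at protocol 2. Neither assumption is confirmed by this card, and `list_pickle_globals` is a hypothetical helper, not part of this repo. It is demonstrated on a toy archive rather than on the published files.

```python
import collections
import os
import pickle
import pickletools
import tempfile
import zipfile


def list_pickle_globals(checkpoint_path):
    """List 'module.name' references in a zip-format checkpoint without unpickling it."""
    found = set()
    with zipfile.ZipFile(checkpoint_path) as archive:
        for member in archive.namelist():
            if not member.endswith(".pkl"):
                continue
            for opcode, arg, _pos in pickletools.genops(archive.read(member)):
                if opcode.name == "GLOBAL":
                    # arg is "module name", separated by a single space.
                    module, name = arg.split(" ", 1)
                    found.add(f"{module}.{name}")
                elif opcode.name == "STACK_GLOBAL":
                    # Protocol 4+ keeps the names on the stack; flag for manual review.
                    found.add("<STACK_GLOBAL: inspect manually>")
    return sorted(found)


# Toy archive standing in for a real checkpoint (hypothetical file name).
toy_path = os.path.join(tempfile.mkdtemp(), "toy_checkpoint.pt")
with zipfile.ZipFile(toy_path, "w") as archive:
    payload = pickle.dumps(collections.OrderedDict(weight=1.0), protocol=2)
    archive.writestr("archive/data.pkl", payload)

references = list_pickle_globals(toy_path)
print(references)
```

Anything in the list outside the expected tensor and container types deserves a closer look before loading. As a complementary measure, `torch.load` accepts `weights_only=True`, which restricts unpickling to an allowlist of types; whether these checkpoints load in that mode is not stated in this card.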

	## Limitations

	- Simulation only. This release does not claim real-robot readiness.
	- Reported robustness is specific to the peg-in-hole task and the randomization ranges documented in the paired dataset card.
	- The ablation result is mixed: use this repo to study robustness, not to overclaim a universal force/torque effect.
	- The repo exposes paired checkpoints for research comparison; the intended production-style reference in this repo is `full_ft_best_model.pt`.

	## Related Resources

	- Dataset: [EXOKERN/contactbench-forge-peginsert-v0.1.1](https://huggingface.co/datasets/EXOKERN/contactbench-forge-peginsert-v0.1.1)
	- Baseline predecessor: [EXOKERN/skill-forge-peginsert-v0](https://huggingface.co/EXOKERN/skill-forge-peginsert-v0)
	- Evaluation CLI: [github.com/Exokern/exokern_eval](https://github.com/Exokern/exokern_eval)
	- Organization page: [huggingface.co/EXOKERN](https://huggingface.co/EXOKERN)