---
pretty_name: "EXOKERN Skill v0.1.1 - Robust Peg Insertion Under Domain Randomization"
license: cc-by-nc-4.0
pipeline_tag: robotics
library_name: pytorch
tags:
- robotics
- diffusion-policy
- force-torque
- contact-rich
- manipulation
- insertion
- domain-randomization
- sim-to-real
- isaac-lab
- franka
- physical-ai
- lerobot
datasets:
- EXOKERN/contactbench-forge-peginsert-v0.1.1
metrics:
- success_rate
- avg_contact_force_n
- peak_contact_force_n
model-index:
- name: EXOKERN Skill v0.1.1 - Peg Insertion (full_ft)
  results:
  - task:
      type: robotics
      name: Peg insertion
    dataset:
      name: EXOKERN ContactBench v0.1.1
      type: EXOKERN/contactbench-forge-peginsert-v0.1.1
    metrics:
    - type: success_rate
      value: 100.0
      name: Success Rate (%)
    - type: avg_contact_force_n
      value: 3.67
      name: Average Contact Force (N)
    - type: peak_contact_force_n
      value: 10.64
      name: Peak Contact Force (N)
---

# EXOKERN Skill v0.1.1 - Robust Peg Insertion Under Domain Randomization

`skill-forge-peginsert-v0.1.1` is the domain-randomized reference model release in the EXOKERN catalog. It is trained on [EXOKERN ContactBench v0.1.1](https://huggingface.co/datasets/EXOKERN/contactbench-forge-peginsert-v0.1.1) and ships the same paired checkpoints as v0:

- `full_ft_best_model.pt`: primary checkpoint with 22D observations, including force/torque input
- `no_ft_best_model.pt`: ablation checkpoint with the same architecture and 16D state-only observations

Read this release as a robustness benchmark first. Both policies reach 100% success under severe domain randomization, and the release reports the mixed force-reduction result explicitly: `full_ft` does not lower average or peak contact force relative to `no_ft`.

## Quick Facts

| Item | Value |
| --- | --- |
| Task | Peg insertion in simulation under domain randomization |
| Dataset | [EXOKERN/contactbench-forge-peginsert-v0.1.1](https://huggingface.co/datasets/EXOKERN/contactbench-forge-peginsert-v0.1.1) |
| Simulator | NVIDIA Isaac Lab (Isaac Sim 4.5) |
| Robot | Franka FR3 |
| Architecture | TemporalUNet1D diffusion policy |
| Parameters | 71.3M |
| Observation horizon | 10 frames |
| Prediction / execution horizon | 16 / 8 actions |
| Seeds evaluated | 42, 123, 7 |
| Total rollouts reported | 600 |

## Benchmark Summary

The Hub metadata for this repo tracks the primary `full_ft` checkpoint. The full repo includes the paired `no_ft` ablation for comparison.

| Checkpoint | Success Rate (%) | Avg Contact Force (N) | Peak Contact Force (N) | Avg Episode Time (s) |
| --- | ---: | ---: | ---: | ---: |
| `full_ft` | 100.0 | 3.67 +/- 0.45 | 10.63 | 25.63 |
| `no_ft` | 100.0 | 3.37 +/- 0.06 | 10.33 | 25.73 |

![Benchmark summary](benchmark_summary.png)

*Figure: multi-seed benchmark summary built from the published `eval_seed42.json`, `eval_seed123.json`, and `eval_seed7.json` artifacts.*

Per-seed results:

| Seed | Condition | Success Rate (%) | Avg Force (N) | Peak Force (N) | Avg Time (s) |
| --- | --- | ---: | ---: | ---: | ---: |
| 42 | `full_ft` | 100.0 | 3.24 | 10.44 | 25.61 |
| 42 | `no_ft` | 100.0 | 3.38 | 10.38 | 25.73 |
| 123 | `full_ft` | 100.0 | 4.12 | 10.57 | 25.74 |
| 123 | `no_ft` | 100.0 | 3.34 | 10.32 | 25.79 |
| 7 | `full_ft` | 100.0 | 3.69 | 10.93 | 25.54 |
| 7 | `no_ft` | 100.0 | 3.37 | 10.31 | 25.68 |

Interpretation:

- This release demonstrates robust task completion under a much harder collection regime than v0.
- On this particular peg-in-hole setup, domain randomization largely closed the force gap between `full_ft` and `no_ft`: seed 42 is the only seed where `full_ft` shows a lower average contact force.
- That does not prove force/torque is unnecessary in general. It shows that this release is best used as a robust benchmark and an honest reference point for harder future tasks.

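
### Recompute the summary from per-seed values

The per-seed table above can be aggregated by hand to sanity-check the summary row. The following is a minimal sketch, not the project's own evaluation code: the dictionary layout and the `summarize` helper are hypothetical, and the choice of the sample standard deviation is an assumption, since the card does not state which spread statistic the `+/-` column uses. Because the inputs are already rounded to two decimals, the results need not match the published summary to the last digit.

```python
from statistics import mean, stdev

# Per-seed values copied from the table above:
# seed -> (avg force in N, peak force in N, avg episode time in s)
PER_SEED = {
    "full_ft": {42: (3.24, 10.44, 25.61), 123: (4.12, 10.57, 25.74), 7: (3.69, 10.93, 25.54)},
    "no_ft": {42: (3.38, 10.38, 25.73), 123: (3.34, 10.32, 25.79), 7: (3.37, 10.31, 25.68)},
}


def summarize(condition):
    """Aggregate one condition across seeds (hypothetical helper)."""
    rows = list(PER_SEED[condition].values())
    avg_forces = [row[0] for row in rows]
    return {
        "avg_force_mean": mean(avg_forces),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "avg_force_std": stdev(avg_forces),
        "peak_force_mean": mean(row[1] for row in rows),
        "avg_time_mean": mean(row[2] for row in rows),
    }


for condition in PER_SEED:
    stats = summarize(condition)
    print(condition, {key: round(value, 2) for key, value in stats.items()})
```

For exact figures, the published `eval_seed*.json` artifacts are the better source, since they hold the unrounded per-episode data.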
## What Changed Compared To v0

| Topic | v0 | v0.1.1 |
| --- | --- | --- |
| Dataset regime | Mostly fixed conditions | Multi-layer domain randomization |
| Dataset size | 2,221 episodes / 330,929 frames | 5,000 episodes / 745,000 frames |
| Robot | Franka Emika Panda | Franka FR3 |
| Force reduction takeaway | Clear F/T advantage | Inconclusive on this task |
| Best use | Clean baseline | Robustness benchmark |

## Architecture

This release uses the same 1D Temporal U-Net diffusion policy family as v0.

![Architecture](architecture.png)

| Component | Value |
| --- | --- |
| Action dimension | 7 |
| Observation dimensions | 22 (`full_ft`) / 16 (`no_ft`) |
| Diffusion training steps | 100 |
| DDIM inference steps | 16 |
| Base channels | 256 |
| Channel multipliers | (1, 2, 4) |
| Normalization | Min-max to `[-1, 1]` |

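
### Min-max normalization sketch

The table lists min-max scaling to `[-1, 1]` as the normalization scheme. As an illustration only, per-dimension min-max scaling can be written as below. The function names, the `eps` guard for constant dimensions, and the toy data are assumptions for this sketch; the statistics actually used by the checkpoints live in the repo's own code and configuration.

```python
def fit_min_max(rows):
    """Return per-dimension (mins, maxs) over a list of equal-length vectors."""
    columns = list(zip(*rows))
    return [min(col) for col in columns], [max(col) for col in columns]


def normalize(x, lo, hi, eps=1e-8):
    """Map each dimension linearly from [lo, hi] to [-1, 1]."""
    # eps avoids division by zero when a dimension is constant (assumption).
    return [2.0 * (v - l) / max(h - l, eps) - 1.0 for v, l, h in zip(x, lo, hi)]


def denormalize(y, lo, hi):
    """Inverse map from [-1, 1] back to the original range."""
    return [(v + 1.0) / 2.0 * (h - l) + l for v, l, h in zip(y, lo, hi)]


# Toy 2D data, purely illustrative (real observations are 22D or 16D).
data = [[0.0, -2.0], [10.0, 2.0], [5.0, 0.0]]
lo, hi = fit_min_max(data)
scaled = normalize([5.0, 2.0], lo, hi)
print(scaled, denormalize(scaled, lo, hi))
```

In a diffusion policy, observations are typically normalized on the way in and predicted actions denormalized on the way out, using statistics fitted on the training split.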
## Repository Contents

| File | Description |
| --- | --- |
| `full_ft_best_model.pt` | Best checkpoint with force/torque input |
| `no_ft_best_model.pt` | Ablation checkpoint without force/torque input |
| `inference.py` | Self-contained inference helper and model definition |
| `config.yaml` | Training, dataset, and environment configuration |
| `eval_seed42.json` | Seed 42 evaluation artifact |
| `eval_seed123.json` | Seed 123 evaluation artifact |
| `eval_seed7.json` | Seed 7 evaluation artifact |
| `training_curve_full_ft_seed42.png` | Training curve for `full_ft`, seed 42 |
| `training_curve_full_ft_seed123.png` | Training curve for `full_ft`, seed 123 |
| `training_curve_full_ft_seed7.png` | Training curve for `full_ft`, seed 7 |
| `training_curve_no_ft_seed42.png` | Training curve for `no_ft`, seed 42 |
| `training_curve_no_ft_seed123.png` | Training curve for `no_ft`, seed 123 |
| `training_curve_no_ft_seed7.png` | Training curve for `no_ft`, seed 7 |

## Usage

### Reproduce evaluation with `exokern-eval`

```bash
pip install exokern-eval

# Download the primary checkpoint (a PyTorch pickle; see the Security Note)
wget https://huggingface.co/EXOKERN/skill-forge-peginsert-v0.1.1/resolve/main/full_ft_best_model.pt

# Roll out 100 evaluation episodes with the downloaded policy
exokern-eval \
  --policy full_ft_best_model.pt \
  --env Isaac-Forge-PegInsert-Direct-v0 \
  --episodes 100
```

### Load the repo helper locally

```python
import os
import sys

from huggingface_hub import snapshot_download

# Fetch only the checkpoints and the inference helper from the Hub
repo_dir = snapshot_download(
    repo_id="EXOKERN/skill-forge-peginsert-v0.1.1",
    allow_patterns=["*.pt", "inference.py"],
)
# Make the downloaded inference.py importable
sys.path.insert(0, repo_dir)

from inference import DiffusionPolicyInference

policy = DiffusionPolicyInference(
    os.path.join(repo_dir, "full_ft_best_model.pt"),
    device="cpu",
)

# Feed one 22D observation (full_ft layout) and request an action sequence
policy.add_observation([0.0] * 22)
actions = policy.get_actions()
print(len(actions))
```

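
### Sketch a receding-horizon control loop

The Quick Facts list a 10-frame observation horizon and a 16 / 8 prediction / execution horizon. One common way to use such a policy is receding-horizon control: predict a 16-action plan, execute the first 8, then re-plan from fresh observations. The sketch below illustrates that pattern with a hypothetical `StubPolicy` that only mirrors the `add_observation` / `get_actions` calls shown above. Whether the real helper returns the full 16-action plan or only the executable part is not stated in this card, so the slicing is an assumption, as are `run_episode` and `env_step`.

```python
from collections import deque

OBS_HORIZON = 10   # frames of observation history (Quick Facts)
PRED_HORIZON = 16  # actions predicted per inference call
EXEC_HORIZON = 8   # actions executed before re-planning
OBS_DIM, ACTION_DIM = 22, 7


class StubPolicy:
    """Hypothetical stand-in for the repo helper; returns zero actions."""

    def __init__(self):
        self.history = deque(maxlen=OBS_HORIZON)  # keeps the last 10 frames
        self.calls = 0

    def add_observation(self, obs):
        self.history.append(list(obs))

    def get_actions(self):
        self.calls += 1
        return [[0.0] * ACTION_DIM for _ in range(PRED_HORIZON)]


def run_episode(policy, env_step, first_obs, max_steps):
    """Execute EXEC_HORIZON actions per plan, then re-plan."""
    policy.add_observation(first_obs)
    steps = 0
    while steps < max_steps:
        plan = policy.get_actions()
        for action in plan[:EXEC_HORIZON]:
            obs = env_step(action)       # env_step is a hypothetical callback
            policy.add_observation(obs)  # refresh history before the next plan
            steps += 1
            if steps >= max_steps:
                break
    return steps


demo_policy = StubPolicy()
executed = run_episode(demo_policy, lambda action: [0.0] * OBS_DIM, [0.0] * OBS_DIM, 20)
print(executed, demo_policy.calls)
```

Executing only part of each plan trades some open-loop smoothness for faster reaction to new contact information.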
## Training And Evaluation Setup

| Item | Value |
| --- | --- |
| Train / val split | 85% / 15% by episode |
| Epochs | 300 |
| Batch size | 256 |
| Optimizer | AdamW, `lr=1e-4`, `weight_decay=1e-4` |
| LR schedule | Cosine annealing to `1e-6` |
| EMA decay | 0.995 |
| Physics rate | 120 Hz |
| Control rate | 15 Hz |
| Domain randomization | Enabled in the training dataset |

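
### Schedule and EMA sketch

To make the numbers in the table concrete, the sketch below evaluates a standard closed-form cosine annealing schedule, a standard exponential moving average (EMA) update, and the physics-to-control step ratio. It is illustrative only: the card does not say whether the schedule steps per epoch or per batch (per epoch is assumed here), and the function names are hypothetical rather than taken from the training code.

```python
import math

BASE_LR, MIN_LR, EPOCHS = 1e-4, 1e-6, 300
EMA_DECAY = 0.995
PHYSICS_HZ, CONTROL_HZ = 120, 15


def cosine_lr(epoch, base_lr=BASE_LR, min_lr=MIN_LR, total=EPOCHS):
    """Cosine annealing from base_lr (epoch 0) down to min_lr (epoch == total)."""
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * epoch / total))


def ema_update(shadow, weights, decay=EMA_DECAY):
    """One EMA step: shadow <- decay * shadow + (1 - decay) * weights."""
    return [decay * s + (1.0 - decay) * w for s, w in zip(shadow, weights)]


# Number of physics steps simulated per policy action.
physics_steps_per_action = PHYSICS_HZ // CONTROL_HZ

for epoch in (0, 150, 300):
    print(epoch, cosine_lr(epoch))
print(ema_update([1.0, 0.0], [0.0, 1.0]))
print(physics_steps_per_action)
```

With a decay of 0.995, each EMA step moves the averaged weights 0.5% of the way toward the current weights.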
## Related Work

- FORGE: [Force-Guided Exploration for Robust Contact-Rich Manipulation under Uncertainty](https://arxiv.org/abs/2408.04587)
- Diffusion Policy: [Visuomotor Policy Learning via Action Diffusion](https://arxiv.org/abs/2303.04137)
- Factory: [Fast Contact for Robotic Assembly](https://arxiv.org/abs/2205.03532)

## Citation

```bibtex
@misc{exokern_skill_peginsert_v011_2026,
  title = {EXOKERN Skill v0.1.1: Robust Peg Insertion Under Domain Randomization},
  author = {{EXOKERN}},
  year = {2026},
  howpublished = {\url{https://huggingface.co/EXOKERN/skill-forge-peginsert-v0.1.1}},
  note = {Paired full_ft and no_ft diffusion-policy checkpoints}
}
```

## Security Note

The checkpoints in this repo are PyTorch pickles. Load them only in a trusted or isolated environment after reviewing the repository contents.

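
### Verify a downloaded checkpoint

One generic precaution before unpickling any file is to confirm that the bytes on disk are the ones you reviewed. The sketch below computes a SHA-256 digest with the standard library. This card does not publish reference hashes, so `expected_hex` is a hypothetical value you would record yourself after inspecting a known-good download; `sha256_of` and `verify` are illustrative names.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large checkpoints do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True when the file matches a digest recorded earlier (hypothetical value)."""
    return sha256_of(path) == expected_hex.lower()
```

A checksum only detects changes to the file; it does not make a pickle safe to load. Separately, `torch.load` accepts `weights_only=True`, which restricts what unpickling may construct; whether these checkpoints load under that mode is not stated in this card.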
## Limitations

- Simulation only. This release does not claim real-robot readiness.
- Reported robustness is specific to the peg-in-hole task and the randomization ranges documented in the paired dataset card.
- The ablation result is mixed: use this repo to study robustness, not to claim a universal force/torque effect.
- The repo exposes paired checkpoints for research comparison; the intended production-style reference in this repo is `full_ft_best_model.pt`.

## Related Resources

- Dataset: [EXOKERN/contactbench-forge-peginsert-v0.1.1](https://huggingface.co/datasets/EXOKERN/contactbench-forge-peginsert-v0.1.1)
- Baseline predecessor: [EXOKERN/skill-forge-peginsert-v0](https://huggingface.co/EXOKERN/skill-forge-peginsert-v0)
- Evaluation CLI: [github.com/Exokern/exokern_eval](https://github.com/Exokern/exokern_eval)
- Organization page: [huggingface.co/EXOKERN](https://huggingface.co/EXOKERN)