Duplicate from AdrianLlopart/rskill-diffusion-pusht

Files changed (5)

.gitattributes +35 -0
README.md +97 -0
eval/README.md +14 -0
eval/pusht.json +90 -0
rskill.yaml +78 -0

.gitattributes ADDED

	@@ -0,0 +1,35 @@

+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,97 @@

+---
+tags:
+  - OpenRAL
+  - rskill
+  - diffusion-policy
+  - lerobot
+  - pusht
+  - manipulation
+license: apache-2.0
+language:
+  - en
+---
+# rskill-diffusion-pusht
+> **OpenRAL rSkill** — Diffusion Policy (Chi et al., 2023) trained on
+> the PushT 2-D pushing benchmark, packaged for `OpenRAL`.
+This package wraps [`lerobot/diffusion_pusht`](https://huggingface.co/lerobot/diffusion_pusht)
+with a `rskill.yaml` manifest. It does **not** copy model weights.
+## Upstream model
+| Field | Value |
+| --- | --- |
+| Source repo | [`lerobot/diffusion_pusht`](https://huggingface.co/lerobot/diffusion_pusht) |
+| Paper | [arxiv:2303.04137](https://arxiv.org/abs/2303.04137) — *Diffusion Policy: Visuomotor Policy Learning via Action Diffusion* (Chi et al., 2023) |
+| License | Apache-2.0 |
+| Parameters | ~263 M (1-D U-Net) |
+| Action chunk | 8 (within horizon 16) |
+| Denoising | 100 DDPM steps per chunk |
+| Benchmark | PushT (`gym_pusht`, `pymunk` 2-D rigid-body) |
+Per-chunk inference is dominated by the 100-step denoising loop; cached
+pops are essentially free, so this is the extreme test of the
+queue-drain contract in `ChunkedExecutor`.
+## Supported robots
+| Robot | Embodiment tag | Status | Notes |
+| --- | --- | --- | --- |
+| PushT 2-D pseudo-robot (`gym_pusht/PushT-v0`) | `pusht` | ✓ sim | 2-D end-effector pushing a T block on a 512 × 512 px canvas |
+## Sensors required
+| Key | Type | Resolution | Format |
+| --- | --- | --- | --- |
+| `observation.image` | RGB camera | 96 × 96 | `float32` |
+PushT predates the multi-cam `observation.images.cameraN` convention and
+exposes the raw key `observation.image`.
+## Manifest summary
+| Field | Value |
+| --- | --- |
+| `name` | `AdrianLlopart/rskill-diffusion-pusht` |
+| `version` | `0.1.0` |
+| `license` | `apache-2.0` |
+| `role` | `s1` |
+| `embodiment_tags` | `pusht` |
+| `runtime` / `quantization.dtype` | `pytorch` / `fp32` |
+| `weights_uri` | `hf://lerobot/diffusion_pusht` |
+| `latency_budget.per_chunk_ms` | 1 250 ms (with `tolerance_pct=100` this gives a 2.5 s ceiling; warm full-chunk ≈ 1 756 ms on RTX 4070 Laptop, dominated by DDPM) |
+| `latency_budget.warmup_ms` | 10 000 ms |
+| `latency_budget.load_ms` | 30 000 ms |
+| `commercial_use_allowed` | `true` |
+Full schema: `openral_core.RSkillManifest` —
+`python/core/src/openral_core/schemas.py`.
+## Reproduction
+```bash
+git clone https://github.com/AdrianLlopart/openral && cd openral
+just bootstrap && uv sync --all-packages --group sim
+# End-to-end via the canonical SimEnvironment config (CPU is enough):
+just sim-diffusion-pusht
+# which runs:
+#     ral sim run --config examples/sim/diffusion_pusht.yaml --save-video
+# Sim test (gym_pusht + pymunk):
+uv run pytest tests/sim/test_pusht_2d_diffusion_pusht.py -v -m sim
+```
+## License
+This rSkill package (`rskill.yaml`, `README.md`) is **Apache-2.0** to
+match the upstream weights. Commercial use is allowed
+(`commercial_use_allowed: true`).
+## See also
+- [`robots/pusht_2d/README.md`](../../robots/pusht_2d/README.md) — RobotDescription manifest.
+- [`examples/sim/diffusion_pusht.yaml`](../../examples/sim/diffusion_pusht.yaml) — paired SimEnvironment config.
+- [`docs/reference/vla_compatibility.md`](../../docs/reference/vla_compatibility.md) — VLA × Robot × Sim matrix.
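
The README's remark that cached pops are essentially free can be illustrated with a minimal Python sketch of a queue-drain loop. This is not OpenRAL's `ChunkedExecutor`: the class `ChunkQueue`, the function `fake_policy`, and the counter `inference_calls` are hypothetical names introduced only for illustration, and the placeholder policy stands in for the 100-step DDPM sampler.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence, Tuple

Action = Tuple[float, float]  # PushT action: absolute (x, y) target


class ChunkQueue:
    """Hypothetical stand-in for a chunked executor (not the OpenRAL class)."""

    def __init__(self, infer_chunk: Callable[[object], Sequence[Action]]) -> None:
        self._infer_chunk = infer_chunk
        self._queue: Deque[Action] = deque()
        self.inference_calls = 0

    def step(self, observation: object) -> Action:
        # Pay for the expensive denoising loop only when the cache is empty.
        if not self._queue:
            self._queue.extend(self._infer_chunk(observation))
            self.inference_calls += 1
        # Every other step is a cheap pop from the cached chunk.
        return self._queue.popleft()


def fake_policy(observation: object) -> List[Action]:
    # Placeholder for the real sampler: returns one chunk of 8 actions.
    return [(float(i), float(i)) for i in range(8)]


executor = ChunkQueue(fake_policy)
actions = [executor.step(observation=None) for _ in range(20)]
print(executor.inference_calls)  # 20 steps with chunk size 8 need 3 inferences
```

With a chunk size of 8, only steps 0, 8, and 16 trigger inference, so the per-chunk latency is paid once every eight control steps.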

eval/README.md ADDED

	@@ -0,0 +1,14 @@

+# `rskills/diffusion-pusht/eval/` — benchmark results
+`pusht.json` is the PushT mean-coverage-IoU benchmark result block for this
+rSkill. Validated against
+[`openral_core.RSkillEvalResult`](../../../docs/reference/schemas/RSkillEvalResult.json)
+at load time by the `rSkill` loader and surfaced by `ral benchmark report`.
+| Field | Value |
+| --- | --- |
+| Source | Chi et al., 2023 — *Diffusion Policy: Visuomotor Policy Learning via Action Diffusion* (arxiv:2303.04137) |
+| Benchmark | PushT (`gym_pusht/PushT-v0`, pymunk 2-D rigid-body) |
+| Robot | PushT 2-D pseudo-robot (single 2-D end-effector tip) |
+| Reproduced locally? | ✓ (per `pusht.json`: `reproduced_locally: true`, 50 episodes via `ral benchmark run`). `tests/sim/test_pusht_2d_diffusion_pusht.py` runs a single episode for IO + latency + VRAM verification. |
+| Reproduce | `just sim-diffusion-pusht` (single episode); set `--n-episodes 50` for the full paper protocol. |

eval/pusht.json ADDED

	@@ -0,0 +1,90 @@

+{
+  "schema_version": "1",
+  "source": {
+    "paper": "https://arxiv.org/abs/2303.04137",
+    "arxiv": "https://arxiv.org/abs/2303.04137",
+    "model_variant": "diffusion",
+    "evaluated_by": "OpenRAL:ral benchmark run",
+    "reproduced_locally": true,
+    "reproduction_planned": null,
+    "reproduction_cli": "ral benchmark run --suite pusht --rskill rskill://diffusion-pusht",
+    "table": null,
+    "status": "reproduced"
+  },
+  "benchmark": {
+    "name": "PushT (gym-pusht)",
+    "dataset": null,
+    "protocol": "50 episodes per task, success_key=is_success, max_steps=300",
+    "robot": "pusht_2d",
+    "simulator": "gym-pusht (pymunk 2-D)"
+  },
+  "eval_config": {
+    "n_episodes": 50,
+    "seeds": [
+      0,
+      1,
+      2,
+      3,
+      4,
+      5,
+      6,
+      7,
+      8,
+      9,
+      10,
+      11,
+      12,
+      13,
+      14,
+      15,
+      16,
+      17,
+      18,
+      19,
+      20,
+      21,
+      22,
+      23,
+      24,
+      25,
+      26,
+      27,
+      28,
+      29,
+      30,
+      31,
+      32,
+      33,
+      34,
+      35,
+      36,
+      37,
+      38,
+      39,
+      40,
+      41,
+      42,
+      43,
+      44,
+      45,
+      46,
+      47,
+      48,
+      49
+    ],
+    "success_key": "is_success",
+    "max_steps": 300,
+    "vla_id": "diffusion",
+    "weights_uri": "rskill://rskills/diffusion-pusht"
+  },
+  "results": {
+    "pusht/0_success_rate": 0.6,
+    "avg_success_rate": 0.6,
+    "n_tasks": 1,
+    "n_episodes_per_task": 50,
+    "n_episodes_total": 50,
+    "mean_step_latency_ms_avg": 232.5852261891309,
+    "mean_coverage_iou": 0.9496237652727986
+  },
+  "baselines": {}
+}
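
The counts in this result block can be cross-checked mechanically. The Python sketch below copies the relevant fields from `pusht.json` into a dict and tests a few invariants; the function `check_results` and the specific invariants are assumptions made for illustration, not the validation performed by `openral_core.RSkillEvalResult`.

```python
import math
from typing import Dict, List

# Fields copied from eval/pusht.json (seeds 0..49 written as a range).
block: Dict[str, dict] = {
    "eval_config": {"n_episodes": 50, "seeds": list(range(50))},
    "results": {
        "pusht/0_success_rate": 0.6,
        "avg_success_rate": 0.6,
        "n_tasks": 1,
        "n_episodes_per_task": 50,
        "n_episodes_total": 50,
    },
}


def check_results(data: Dict[str, dict]) -> List[str]:
    """Return human-readable problems; an empty list means consistent."""
    problems: List[str] = []
    cfg, res = data["eval_config"], data["results"]
    # Per-task rates are every *_success_rate key except the average.
    task_rates = [
        value
        for key, value in res.items()
        if key.endswith("_success_rate") and key != "avg_success_rate"
    ]
    if len(task_rates) != res["n_tasks"]:
        problems.append("number of per-task rates differs from n_tasks")
    elif not math.isclose(sum(task_rates) / len(task_rates), res["avg_success_rate"]):
        problems.append("avg_success_rate is not the mean of per-task rates")
    if res["n_tasks"] * res["n_episodes_per_task"] != res["n_episodes_total"]:
        problems.append("n_episodes_total != n_tasks * n_episodes_per_task")
    if len(cfg["seeds"]) != cfg["n_episodes"]:
        problems.append("seed count differs from n_episodes")
    return problems


# A 0.6 success rate over 50 episodes corresponds to 30 successful episodes.
implied_successes = round(
    block["results"]["avg_success_rate"] * block["results"]["n_episodes_total"]
)
print(check_results(block), implied_successes)
```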

rskill.yaml ADDED

	@@ -0,0 +1,78 @@

+# rSkill manifest — OpenRAL packaging format V1 (CLAUDE.md §6.4)
+# Wraps: lerobot/diffusion_pusht (Apache-2.0)
+# Paper: Chi et al., 2023 — Diffusion Policy.
+schema_version: "1"
+name: "AdrianLlopart/rskill-diffusion-pusht"
+version: "0.1.0"
+license: "apache-2.0"
+role: "s1"
+model_family: "diffusion"
+# 2-D PushT pseudo-robot (single end-effector pushing a T block). Used by
+# tests/sim/test_pusht_2d_diffusion_pusht.py against gym_pusht/PushT-v0.
+embodiment_tags:
+  - "pusht"
+capabilities_required: {}
+# PushT exposes a single 96×96 RGB top-down stream (named
+# observation.image, not images.cameraN — PushT predates the multi-cam
+# convention used by SmolVLA/ACT).
+sensors_required:
+  - modality: "rgb"
+    vla_feature_key: "observation.image"
+    min_width: 96
+    min_height: 96
+# Output side (ADR-0013). The pusht_2d scene-pseudo-robot exposes a 2-D
+# (x, y) absolute position; robots/pusht_2d/robot.yaml advertises
+# `cartesian_pose` as its supported control mode (the codebase
+# convention for the PushT 2-D action regardless of dimensionality).
+# The loader auto-fills n_dof (2) + vla_action_key from the robot YAML.
+actuators_required:
+  - kind: "cartesian_pose"
+runtime: "pytorch"
+quantization:
+  dtype: "fp32"
+  backend: "pytorch"
+weights_uri: "hf://lerobot/diffusion_pusht"
+chunk_size: 8
+latency_budget:
+  # Reference-host measurement (RTX 4070 Laptop, CUDA 12.8, PyTorch 2.10)
+  # of the warm full-chunk inference is 1756 ms — Diffusion Policy runs
+  # 100 DDPM denoising steps per chunk, the dominant cost in the suite.
+  # Pinning per_chunk_ms to 1250 ms with tolerance_pct=100 yields the
+  # previous 2.5 s ceiling (_WARM_CHUNK_CEILING_S in the sim test).
+  per_chunk_ms: 1250.0
+  warmup_ms: 10000.0
+  load_ms: 30000.0
+fallback_skill_id: null
+# Headline success rate from rskills/diffusion-pusht/eval/pusht.json.
+benchmarks:
+  pusht: 0.60
+# PushT is a 2-DoF planar pushing benchmark; proprio state is 2-D
+# (x, y) of the end effector.
+policy_id: "diffusion"
+state_contract:
+  dim: 2
+paper_url: "https://arxiv.org/abs/2303.04137"
+source_repo: "hf://lerobot/diffusion_pusht"
+description: >
+  Diffusion Policy (~263M-param U-Net with 100-step DDPM denoiser) for
+  the PushT 2-DoF pushing benchmark. Action chunks of length 8 within a
+  horizon of 16. The chunk inference cost is dominated by the denoising
+  loop, so cached pops are essentially free — this is the extreme test
+  of the queue-drain contract.
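
The latency arithmetic in the `latency_budget` comments can be written out as a short Python sketch. It assumes that `tolerance_pct` widens the budget multiplicatively, which is the reading that matches the stated 2.5 s ceiling; the function name `chunk_ceiling_ms` is hypothetical and the loader's actual check is not shown in this commit.

```python
def chunk_ceiling_ms(per_chunk_ms: float, tolerance_pct: float) -> float:
    """Budget plus tolerance, assuming tolerance is a percentage of the budget."""
    return per_chunk_ms * (1.0 + tolerance_pct / 100.0)


PER_CHUNK_MS = 1250.0      # latency_budget.per_chunk_ms
TOLERANCE_PCT = 100.0      # tolerance_pct quoted in the manifest comment
MEASURED_CHUNK_MS = 1756.0  # warm full-chunk time on the reference host
CHUNK_SIZE = 8             # chunk_size

ceiling = chunk_ceiling_ms(PER_CHUNK_MS, TOLERANCE_PCT)
within_budget = MEASURED_CHUNK_MS <= ceiling
# Cost per control step if one inference serves a whole chunk of actions.
amortized_step_ms = MEASURED_CHUNK_MS / CHUNK_SIZE

print(ceiling, within_budget, amortized_step_ms)  # 2500.0 True 219.5
```

Under these assumptions the measured 1756 ms sits inside the 2500 ms ceiling even though it exceeds the nominal 1250 ms budget.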