How to use OpenRAL/rskill-smolvla-robotwin with LeRobot:
```bash
# See https://github.com/huggingface/lerobot?tab=readme-ov-file#installation for more details
git clone https://github.com/huggingface/lerobot.git
cd lerobot
pip install -e ".[smolvla]"
```

```bash
# Launch finetuning on your dataset
python lerobot/scripts/train.py \
  --policy.path=OpenRAL/rskill-smolvla-robotwin \
  --dataset.repo_id=lerobot/svla_so101_pickplace \
  --batch_size=64 \
  --steps=20000 \
  --output_dir=outputs/train/my_smolvla \
  --job_name=my_smolvla_training \
  --policy.device=cuda \
  --wandb.enable=true
```

```bash
# Run the policy using the record function.
# Replace the port, robot id, and cameras with your own, and use the same task
# description you used in your dataset recording. --dataset.repo_id will be the
# dataset name on the HF Hub.
python -m lerobot.record \
  --robot.type=so101_follower \
  --robot.port=/dev/ttyACM0 \
  --robot.id=my_blue_follower_arm \
  --robot.cameras="{ front: {type: opencv, index_or_path: 8, width: 640, height: 480, fps: 30}}" \
  --dataset.single_task="Grasp a lego block and put it in the bin." \
  --dataset.repo_id=HF_USER/dataset_name \
  --dataset.episode_time_s=50 \
  --dataset.num_episodes=10 \
  --policy.path=OpenRAL/rskill-smolvla-robotwin
```
rskill-smolvla-robotwin
OpenRAL rSkill — SmolVLA (0.45 B) finetuned on the RoboTwin 2.0 unified dual-arm dataset (50 bimanual SAPIEN tasks, aloha-agilex embodiment), packaged for use with the OpenRAL robot agent framework.
This package wraps `lerobot/smolvla_robotwin` with a `rskill.yaml` manifest that adds capability checking, license surfacing, latency budgets, and local registry integration. It does not copy model weights.
What this skill does
A multi-task dual-arm policy for the RoboTwin 2.0 benchmark (Chen et al., arXiv 2506.18088). It takes three RGB views (head plus one per wrist) and predicts action chunks of length 50, where each step is a 14-DoF dual-arm joint command for the AgileX "aloha-agilex" embodiment.
| Field | Value |
|---|---|
| Actions | generalist, pick, place, transfer |
| Objects | block, pot, cup, hammer |
| Scenes | tabletop |
| Embodiment | aloha_agilex |
| Action space | 14-D joint position |
| Cameras | camera1 (head), camera2 (left wrist), camera3 (right wrist), 256×256 |
How it works
OpenRAL loads the upstream LeRobot SmolVLA policy from hf://lerobot/smolvla_robotwin
and uses the in-tree smolvla adapter to run chunked inference. RoboTwin itself runs in a
separate SAPIEN/CuRobo sidecar; the sidecar returns three RGB views plus the 14-D
aloha-agilex joint state, and the adapter replays 50-action chunks as absolute 14-D joint
position commands.
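The chunk replay described above can be sketched as follows. This is a minimal illustration, not the in-tree smolvla adapter: `ChunkReplayer` and `dummy_policy` are hypothetical names, and the stand-in policy just holds the current joint state instead of running SmolVLA.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

CHUNK_SIZE = 50   # actions per chunk, per the manifest
ACTION_DIM = 14   # dual-arm joint positions

Action = List[float]
Observation = Dict[str, object]


class ChunkReplayer:
    """Queue one predicted chunk and hand out one action per control step."""

    def __init__(self, predict_chunk: Callable[[Observation], List[Action]]) -> None:
        self._predict_chunk = predict_chunk
        self._queue: Deque[Action] = deque()
        self.num_inferences = 0

    def step(self, observation: Observation) -> Action:
        # Run the expensive policy call only when the previous chunk is used up.
        if not self._queue:
            chunk = self._predict_chunk(observation)
            if len(chunk) != CHUNK_SIZE or any(len(a) != ACTION_DIM for a in chunk):
                raise ValueError("expected a (50, 14) action chunk")
            self._queue.extend(chunk)
            self.num_inferences += 1
        return self._queue.popleft()


def dummy_policy(observation: Observation) -> List[Action]:
    """Hypothetical stand-in for the policy: hold the current joint state."""
    state = list(observation["observation.state"])
    return [list(state) for _ in range(CHUNK_SIZE)]


replayer = ChunkReplayer(dummy_policy)
obs = {"observation.state": [0.0] * ACTION_DIM}
actions = [replayer.step(obs) for _ in range(120)]
print(len(actions), replayer.num_inferences)  # 120 control steps, 3 policy calls
```

With `chunk_size` equal to `n_action_steps`, every predicted action is executed before the policy is queried again, so the policy runs once per 50 control steps.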
Sensors / observation contract
| Direction | Key | Shape | Notes |
|---|---|---|---|
| in | `observation.images.camera1` | `(3, 256, 256)` RGB | Head / overhead view, re-keyed from RoboTwin `head_camera`. |
| in | `observation.images.camera2` | `(3, 256, 256)` RGB | Left wrist view, re-keyed from RoboTwin `left_camera`. |
| in | `observation.images.camera3` | `(3, 256, 256)` RGB | Right wrist view, re-keyed from RoboTwin `right_camera`. |
| in | `observation.state` | `(14,)` float32 | aloha-agilex dual-arm joint state. |
| out | action chunk | `(50, 14)` float32 | Absolute dual-arm joint position commands. |
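A minimal sketch of the re-keying and shape checks implied by the table above. The helper names (`shape_of`, `build_observation`, `blank_image`) are hypothetical and not part of OpenRAL or LeRobot. The sketch assumes frames already arrive channel-first at 256×256; any resizing or layout conversion is out of scope.

```python
from typing import Any, Dict, Tuple

# RoboTwin camera name -> policy observation key (from the contract table).
CAMERA_REKEY = {
    "head_camera": "observation.images.camera1",
    "left_camera": "observation.images.camera2",
    "right_camera": "observation.images.camera3",
}

EXPECTED_SHAPES = {
    "observation.images.camera1": (3, 256, 256),
    "observation.images.camera2": (3, 256, 256),
    "observation.images.camera3": (3, 256, 256),
    "observation.state": (14,),
}


def shape_of(value: Any) -> Tuple[int, ...]:
    """Shape of an array-like (uses .shape if present) or of a nested list."""
    if hasattr(value, "shape"):
        return tuple(value.shape)
    dims = []
    while isinstance(value, (list, tuple)):
        dims.append(len(value))
        value = value[0] if value else None
    return tuple(dims)


def build_observation(frames: Dict[str, Any], joint_state: Any) -> Dict[str, Any]:
    """Re-key camera frames and check every entry against the contract."""
    obs = {CAMERA_REKEY[name]: frame for name, frame in frames.items()}
    obs["observation.state"] = joint_state
    for key, expected in EXPECTED_SHAPES.items():
        if key not in obs:
            raise KeyError(f"missing observation key: {key}")
        actual = shape_of(obs[key])
        if actual != expected:
            raise ValueError(f"{key}: expected shape {expected}, got {actual}")
    return obs


def blank_image() -> list:
    """Placeholder (3, 256, 256) frame built from nested lists."""
    return [[[0.0] * 256 for _ in range(256)] for _ in range(3)]


frames = {name: blank_image() for name in CAMERA_REKEY}
observation = build_observation(frames, [0.0] * 14)
print(sorted(observation))
```

Because `shape_of` falls back to `.shape` when it exists, the same check should also accept NumPy arrays or tensors in place of the nested lists used here.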
Manifest summary
| Field | Value |
|---|---|
| `name` | `OpenRAL/rskill-smolvla-robotwin` |
| `version` | `0.1.0` |
| `license` | `apache-2.0` |
| `role` | `s1` |
| `model_family` | `smolvla` |
| `embodiment_tags` | `aloha_agilex` |
| `runtime` / `quantization.dtype` | `pytorch` / `bf16` |
| `weights_uri` | `hf://lerobot/smolvla_robotwin` |
| `state_contract.dim` / `action_contract.dim` | `14` / `14` |
| `chunk_size` / `n_action_steps` | `50` / `50` |
| `latency_budget.per_chunk_ms` | `250.0` |
| `evaluated_tasks` | `robotwin` |
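To illustrate how these fields could drive a capability check, here is a hedged sketch. The dictionary mirrors the summary table, but its nesting is an assumption about `rskill.yaml`, and `check_robot`, `within_latency_budget`, and `amortized_ms_per_action` are hypothetical helpers rather than OpenRAL APIs.

```python
from typing import Any, Dict, List

# Values copied from the manifest summary; the nested layout is assumed.
MANIFEST: Dict[str, Any] = {
    "name": "OpenRAL/rskill-smolvla-robotwin",
    "embodiment_tags": ["aloha_agilex"],
    "state_contract": {"dim": 14},
    "action_contract": {"dim": 14},
    "chunk_size": 50,
    "n_action_steps": 50,
    "latency_budget": {"per_chunk_ms": 250.0},
}


def check_robot(manifest: Dict[str, Any], embodiment: str,
                state_dim: int, action_dim: int) -> List[str]:
    """Return a list of mismatches between a robot and the manifest."""
    problems = []
    if embodiment not in manifest["embodiment_tags"]:
        problems.append(f"embodiment {embodiment!r} not in {manifest['embodiment_tags']}")
    if state_dim != manifest["state_contract"]["dim"]:
        problems.append(f"state dim {state_dim} != {manifest['state_contract']['dim']}")
    if action_dim != manifest["action_contract"]["dim"]:
        problems.append(f"action dim {action_dim} != {manifest['action_contract']['dim']}")
    return problems


def within_latency_budget(manifest: Dict[str, Any], measured_chunk_ms: float) -> bool:
    """True if one chunk inference fits the per-chunk budget."""
    return measured_chunk_ms <= manifest["latency_budget"]["per_chunk_ms"]


def amortized_ms_per_action(manifest: Dict[str, Any]) -> float:
    """Per-chunk budget spread over the actions executed from each chunk."""
    return manifest["latency_budget"]["per_chunk_ms"] / manifest["n_action_steps"]


print(check_robot(MANIFEST, "aloha_agilex", 14, 14))       # no mismatches
print(check_robot(MANIFEST, "so101_follower", 6, 6))       # three mismatches
print(amortized_ms_per_action(MANIFEST))                   # 250.0 / 50 = 5.0
```

Spread over the 50 actions executed per chunk, the 250 ms per-chunk budget works out to 5 ms of inference time per action.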
How to run it
RoboTwin runs on SAPIEN out-of-process via a Python 3.10 sidecar (ADR-0061) — its stack is incompatible with the openral 3.12 venv. Provision the sidecar venv, then:
```bash
# openral-side wire (pyzmq + msgpack)
just sync --all-packages --group robotwin --inexact

# single task
openral benchmark scene \
  --config scenes/benchmark/robotwin_lift_pot.yaml \
  --rskill rskills/smolvla-robotwin

# the 5-task suite
openral benchmark run --suite robotwin --vla smolvla:rskills/smolvla-robotwin
```
See ADR-0061 for the SAPIEN+RoboTwin sidecar provisioning recipe
(OPENRAL_ROBOTWIN_AUTO_PROVISION=1 or the manual conda recipe).
Provenance
- Weights: `lerobot/smolvla_robotwin` (Apache-2.0), base `lerobot/smolvla_base`.
- Dataset: `lerobot/robotwin_unified` (Apache-2.0; `pepijn223/robotwin_unified_v3` renamed).
- Eval protocol: RoboTwin official (100 episodes/task, sim built-in success, `episode_length=300`). No locally-reproduced official numbers shipped yet (`eval/` is empty). The current website artifact is a 150-step GPU `openral benchmark scene` smoke clip (`robotwin_smolvla-robotwin_fail.mp4`, `success=False`) for visual validation only; populate `eval/` with `openral benchmark run --suite robotwin` on the eval host.
STATE NOTE: the live RoboTwin sidecar returns a 14-D aloha-agilex state, and the official `policy_preprocessor.json` normalization stats expect `observation.state` shape `(14,)`; `rskill.yaml` pins `state_contract.dim: 14` accordingly (ADR-0061 §Live verification).
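For reference, the per-task success rate under the 100-episodes-per-task protocol could be aggregated as sketched below. The task names and outcomes are synthetic placeholders, not measured results, and `success_rate` and `summarize` are hypothetical helpers that do not reflect the format `openral benchmark run` writes.

```python
from typing import Dict, List

EPISODES_PER_TASK = 100  # from the RoboTwin official protocol


def success_rate(outcomes: List[bool]) -> float:
    """Fraction of successful episodes; insists on the full episode count."""
    if len(outcomes) != EPISODES_PER_TASK:
        raise ValueError(f"expected {EPISODES_PER_TASK} episodes, got {len(outcomes)}")
    return sum(outcomes) / len(outcomes)


def summarize(results: Dict[str, List[bool]]) -> Dict[str, float]:
    """Per-task success rates plus their unweighted mean."""
    per_task = {task: success_rate(outcomes) for task, outcomes in results.items()}
    mean = sum(per_task.values()) / len(per_task)
    return {**per_task, "mean": mean}


# Synthetic placeholder outcomes, for illustration only.
placeholder_results = {
    "task_a": [i < 25 for i in range(EPISODES_PER_TASK)],
    "task_b": [False] * EPISODES_PER_TASK,
}
summary = summarize(placeholder_results)
print(summary)
```

With 100 episodes per task, each reported rate has a resolution of one percentage point.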
License
This rSkill wrapper, the upstream lerobot/smolvla_robotwin checkpoint, and the
lerobot/robotwin_unified dataset are Apache-2.0. The package does not copy weights into
this repository; runtime loading still emits OpenRAL's unverified-provenance warning until
the planned signing control exists.