---
license: apache-2.0
language:
- en
pipeline_tag: robotics
library_name: transformers
tags:
- moe
- rio2
- diffusion-jepa
- safetensors
datasets:
- allenai/MolmoAct2-SO100_101-Dataset
- allenai/MolmoAct2-DROID-Dataset
- allenai/MolmoAct2-LIBERO-Dataset
---

![](https://huggingface.co/hoguai/RIO-2/resolve/main/rio2.png)

**RIO-2**

RIO-2 is a two-rate WAM (World Action Model) built for robotics. It is composed of a low-frequency visual-language S2 backbone and a high-frequency JEPA-diffusion S1 action policy.

The model is designed to separate slow scene understanding from fast robot control:

- S2 refreshes visual-language context at low frequency.
- Bridge/compressor modules convert S2 context into compact action-conditioning tokens.
- S1 runs high-frequency action generation from cached S2 tokens and robot state.
- JEPA latent prediction provides an auxiliary future-action representation.
- A 10-expert S1 MoE residual path expands action capacity while top-1 expert activation keeps inference efficient.

![image](https://huggingface.co/hoguai/RIO-2/resolve/main/rio2diagram.png)

RIO-2 uses a JEPA-diffusion S1 action policy for general, flexible robot control at high frequency. S1 is an MoE policy with 10 experts, each with 100M parameters. RIO-2's task memory maintains a small EMA latent memory over recent S2 context for longer-horizon task continuity. The S2 policy is inspired by allenai/MolmoAct2.

**This repo uses Hub custom code.
Pass `trust_remote_code=True` until RIO-2 is merged into Transformers.**

**RIO-2 is trained on allenai's open datasets.**

**Key Configuration**

```
state_dim: 6
action_dim: 6
action_horizon: 30
s2_token_count: 16
s2_width: 1024
s1_width: 384
s1_layers: 6
s1_heads: 8
s1_policy_mode: jepa_diffusion
s1_moe_num_experts: 10
s1_moe_top_k: 1
dtype: bfloat16
```

**How To Load RIO-2 In Python**

```python
import torch
from transformers import AutoModel, AutoProcessor

# Custom Hub code is required until RIO-2 is merged into Transformers.
model = AutoModel.from_pretrained(
    "hoguai/RIO-2",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(
    "hoguai/RIO-2",
    trust_remote_code=True,
)

model.load_s2_base(device="cuda")

# `image` (current camera frame) and `state` (robot state, 6 values per state_dim)
# are not defined here; they come from your own robot stack.
model.refresh_s2(image, "pick up the red cube", force=True)  # slow S2 refresh
actions = model.act_fast(state, steps=2)  # fast S1 action generation
```

**Runtime Pattern**

RIO-2 is intended to run as a two-rate policy:

1. Refresh S2 when the scene or instruction changes, or at a low fixed rate.
2. Reuse cached S2 tokens inside the high-frequency control loop.
3. Call `act_fast()` repeatedly with the latest robot state.
4. Execute only the safe portion of the returned action chunk through an external safety controller.

**Safety**

RIO-2 outputs continuous robot actions and must not be connected directly to real hardware. Always place the policy behind a robot safety layer with joint limits, velocity/acceleration/jerk limits, workspace constraints, a watchdog, an E-stop, and a fallback controller.
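
The runtime pattern and safety guidance above can be sketched as a small simulated loop. This is an illustration under stated assumptions, not the RIO-2 implementation: `StubPolicy` is a hypothetical stand-in that only mimics the `refresh_s2` / `act_fast` call pattern from the loading example, and `clamp_chunk`, `run_control_loop`, and all limit values are made up for the example. A real deployment still needs a full safety controller.

```python
from typing import Callable, List, Optional, Sequence

ACTION_DIM = 6                 # matches action_dim in Key Configuration
LOWER = [-0.25] * ACTION_DIM   # hypothetical joint lower limits
UPPER = [0.25] * ACTION_DIM    # hypothetical joint upper limits
MAX_DELTA = 0.1                # hypothetical max change per control tick


class StubPolicy:
    """Hypothetical stand-in for the model; it only mimics the call pattern."""

    def __init__(self) -> None:
        self.s2_refreshes = 0
        self.instruction: Optional[str] = None

    def refresh_s2(self, image, instruction: str, force: bool = False) -> None:
        # The real model would recompute and cache S2 tokens here.
        self.s2_refreshes += 1
        self.instruction = instruction

    def act_fast(self, state: Sequence[float], steps: int = 2) -> List[List[float]]:
        # Placeholder chunk with deliberately large jumps so the filter has work to do.
        return [[s + 0.5 * (k + 1) for s in state] for k in range(steps)]


def clamp_chunk(chunk, state, lower, upper, max_delta):
    """Rate-limit and range-limit every action in a chunk (illustrative only)."""
    safe, prev = [], list(state)
    for action in chunk:
        step = []
        for a, p, lo, hi in zip(action, prev, lower, upper):
            a = max(p - max_delta, min(p + max_delta, a))  # per-tick rate limit
            a = max(lo, min(hi, a))                        # joint range limit
            step.append(a)
        safe.append(step)
        prev = step
    return safe


def run_control_loop(policy, get_image: Callable, get_state: Callable,
                     send_action: Callable, instruction: str,
                     ticks: int, s2_every: int) -> None:
    for tick in range(ticks):
        # Slow path: refresh S2 context at a low fixed rate.
        if tick % s2_every == 0:
            policy.refresh_s2(get_image(), instruction, force=True)
        # Fast path: reuse cached S2 context with the latest robot state.
        state = get_state()
        chunk = policy.act_fast(state, steps=2)
        safe = clamp_chunk(chunk, state, LOWER, UPPER, MAX_DELTA)
        send_action(safe[0])  # execute one filtered action, then replan


def demo():
    policy = StubPolicy()
    state = [0.0] * ACTION_DIM
    sent: List[List[float]] = []

    def send_action(action: List[float]) -> None:
        sent.append(action)
        state[:] = action  # pretend the robot reaches the command exactly

    run_control_loop(policy, get_image=lambda: None,
                     get_state=lambda: list(state), send_action=send_action,
                     instruction="pick up the red cube", ticks=10, s2_every=5)
    return policy, sent
```

Calling `demo()` runs ten fast ticks with an S2 refresh every fifth tick. Sending only the first filtered action per tick is one conservative reading of "execute only the safe portion"; executing a longer filtered prefix of the chunk is an equally valid choice if the safety controller supports it.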