# NAKA: Code-as-Policy Robot Executor

Part of the ANIMA Robotics Intelligence Suite by Robot Flow Labs.
NAKA is a Code-as-Policy executor that uses LLMs to generate Python code for robot manipulation. Instead of executing fixed policies, the robot writes its own programs, composing perception, planning, and control primitives to solve novel tasks.
## Results: Exceeds Human Expert
| Task | S1 (N=50) | S2 (N=50) | M1 (N=15) | Paper Best | Human Expert |
|---|---|---|---|---|---|
| Cube Lift | 90% | 100% | 100% | 45% | 93% |
| Cube Stack | 72% | 80% | 93% | 30% | 73% |
- cube_lift S2 = 100%, surpassing the human expert (93%) without any RL training
- Exceeds the CaP-X paper's best results (NVIDIA, UC Berkeley, Stanford, CMU) by 2x or more on all configurations
- Zero-shot code generation with MiniMax M2.7
## Paper

**CaP-X: A Framework for Benchmarking and Improving Coding Agents for Robot Manipulation.** Max Fu, Justin Yu, Karim El-Refai, Ethan Kou, Haoru Xue, et al. NVIDIA, UC Berkeley, Stanford, Carnegie Mellon University. arXiv:2603.22435.
## Architecture

User: "Pick up the red cube and lift it"
→ ANIMA Compiler → NAKA Module
→ LLM generates Python code:

```python
pos, quat = sample_grasp_pose("red cube")
open_gripper()
goto_pose(pos, quat, z_approach=0.1)
close_gripper()
lift = pos + np.array([0, 0, 0.25])
goto_pose(lift, quat)
```

→ Code executes on the real robot (MuJoCo sim or Franka Panda)
→ Reward = 1.0 (task completed)
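The generated program above can be exercised end to end with stub primitives. This is a minimal sketch, assuming a mock environment: `sample_grasp_pose`, `goto_pose`, and the gripper calls here are hypothetical stand-ins for the real MuJoCo/Franka bindings, and the grasp is assumed to succeed.

```python
import numpy as np

# Mock of the primitive API the generated code composes. The real
# primitives talk to MuJoCo or a Franka Panda; these stubs only track
# state so the generated snippet can be run offline.
state = {"gripper": "open", "ee_pos": np.zeros(3), "held": None}

def sample_grasp_pose(obj_name):
    # Hypothetical: the real system queries perception for a grasp pose.
    return np.array([0.4, 0.0, 0.05]), np.array([1.0, 0.0, 0.0, 0.0])

def open_gripper():
    state["gripper"] = "open"

def close_gripper():
    state["gripper"] = "closed"
    state["held"] = "red cube"  # assume the grasp succeeds at the target pose

def goto_pose(pos, quat, z_approach=0.0):
    state["ee_pos"] = np.asarray(pos, dtype=float)

# The generated program from the example above:
pos, quat = sample_grasp_pose("red cube")
open_gripper()
goto_pose(pos, quat, z_approach=0.1)
close_gripper()
lift = pos + np.array([0, 0, 0.25])
goto_pose(lift, quat)

# Toy success check: holding the object above a lift threshold.
reward = 1.0 if state["held"] and state["ee_pos"][2] > 0.2 else 0.0
print(reward)  # → 1.0
```

In the real executor the reward comes from the environment, not from the generated code; the stub check only mirrors the "Reward = 1.0" step of the flow.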
## 7 Benchmark Tasks

- **Cube Lift** – single-arm pick and lift
- **Cube Stack** – pick one cube and stack it on another
- **Spill Wipe** – wipe the table surface with an attached sponge
- **Peg Insertion** – precision assembly (nut on peg)
- **Cube Re-stack** – bimanual reordering
- **Two-Arm Lift** – bimanual coordinated lift
- **Two-Arm Handover** – bimanual object transfer
## Key Innovations
- **Prompt Engineering over RL Training** – our system prompt with API examples reaches human-level performance without any model fine-tuning
- **OSC_POSE Controller** – delta-based end-effector control, more reliable than joint-position control
- **Fuzzy Object Matching** – "red cube" automatically matches the MuJoCo body "cube_main"
- **Multi-Turn with Reward Feedback** – the agent sees the reward and completion status, and retries on failure
- **Parallel Ensemble** – MiniMax M2.7 and GLM 5.1 queried in parallel for robust code synthesis
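The fuzzy object matching above can be sketched with the standard library. This is a hedged illustration, not the actual implementation: the body names, the token-substring heuristic, and the 0.5 similarity cutoff are assumptions.

```python
import difflib

def match_body(query: str, body_names: list[str], cutoff: float = 0.5):
    """Map a natural-language object name to a MuJoCo body name.

    Hypothetical sketch: first prefer bodies containing a query token
    as a substring ("cube" → "cube_main"), then fall back to fuzzy
    string similarity. The real matcher may work differently.
    """
    tokens = query.lower().split()
    for name in body_names:
        if any(tok in name.lower() for tok in tokens):
            return name
    # Fallback: closest name by difflib similarity, or None if nothing is close.
    close = difflib.get_close_matches("_".join(tokens), body_names, n=1, cutoff=cutoff)
    return close[0] if close else None

bodies = ["table", "peg_base", "cube_main", "sponge"]
print(match_body("red cube", bodies))  # → cube_main
```

Here the color token "red" matches nothing, but "cube" picks out `cube_main`; queries with no plausible match return `None` so the executor can surface an error instead of grasping the wrong body.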
## Files

| Path | Description |
|---|---|
| `paper.pdf` | CaP-X paper (arXiv 2603.22435) |
| `BENCHMARK_REPORT.md` | Full benchmark results with paper comparison |
| `TRAINING_REPORT.md` | Comprehensive training and infrastructure report |
| `configs/` | Training configurations (GRPO, debug) |
| `logs/` | Benchmark metrics (JSON) |
| `code/` | Key source files for reproducibility |
## GRPO RL Training (Ready)
NAKA includes a GRPO (Group Relative Policy Optimization) trainer for fine-tuning Qwen2.5-7B-Instruct on the CaP-Gym tasks. The training loop is verified, but a full training run requires a dedicated GPU (RTX 6000 Pro).

Expected result from the paper: Qwen 7B improves from 25% → 80% on cube_lift after 50 GRPO iterations.
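The core of GRPO is that advantages are computed relative to a group of rollouts for the same prompt, with no learned value function. A minimal sketch of that advantage computation (the full trainer also applies a clipped policy-gradient loss and a KL penalty, which are omitted here):

```python
import statistics

def grpo_advantages(group_rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Group-relative advantages: normalize each rollout's reward by the
    mean and std of its own group (all rollouts for one task prompt).

    Sketch of the core GRPO idea only; hyperparameters and the
    surrounding loss terms are not shown.
    """
    mean = statistics.fmean(group_rewards)
    std = statistics.pstdev(group_rewards)
    # eps guards the all-equal-rewards case, where every advantage is 0.
    return [(r - mean) / (std + eps) for r in group_rewards]

# Four code rollouts for one cube_lift prompt: two succeed, two fail.
adv = grpo_advantages([1.0, 0.0, 1.0, 0.0])
print(adv)
```

Successful rollouts get positive advantage and failed ones negative, so the policy is pushed toward the code variants that earned reward within each group.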
## Docker

```shell
docker build -f Dockerfile.sim -t naka-sim:latest .
docker run --gpus '"device=0"' --network host naka-sim:latest
# Open browser: http://localhost:8090/
```
## ANIMA Integration

NAKA is a module in the ANIMA robotics compiler. When ANIMA encounters a novel task that cannot be solved by its fixed pipelines, it routes the task to NAKA, which generates custom Python code composing other ANIMA modules (perception, planning, control).
## License

MIT – Robot Flow Labs / AIFLOW LABS LIMITED
## Citation

```bibtex
@article{fu2026capx,
  title={CaP-X: A Framework for Benchmarking and Improving Coding Agents for Robot Manipulation},
  author={Fu, Max and Yu, Justin and El-Refai, Karim and Kou, Ethan and Xue, Haoru and others},
  journal={arXiv preprint arXiv:2603.22435},
  year={2026}
}
```