---
license: apache-2.0
tags:
- reinforcement-learning
- ppo
- robotics
- sanskrit
- paramtatva
- mujoco
- pybullet
- dm-control
- calvin
- rlbench
- metaworld
language:
- sa
---
R-PPO-SOTA — Robotics PPO Baseline Benchmarks
© 2026 ParamTatva.org — All Rights Reserved
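State-of-the-art PPO baselines across six robotics benchmark suites (MuJoCo, MetaWorld, CALVIN, DM Control Suite, PyBullet, and RLBench), trained as part of the Robotics track (R) of the ParamTatva Resonance Language Model. Results are complete for MuJoCo, MetaWorld, and CALVIN; DM Control Suite and PyBullet runs are in progress, and RLBench is pending.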
Benchmark Results
MuJoCo (Gymnasium v5) — 10M steps each
| Environment | Best Return | Steps |
|---|---|---|
| HalfCheetah-v5 | 5,803.9 | 10M |
| Walker2d-v5 | 4,918.5 | 10M |
| Hopper-v5 | 3,183.2 | 10M |
| Ant-v5 | 886.6 | 10M |
| Humanoid-v5 | 573.8 | 10M |
| Reacher-v5 | -4.2 | 10M |
MetaWorld (10 Tasks) — 500K steps each
| Task | Best Return | Success Rate |
|---|---|---|
| drawer-close-v3 | 9.5 | 95% |
| reach-v3 | 8.3 | 25% |
| window-open-v3 | 8.0 | 95% |
| drawer-open-v3 | 7.7 | 95% |
| button-press-topdown-v3 | 6.1 | 95% |
| window-close-v3 | 5.2 | 95% |
| door-open-v3 | 4.8 | 25% |
| push-v3 | 2.7 | 10% |
| peg-insert-side-v3 | 1.7 | 0% |
| pick-place-v3 | 0.4 | 5% |
CALVIN (5 Tasks) — ~3M steps each
| Task | Best Return |
|---|---|
| place-in-drawer | -3.2 |
| pick-up-block | -3.6 |
| turn-on-lightbulb | -3.7 |
| close-drawer | -4.2 |
| open-drawer | -6.1 |
DM Control Suite (Running)
| Task | Best Return |
|---|---|
| finger-spin | 621.0 |
| cartpole-swingup | 616.1 |
| 5 more tasks | in progress (RTX 3090) |
PyBullet (Running on RTX 3090)
RLBench (Pending — Docker container required)
Architecture
Standard PPO with:
- Orthogonal weight initialization
- GAE (λ=0.95, γ=0.99)
- Linear LR annealing
- Gradient clipping (max norm 0.5)
- Centralized observation/reward normalization (SubprocVecEnv)
- Our proprietary encoder for Sanskrit-conditioned variants
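
Two of the components listed above, GAE and linear learning-rate annealing, can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the training code behind these results: the function names `compute_gae` and `linear_lr` are hypothetical, and only the defaults γ=0.99 and λ=0.95 come from this card. Orthogonal initialization, gradient clipping, and normalization depend on the deep-learning framework in use and are not shown.

```python
from typing import List, Sequence


def compute_gae(
    rewards: Sequence[float],
    values: Sequence[float],
    dones: Sequence[bool],
    last_value: float,
    gamma: float = 0.99,  # discount factor, as listed in this card
    lam: float = 0.95,    # GAE lambda, as listed in this card
) -> List[float]:
    """Return GAE advantages for one rollout of length T.

    values[t] is the critic estimate V(s_t); last_value is V(s_T), used to
    bootstrap past the end of the rollout. dones[t] is True when the episode
    ended after the action taken at step t.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    for t in reversed(range(len(rewards))):
        next_value = last_value if t == len(rewards) - 1 else values[t + 1]
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error; the bootstrap term is dropped at episode ends.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        # Discounted sum of TD errors, reset at episode boundaries.
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
    return advantages


def linear_lr(initial_lr: float, update: int, total_updates: int) -> float:
    """Linearly anneal the learning rate from initial_lr toward zero.

    update is the zero-based index of the current policy update.
    """
    return initial_lr * (1.0 - update / total_updates)
```

In a PPO loop, `compute_gae` would typically run once per rollout before the policy update, and the value targets can be recovered as `advantages[t] + values[t]`. The learning rate from `linear_lr` would be written into the optimizer at the start of each update.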
Hardware
- 4× T4 GPUs (Google Cloud) — MuJoCo, MetaWorld, CALVIN
- RTX 3090 (Local) — DM Control, PyBullet, RLBench
Citation
@misc{paramtatva2026rpposota,
title={R-PPO-SOTA: Robotics PPO Baselines},
author={ParamTatva.org},
year={2026},
url={https://huggingface.co/ParamTatva/R-PPO-SOTA}
}
License
Apache 2.0 — © 2026 ParamTatva.org