license: apache-2.0
tags:
  - reinforcement-learning
  - ppo
  - robotics
  - sanskrit
  - paramtatva
  - mujoco
  - pybullet
  - dm-control
  - calvin
  - rlbench
  - metaworld
language:
  - sa

R-PPO-SOTA — Robotics PPO Baseline Benchmarks

© 2026 ParamTatva.org — All Rights Reserved

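State-of-the-art PPO baselines across six robotics benchmark suites (MuJoCo, MetaWorld, CALVIN, DM Control Suite, PyBullet, and RLBench), trained as part of the Robotics track (R) of the ParamTatva Resonance Language Model. Results are reported for MuJoCo, MetaWorld, and CALVIN; DM Control Suite and PyBullet runs are still in progress, and RLBench is pending.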

Benchmark Results

MuJoCo (Gymnasium v5) — 10M steps each

| Environment | Best Return | Steps |
|---|---|---|
| HalfCheetah-v5 | 5,803.9 | 10M |
| Walker2d-v5 | 4,918.5 | 10M |
| Hopper-v5 | 3,183.2 | 10M |
| Ant-v5 | 886.6 | 10M |
| Humanoid-v5 | 573.8 | 10M |
| Reacher-v5 | -4.2 | 10M |

MetaWorld (10 Tasks) — 500K steps each

| Task | Best Return | Success Rate |
|---|---|---|
| drawer-close-v3 | 9.5 | 95% |
| reach-v3 | 8.3 | 25% |
| window-open-v3 | 8.0 | 95% |
| drawer-open-v3 | 7.7 | 95% |
| button-press-topdown-v3 | 6.1 | 95% |
| window-close-v3 | 5.2 | 95% |
| door-open-v3 | 4.8 | 25% |
| push-v3 | 2.7 | 10% |
| peg-insert-side-v3 | 1.7 | 0% |
| pick-place-v3 | 0.4 | 5% |

CALVIN (5 Tasks) — ~3M steps each

| Task | Best Return |
|---|---|
| place-in-drawer | -3.2 |
| pick-up-block | -3.6 |
| turn-on-lightbulb | -3.7 |
| close-drawer | -4.2 |
| open-drawer | -6.1 |

DM Control Suite (Running)

| Task | Best Return |
|---|---|
| finger-spin | 621.0 |
| cartpole-swingup | 616.1 |
(5 more tasks in progress on RTX 3090)

PyBullet (Running on RTX 3090)

RLBench (Pending — Docker container required)

Architecture

Standard PPO with:

  • Orthogonal weight initialization
  • GAE (λ=0.95, γ=0.99)
  • Linear LR annealing
  • Gradient clipping (max norm 0.5)
  • Centralized observation/reward normalization (SubprocVecEnv)
  • Our proprietary encoder for Sanskrit-conditioned variants
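
Three of the components listed above (GAE, linear LR annealing, and gradient clipping) can be sketched in plain Python. This is a minimal illustration, not the training code behind these results: the function names `compute_gae`, `linear_lr`, and `clip_grad_norm` are hypothetical, and the initial learning rate in the usage note is an assumed value, since the source does not state one. Only λ=0.95, γ=0.99, and the max norm of 0.5 come from the list.

```python
import math


def compute_gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one rollout of length T.

    rewards[t], values[t], dones[t] describe step t; dones[t] is True when
    the episode ended at step t. last_value is V(s_T), used to bootstrap
    the final step of the rollout.
    """
    T = len(rewards)
    advantages = [0.0] * T
    gae = 0.0
    for t in reversed(range(T)):
        next_value = last_value if t == T - 1 else values[t + 1]
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error; the bootstrap term is dropped at episode ends.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        # Exponentially weighted sum of TD errors, reset at episode ends.
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
    # Value-function targets are advantages plus the value baseline.
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns


def linear_lr(initial_lr, update, total_updates):
    """Learning rate for the 0-indexed `update`, decayed linearly toward zero."""
    return initial_lr * (1.0 - update / total_updates)


def clip_grad_norm(grads, max_norm=0.5):
    """Rescale a flat list of gradients so its global L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads)
    scale = max_norm / total_norm
    return [g * scale for g in grads]
```

For example, `linear_lr(3e-4, 50, 100)` gives half of an assumed initial rate of 3e-4 at the midpoint of training, and `clip_grad_norm([3.0, 4.0])` scales a gradient of norm 5 down to norm 0.5 while keeping its direction.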

Hardware

  • 4× NVIDIA T4 GPUs (Google Cloud): MuJoCo, MetaWorld, CALVIN
  • NVIDIA RTX 3090 (local): DM Control, PyBullet, RLBench

Citation

@misc{paramtatva2026rpposota,
  title={R-PPO-SOTA: Robotics PPO Baselines},
  author={ParamTatva.org},
  year={2026},
  url={https://huggingface.co/ParamTatva/R-PPO-SOTA}
}

License

Apache 2.0 — © 2026 ParamTatva.org