64GPU_base_32b-55-32B
RL checkpoint (global_step 55) from run rl__64GPU_base_32b__exp_rpt_pymethods2test-large, a 32B agentic RL arm of experiment #217.
- Base model: Qwen/Qwen3-32B
- Dataset: DCAgent/exp_rpt_pymethods2test-large
- Selection: the step with the best trailing-5 EMA of reward/avg_raw_reward (EMA = 0.4211 at step 55; the run reached max_steps = 60)
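The selection rule above can be sketched as follows. This is a minimal illustration, not the run's actual selection script: it assumes that "trailing-5" means an EMA with span 5 (smoothing factor 2 / (5 + 1)), and both the helper name `select_best_step` and the toy reward values are hypothetical.

```python
from typing import Dict, Tuple


def select_best_step(rewards: Dict[int, float], span: int = 5) -> Tuple[int, float]:
    """Return (step, ema) for the step whose EMA of the logged reward is highest.

    `rewards` maps global_step -> reward/avg_raw_reward.
    ASSUMPTION: the EMA uses alpha = 2 / (span + 1); the run's own
    definition of "trailing-5 EMA" may differ.
    """
    if not rewards:
        raise ValueError("no reward values logged")

    alpha = 2.0 / (span + 1)
    best_step, best_ema = -1, float("-inf")
    ema = None

    # Walk the steps in order, updating the EMA and tracking its maximum.
    for step in sorted(rewards):
        value = rewards[step]
        ema = value if ema is None else alpha * value + (1.0 - alpha) * ema
        if ema > best_ema:
            best_step, best_ema = step, ema

    return best_step, best_ema


# Toy values for illustration only (not taken from this run).
toy_rewards = {1: 0.0, 2: 0.6, 3: 0.0}
step, ema = select_best_step(toy_rewards)
print(f"best step: {step}, EMA: {ema:.4f}")
```

Selecting on a smoothed reward rather than the raw per-step value makes the choice less sensitive to a single noisy step.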
Training Traces
Training-time Daytona/Harbor rollouts for this run are uploaded as a companion dataset: penfever/64GPU_base_32b
The dataset contains the last episode of each trial (per make_and_upload_trace_dataset --episodes last). These are the same rollouts the policy was trained on after rollback / truncation.
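A minimal sketch of how the companion dataset might be loaded with the Hugging Face `datasets` library. It assumes the repository is public and loadable without a custom config; the split names and column layout are not documented here, so the code only inspects whatever is returned. The helper name `load_trace_dataset` is hypothetical.

```python
TRACE_REPO = "penfever/64GPU_base_32b"  # companion dataset named in this card


def load_trace_dataset(repo_id: str = TRACE_REPO):
    """Download the trace dataset from the Hugging Face Hub.

    Imported lazily so this module can be imported without `datasets`
    installed. With no `split` argument, load_dataset returns a
    DatasetDict keyed by split name.
    """
    from datasets import load_dataset

    return load_dataset(repo_id)


if __name__ == "__main__":
    traces = load_trace_dataset()
    # Split names and columns are not assumed; print what is there.
    for split_name, split in traces.items():
        print(split_name, split.num_rows, split.column_names)
```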