lrboost-80-8B

RL checkpoint (global_step 80) from run rl__56GPU_seqnorm_tis_untrunc_lrboost__exp_rpt_pymethods2test-large, an 8B agentic RL arm of experiment #217.

  • Base model: laion/GLM-4_7-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-131k-fixthink
  • Dataset: DCAgent/exp_rpt_pymethods2test-large
  • Selection: best trailing-5 EMA of reward/avg_raw_reward (EMA=0.5187 at step 80; reached max_steps 80)
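The selection rule above can be sketched as follows. This is a minimal illustration, not the run's actual selection script: it assumes "trailing-5 EMA" means an exponential moving average with span 5 (smoothing factor alpha = 2 / (5 + 1)), and the helper names `ema` and `select_best_step` as well as the reward values in the usage example are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def ema(values: Sequence[float], span: int = 5) -> List[float]:
    """Exponential moving average, seeded with the first value.

    ASSUMPTION: span=5 maps to alpha = 2 / (span + 1); the card does not
    state the exact smoothing convention.
    """
    alpha = 2.0 / (span + 1)
    smoothed: List[float] = []
    current = None
    for v in values:
        current = v if current is None else alpha * v + (1.0 - alpha) * current
        smoothed.append(current)
    return smoothed


def select_best_step(rewards: Dict[int, float], span: int = 5) -> Tuple[int, float]:
    """Return (step, ema_value) for the step with the highest smoothed reward."""
    steps = sorted(rewards)
    smoothed = ema([rewards[s] for s in steps], span=span)
    best = max(range(len(steps)), key=lambda i: smoothed[i])
    return steps[best], smoothed[best]


if __name__ == "__main__":
    # Made-up reward/avg_raw_reward values, for illustration only.
    logged = {20: 0.30, 40: 0.42, 60: 0.47, 80: 0.55}
    step, value = select_best_step(logged)
    print(f"selected global_step {step} (EMA={value:.4f})")
```

Smoothing before taking the maximum keeps a single noisy high-reward step from being chosen over a checkpoint whose reward has been consistently high.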

Training Traces

Training-time Daytona/Harbor rollouts for this run are uploaded as a companion dataset: penfever/lrboost

The dataset contains the last episode of each trial (per make_and_upload_trace_dataset --episodes last), i.e. the same rollouts the policy was trained on after rollback / truncation.
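A minimal sketch for pulling the traces, assuming the companion dataset is readable through the Hugging Face `datasets` library. The helper name `load_traces` is hypothetical, and the split and column names are not documented in this card, so inspect the returned object before relying on any of them.

```python
TRACE_DATASET_ID = "penfever/lrboost"  # repo id as given in this card


def load_traces(repo_id: str = TRACE_DATASET_ID, split=None):
    """Download the trace dataset from the Hugging Face Hub.

    ASSUMPTION: the repo loads with the default `datasets` loader.
    With split=None, all available splits are returned.
    """
    # Imported lazily so this module can be imported without `datasets`.
    from datasets import load_dataset

    return load_dataset(repo_id, split=split)


if __name__ == "__main__":
    traces = load_traces()
    print(traces)  # shows the available splits and columns
```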

Safetensors · Model size: 8B params · Tensor type: BF16