MMR-GRPO-lambda-0.5 / reward_data

Commit History

Training in progress, step 500
97a871f
verified

kangdawei committed on

Training in progress, step 450
3e03734
verified

kangdawei committed on

Training in progress, step 400
a77d33f
verified

kangdawei committed on

Training in progress, step 350
6123195
verified

kangdawei committed on

Training in progress, step 250
748f4dd
verified

kangdawei committed on

Training in progress, step 200
1af649f
verified

kangdawei committed on

Training in progress, step 150
970e43f
verified

kangdawei committed on

Training in progress, step 100
86942cc
verified

kangdawei committed on