mzio/aprm-sft_genthinkact-ENtextworld_treasure_hunter-GEaprm_qwen3_ap-SE42-RE0-ap1-b0000 Viewer • Updated Jan 21 • 3.2k • 10
mzio/aprm_sft_thought_action_rollouts-ENhotpotqa_mc_gpt5_gen4s_GEaprm_qwen3_ap_SE42_RE4-ap1_best_0020 Viewer • Updated Jan 19 • 1.59k • 11
mzio/aprm_sft_thought_action_rollouts-ENhotpotqa_mc_gpt5_gen4s_GEaprm_qwen3_ap_SE42_RE4-ap1_0019 Viewer • Updated Jan 19 • 1.59k • 11
mzio/aprm_sft_thought_action_rollouts-ENhotpotqa_mc_default_GEaprm_qwen3_ap_SE42_RE5-ap1_best_0040 Viewer • Updated Jan 19 • 12.7k • 10
mzio/aprm_sft_thought_action_rollouts-ENhotpotqa_mc_default_GEaprm_qwen3_ap_SE42_RE5-ap1_0039 Viewer • Updated Jan 19 • 12.7k • 12
mzio/rb_last-cql-mc_oai_gpt5_low-ec_hotpotqa_mc_gpt5-ds_train-spp1-gbs1-s42-r_0-v5649-tools Viewer • Updated Jan 19 • 5.65k • 11
mzio/aprm_sft_thought_action_rollouts-act_prm_browsecomp_100_hide_obs_aprm_qwen3_ap_42_2-ap_rl0020 Viewer • Updated Jan 19 • 9.91k • 28
mzio/aprm_sft_thought_action_rollouts-act_prm_browsecomp_100_hide_obs_aprm_qwen3_ap_42_2-ap_rl0010 Viewer • Updated Jan 19 • 9.91k • 24
mzio/aprm_sft_thought_action_rollouts-act_prm_browsecomp_100_hide_obs_aprm_qwen3_ap_42_2-ap_rl0000 Viewer • Updated Jan 18 • 9.91k • 24
mzio/aprm_sft_thought_action_rollouts-act_prm_browsecomp_100_hide_obs_aprm_qwen3_ap_42_1-ap_rl0000 Viewer • Updated Jan 18 • 9.91k • 23
mzio/aprm_sft-act_prm_browsecomp_100_aprm_qwen3_ap_42_debug-0020 Viewer • Updated Jan 18 • 2.12k • 22
mzio/aprm_sft-act_prm_hotpotqa_mc_250_aprm_qwen3_ap_nobandit_42_0-0070 Viewer • Updated Jan 18 • 4.06k • 6
mzio/aprm_sft-act_prm_hotpotqa_mc_250_aprm_qwen3_ap_nobandit_42_0-0060 Viewer • Updated Jan 18 • 1.02k • 9
mzio/aprm_sft-act_prm_hotpotqa_mc_250_aprm_qwen3_ap_nobandit_42_0-0050 Viewer • Updated Jan 18 • 4.06k • 9