zjhhhh/qwen2.5_3B_Instruct_multi_gap_beta_1_eta_1e4_step_382_final Text Generation • 3B • Updated Sep 25, 2025
zjhhhh/qwen2.5_3B_Instruct_5selection_multi_beta_1_eta_1e6_step_312_final Text Generation • 3B • Updated Sep 25, 2025
zjhhhh/qwen2.5_3B_Instruct_fixed_gap_beta_1_eta_1e4_step_382_final Text Generation • 3B • Updated Sep 24, 2025
zjhhhh/qwen2.5_3B_Instruct_5selection_fixed_beta_1_eta_1e6_step_312_final Text Generation • 3B • Updated Sep 24, 2025
zjhhhh/qwen2.5_3B_Instruct_5selection_fixed_beta_1_eta_1e5_step_312_final Text Generation • 3B • Updated Sep 24, 2025
zjhhhh/qwen2.5_3B_Instruct_fixed_expand_beta_1_eta_1e4_step_953_final Text Generation • 3B • Updated Sep 23, 2025
zjhhhh/qwen2.5_3B_Instruct_fixed_bn_beta_1_eta_1e4_step_312_final Text Generation • 3B • Updated Sep 23, 2025
zjhhhh/qwen2.5_3B_Instruct_fixed_beta_1_eta_2.5e3_step_312_final Text Generation • 3B • Updated Sep 23, 2025
zjhhhh/qwen2.5_3B_Instruct_reward_beta_1_eta_1e5_step_312_final Text Generation • 3B • Updated Sep 22, 2025
zjhhhh/qwen2.5_3B_Instruct_reward_beta_1_eta_1e4_step_312_final Text Generation • 3B • Updated Sep 22, 2025
zjhhhh/qwen2.5_3B_Instruct_5selection_fixed_beta_1_eta_1e4_step_312_final Text Generation • 3B • Updated Sep 22, 2025
zjhhhh/qwen2.5_3B_Instruct_fixed_beta_1_eta_1e6_step_312_final Text Generation • 3B • Updated Sep 21, 2025
zjhhhh/qwen2.5_3B_Instruct_nocheck_beta_1_eta_1e4_step_312_final Text Generation • 3B • Updated Sep 21, 2025
zjhhhh/qwen2.5_3B_Instruct_nocheck_beta_1_eta_1e4_step_241 Text Generation • 3B • Updated Sep 21, 2025
zjhhhh/qwen2.5_3B_Instruct_nocheck_beta_1_eta_1e4_step_161 Text Generation • 3B • Updated Sep 21, 2025
zjhhhh/qwen2.5_3B_Instruct_nocheck_beta_1_eta_1e4_step_81 Text Generation • 3B • Updated Sep 21, 2025
zjhhhh/qwen2.5_3B_Instruct_multi_beta_1_1e4_step_312_final Text Generation • 3B • Updated Sep 21, 2025
zjhhhh/qwen2.5_3B_Instruct_fixed_beta_1_1e5_step_312_final Text Generation • 3B • Updated Sep 21, 2025
zjhhhh/qwen2.5_3B_Instruct_reward_whole_1e5_step_312_final Text Generation • 3B • Updated Sep 20, 2025
zjhhhh/qwen2.5_3B_Instruct_reward_whole_1e4_step_312_final Text Generation • 3B • Updated Sep 19, 2025