xw1234gan/Merging_Qwen2.5-1.5B-Instruct_MMLU_lr1e-05_mb2_ga128_n2048_seed42 Text Generation • 2B • Updated Mar 21 • 3
xw1234gan/Merging_Qwen2.5-1.5B-Instruct_MMLU_lr1e-05_mb2_ga4_n16_seed42 Text Generation • 2B • Updated Mar 21
xw1234gan/Fixed_Merging_Qwen2.5-1.5B-Instruct_MMLU_lr1e-05_mb2_ga128_n2048_seed42 Text Generation • 2B • Updated Mar 21 • 2
xw1234gan/Fixed_Merging_Qwen2.5-1.5B-Instruct_MMLU_lr1e-05_mb2_ga4_n16_seed42 Text Generation • 2B • Updated Mar 20 • 5
xw1234gan/GRPO_KL_Qwen2.5-1.5B-Instruct_MMLU_beta0.01_lr1e-05_mb2_ga128_n2048_seed42 Text Generation • 2B • Updated Mar 20
xw1234gan/SMOKE_GRPO_KL_1.5B_Qwen2.5-1.5B-Instruct_MMLU_beta0.01_lr1e-05_mb2_ga4_n16_seed42 Text Generation • 2B • Updated Mar 20
xw1234gan/Merging_Qwen2.5-3B-Instruct_MMLU_lr1e-05_mb2_ga128_n2048_seed42 Text Generation • 3B • Updated Mar 20
xw1234gan/Merging_Qwen2.5-3B-Instruct_MMLU_lr1e-05_mb2_ga4_n16_seed42 Text Generation • 3B • Updated Mar 20
xw1234gan/Fixed_Merging_Qwen2.5-3B-Instruct_MMLU_lr1e-05_mb2_ga128_n2048_seed42 Text Generation • 3B • Updated Mar 20 • 1
xw1234gan/Fixed_Merging_Qwen2.5-3B-Instruct_MMLU_lr1e-05_mb2_ga4_n16_seed42 Text Generation • 3B • Updated Mar 20
xw1234gan/GRPO_KL_Qwen2.5-3B-Instruct_MMLU_beta0.01_lr1e-05_mb2_ga128_n2048_seed42 Text Generation • 3B • Updated Mar 20 • 5
xw1234gan/SMOKE_GRPO_KL_Qwen2.5-3B-Instruct_MMLU_beta0.01_lr1e-05_mb2_ga4_n16_seed42 Text Generation • 3B • Updated Mar 20
xw1234gan/Merging_Qwen2.5-3B-Instruct_MedQA_lr1e-05_mb2_ga128_n2048_seed42 Text Generation • 3B • Updated Mar 17