Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_linear Text Generation • 2B • Updated Dec 24, 2025 • 2
Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR2e-6_w0.7_linear Text Generation • 2B • Updated Dec 24, 2025 • 2
Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR2e-6_w0.9_linear Text Generation • 2B • Updated Dec 24, 2025 • 1
Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR2e-6_w0.1_linear Text Generation • 2B • Updated Dec 24, 2025 • 2
Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR2e-6_w0.3_linear Text Generation • 2B • Updated Dec 24, 2025 • 2
Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_linear Text Generation • 2B • Updated Dec 24, 2025 • 2
Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR2e-6_w0.7_linear Text Generation • 2B • Updated Dec 24, 2025 • 2
Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR2e-6_w0.9_linear Text Generation • 2B • Updated Dec 24, 2025 • 4
Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.1_linear Text Generation • 2B • Updated Dec 24, 2025 • 2
Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.3_linear Text Generation • 2B • Updated Dec 24, 2025 • 1
Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_linear Text Generation • 2B • Updated Dec 24, 2025 • 2
Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.7_linear Text Generation • 2B • Updated Dec 24, 2025 • 2
Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.9_linear Text Generation • 2B • Updated Dec 24, 2025 • 1
Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR5e-7_w0.1_linear Text Generation • 2B • Updated Dec 20, 2025 • 1
Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR5e-7_w0.3_linear Text Generation • 2B • Updated Dec 20, 2025 • 2
Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_linear Text Generation • 2B • Updated Dec 20, 2025 • 1