Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR5e-7_w0.7_linear • Text Generation • 2B • Updated Dec 20, 2025 • 1
Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR5e-7_w0.9_linear • Text Generation • 2B • Updated Dec 20, 2025 • 2
Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.1_linear • Text Generation • 2B • Updated Dec 20, 2025 • 1
Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.3_linear • Text Generation • 2B • Updated Dec 20, 2025 • 2
Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_linear • Text Generation • 2B • Updated Dec 20, 2025 • 2
Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.7_linear • Text Generation • 2B • Updated Dec 20, 2025 • 1
Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.9_linear • Text Generation • 2B • Updated Dec 20, 2025 • 1
Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.1_linear • Text Generation • 2B • Updated Dec 20, 2025 • 2
Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.3_linear • Text Generation • 2B • Updated Dec 20, 2025 • 2
Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_linear • Text Generation • 2B • Updated Dec 20, 2025 • 2
Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.7_linear • Text Generation • 2B • Updated Dec 20, 2025 • 2
Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.9_linear • Text Generation • 2B • Updated Dec 20, 2025 • 1
Zachary1150/merge_linear_cos0.9fmt0.1_MRL4096_ROLLOUT4_LR1e-6 • Text Generation • 2B • Updated Dec 11, 2025 • 2
Zachary1150/merge_linear_cos0.7fmt0.3_MRL4096_ROLLOUT4_LR1e-6 • Text Generation • 2B • Updated Dec 11, 2025 • 2
Zachary1150/merge_linear_cos0.5fmt0.5_MRL4096_ROLLOUT4_LR1e-6 • Text Generation • 2B • Updated Dec 11, 2025 • 2
Zachary1150/merge_linear_cos0.3fmt0.7_MRL4096_ROLLOUT4_LR1e-6 • Text Generation • 2B • Updated Dec 11, 2025 • 1
Zachary1150/merge_linear_cos0.1fmt0.9_MRL4096_ROLLOUT4_LR1e-6 • Text Generation • 2B • Updated Dec 11, 2025 • 2
Zachary1150/merge_linear_len0.9fmt0.1_MRL4096_ROLLOUT4_LR1e-6 • Text Generation • 2B • Updated Dec 11, 2025 • 2
Zachary1150/merge_linear_len0.7fmt0.3_MRL4096_ROLLOUT4_LR1e-6 • Text Generation • 2B • Updated Dec 11, 2025 • 1
Zachary1150/merge_linear_len0.5fmt0.5_MRL4096_ROLLOUT4_LR1e-6 • Text Generation • 2B • Updated Dec 11, 2025 • 1
Zachary1150/merge_linear_len0.3fmt0.7_MRL4096_ROLLOUT4_LR1e-6 • Text Generation • 2B • Updated Dec 11, 2025 • 4
Zachary1150/merge_linear_len0.1fmt0.9_MRL4096_ROLLOUT4_LR1e-6 • Text Generation • 2B • Updated Dec 11, 2025 • 2