TongZheng1999/FL_Qwen-3-4B-Instruct-star-mixed_direct-OP-final_v2_10-2-5Rounds-iter-1 Text Generation • 196k • Updated Nov 20, 2025 • 1
TongZheng1999/FL_Qwen-3-4B-Instruct-star-mixed_direct-OP-final_v2_10-2-5Rounds-iter-2 Text Generation • 196k • Updated Nov 20, 2025
TongZheng1999/FL_Qwen-3-4B-Instruct-star-mixed_direct-OP-final_v2_10-2-5Rounds-iter-3 Text Generation • 196k • Updated Nov 20, 2025 • 1
TongZheng1999/FL_Qwen-3-4B-Instruct-star-mixed_direct-OP-final_v2_10-2-5Rounds-iter-4 Text Generation • 196k • Updated Nov 20, 2025
TongZheng1999/FL_Qwen-3-4B-Instruct-star-mixed_direct-OP-final_v2_10-2-5Rounds-iter-5 Text Generation • 196k • Updated Nov 20, 2025 • 2
TongZheng1999/PF_Qwen-3-4B-Instruct-star-mixed_direct-OP-final_v2_10-2-5Rounds-iter-1 Text Generation • 196k • Updated Nov 20, 2025 • 1
TongZheng1999/PF_Qwen-3-4B-Instruct-star-mixed_direct-OP-final_v2_10-2-5Rounds-iter-2 Text Generation • 196k • Updated Nov 20, 2025
TongZheng1999/PF_Qwen-3-4B-Instruct-star-mixed_direct-OP-final_v2_10-2-5Rounds-iter-3 Text Generation • 196k • Updated Nov 20, 2025 • 1
TongZheng1999/PF_Qwen-3-4B-Instruct-star-mixed_direct-OP-final_v2_10-2-5Rounds-iter-4 Text Generation • 196k • Updated Nov 20, 2025 • 1
TongZheng1999/PF_Qwen-3-4B-Instruct-star-mixed_direct-OP-final_v2_10-2-5Rounds-iter-5 Text Generation • 196k • Updated Nov 20, 2025 • 1
Sean13/mistral-7b-instruct-v0.2-cpo-full-label_smoothing-0.1 Text Generation • 266k • Updated Nov 21, 2025 • 1
Sean13/llama-8b-instruct-simpo-full-label_smoothing-0.1 Text Generation • 266k • Updated Nov 21, 2025 • 1
Gabe-Thomp/gemma-sft-bayesian-lr2.0e-06-with-preferences-assistant-only-ONE-EPOCH Text Generation • 606k • Updated Nov 21, 2025
Gabe-Thomp/gemma-sft-BED-LLM-VANILLA-lr2.0e-06_assistant_only Text Generation • 606k • Updated Jan 28 • 1
Gabe-Thomp/gemma-sft-BED-LLM-VANILLA-lr2.0e-07_assistant_only Text Generation • 606k • Updated Jan 30
lihux/gemma-2-2b-it-star-nl-OP-final_v2_10-6-3Rounds-iter-1 Text Generation • 3B • Updated Mar 18 • 1