HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-1e-4-gamma-share-experts-2nd-epoch-lr-1e-6 Text Generation • 126k • Updated Sep 19, 2025 • 5
HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-1e-4-gamma-share-experts-2nd-epoch-lr-5e-6 Text Generation • 16B • Updated Sep 19, 2025 • 6
HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-1e-4-gamma-share-experts-2nd-epoch-high-bias-expert Text Generation • 16B • Updated Sep 21, 2025 • 6
xd2010/Qwen1.5-MOE-aux-free-sft-math7k-1e-3-gamma-part2-test Text Generation • 14B • Updated Mar 23 • 52
xd2010/DeepSeek-V2-Lite-aux-free-sft-math7k-2ndepoch-1e-4-gamma-4condenser Text Generation • 16B • Updated 27 days ago • 601
xd2010/DeepSeek-V2-Lite-aux-free-sft-math7k-2ndepoch-1e-4-gamma-1condenser Text Generation • 16B • Updated 27 days ago • 616
xd2010/DeepSeek-V2-Lite-aux-free-sft-math7k-2epoch-frozen-router Text Generation • 16B • Updated 27 days ago • 312
xd2010/OLMoE-1B-7B-0125-sft-math7k-2epochs-frozen-router Text Generation • 7B • Updated 27 days ago • 223
xd2010/Qwen1.5-MOE-sft-math7k-sft-2epochs-frozen-router Text Generation • 14B • Updated 27 days ago • 233