HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-sft-math7k Text Generation • 16B • Updated Aug 17, 2025 • 255 • 2
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-sft-math14k Text Generation • 16B • Updated Aug 17, 2025 • 3 • 1
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-sft-s1K Text Generation • 16B • Updated Aug 17, 2025 • 11 • 1
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-sft-nemotron-code Text Generation • 126k • Updated Aug 17, 2025 • 7 • 1
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-diff-info-Distill-mixture-new 16B • Updated Jul 23, 2025 • 2
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-diff-info-Distill-forward-kl-new 16B • Updated Jul 22, 2025 • 2
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-Distill-6-experts-test-may 3B • Updated Jul 14, 2025 • 1
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-diff-info-Distill-mixture 16B • Updated Jul 10, 2025 • 2
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-diff-info-Distill-forward-kl 16B • Updated Jul 10, 2025 • 1
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-diff-info-Distill-token-specific 16B • Updated Jul 10, 2025 • 1
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-diff-info-Distill-token-specific-scale 16B • Updated Jul 10, 2025 • 5
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-Distill-6-experts-test-new-module Updated Jul 7, 2025
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-Distill-6-experts-token-specific 3B • Updated Jul 1, 2025 • 1
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-Distill-6-experts-token-specific-3-scaled 3B • Updated Jul 1, 2025 • 2
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-Distill-6-experts-test-token-specific-5-epoch 3B • Updated Jun 23, 2025 • 2
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-Distill-6-experts-test-token-specific 3B • Updated Jun 17, 2025 • 2