jayzou3773/qwen3_5-moe-neuron_structure_drop-p50-s1k-128samples-thinking-sft 19B • Updated 1 day ago • 14
jayzou3773/qwen3_5-moe-expert_drop-pure_gradient_pruning-r128-s1k-128samples-thinking-sft 19B • Updated 1 day ago • 10
jayzou3773/qwen3_5-moe-expert_drop-pure_expert_gradient_pruning-r128-s1k-128samples-thinking-sft 19B • Updated 1 day ago • 11
jayzou3773/qwen3_5-moe-expert_drop-layerwise_pruning-r128-s1k-128samples-thinking-sft 19B • Updated 1 day ago • 11
jayzou3773/qwen3_5-moe-expert_drop-bias_pruning-r128-s1k-128samples-thinking-sft 19B • Updated 1 day ago • 12
jayzou3773/qwen3-moe-neuron_structure_drop-p50-s1k-128samples-thinking-sft 211k • Updated 1 day ago • 1
jayzou3773/qwen3-moe-expert_drop-pure_gradient_pruning-r64-s1k-128samples-thinking-sft 211k • Updated 1 day ago • 10
jayzou3773/qwen3-moe-expert_drop-pure_expert_gradient_pruning-r64-s1k-128samples-thinking-sft 211k • Updated 1 day ago • 11
jayzou3773/qwen3-moe-expert_drop-layerwise_pruning-r64-s1k-128samples-thinking-sft 211k • Updated 1 day ago • 10
jayzou3773/qwen3-moe-expert_drop-bias_pruning-r64-s1k-128samples-thinking-sft 211k • Updated 1 day ago • 14
jayzou3773/qwen3_5-moe-neuron_structure_drop-p50-s1k-128samples-thinking 19B • Updated 6 days ago • 84
jayzou3773/qwen3_5-moe-expert_drop-weight_magnitude_pruning-r128-s1k-128samples-thinking 19B • Updated 6 days ago • 15
jayzou3773/qwen3_5-moe-expert_drop-pure_gradient_pruning-r128-s1k-128samples-thinking 19B • Updated 6 days ago • 69
jayzou3773/qwen3_5-moe-expert_drop-pure_expert_gradient_pruning-r128-s1k-128samples-thinking 19B • Updated 6 days ago • 70
jayzou3773/qwen3_5-moe-expert_drop-layerwise_pruning-r128-s1k-128samples-thinking 19B • Updated 6 days ago • 69
jayzou3773/qwen3_5-moe-expert_drop-bias_pruning-r128-s1k-128samples-thinking 19B • Updated 6 days ago • 926
jayzou3773/qwen3-moe-expert_drop-pure_gradient_pruning-r64-s1k-128samples-thinking 16B • Updated 8 days ago • 106
jayzou3773/qwen3-moe-expert_drop-pure_expert_gradient_pruning-r64-s1k-128samples-thinking 16B • Updated 8 days ago • 128
jayzou3773/qwen3-moe-expert_drop-layerwise_pruning-r64-s1k-128samples-thinking 16B • Updated 8 days ago • 158
jayzou3773/qwen3-moe-expert_drop-bias_pruning-r64-s1k-128samples-thinking 16B • Updated 8 days ago • 156
jayzou3773/qwen3-moe-neuron_structure_drop-p50-s1k-128samples-thinking 16B • Updated 8 days ago • 286
jayzou3773/qwen3_5-moe-expert_drop-weight_magnitude_pruning-r128-s1k-128samples 19B • Updated 11 days ago • 202