jayzou3773/qwen3_5-moe-expert_drop-pure_gradient_pruning-r128-s1k-128samples-thinking 19B • Updated Apr 29 • 1
jayzou3773/qwen3_5-moe-expert_drop-pure_expert_gradient_pruning-r128-s1k-128samples-thinking 19B • Updated Apr 29 • 2
jayzou3773/qwen3_5-moe-expert_drop-layerwise_pruning-r128-s1k-128samples-thinking 19B • Updated Apr 29 • 1
jayzou3773/qwen3_5-moe-expert_drop-bias_pruning-r128-s1k-128samples-thinking 19B • Updated Apr 29 • 2
jayzou3773/qwen3-moe-expert_drop-pure_gradient_pruning-r64-s1k-128samples-thinking 16B • Updated Apr 28 • 2
jayzou3773/qwen3-moe-expert_drop-pure_expert_gradient_pruning-r64-s1k-128samples-thinking 16B • Updated Apr 28 • 1
jayzou3773/qwen3-moe-expert_drop-layerwise_pruning-r64-s1k-128samples-thinking 16B • Updated Apr 28 • 1
jayzou3773/qwen3_5-moe-expert_drop-weight_magnitude_pruning-r128-s1k-128samples 19B • Updated Apr 25 • 1
jayzou3773/qwen3_5-moe-expert_drop-pure_gradient_pruning-r128-s1k-128samples 19B • Updated Apr 25 • 1
jayzou3773/qwen3_5-moe-expert_drop-pure_expert_gradient_pruning-r128-s1k-128samples 19B • Updated Apr 25 • 1
jayzou3773/qwen3-moe-expert_drop-weight_magnitude_pruning-r64-s1k-128samples 16B • Updated Apr 25 • 3
jayzou3773/qwen3-moe-expert_drop-pure_expert_gradient_pruning-r64-s1k-128samples 16B • Updated Apr 25 • 1