mtecnic committed on
Commit 4c938b1 · verified · 1 Parent(s): 51603d0

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +17 -0
README.md CHANGED
@@ -1,3 +1,20 @@
+---
+language:
+- en
+license: other
+tags:
+- moe
+- pruning
+- awq
+- quantized
+- qwen3
+- reap
+- expert-pruning
+base_model: Qwen/Qwen3-Coder-Next
+pipeline_tag: text-generation
+library_name: transformers
+---
+
 # Research Test: Qwen3-Coder-Next-REAP-AWQ
 
 > Expert-pruned and AWQ-quantized Qwen3-Coder-Next using the REAP (Robust Efficient Architecture Pruning) pipeline. 20% of MoE experts removed via diverse-calibration saliency analysis, then quantized to W4A16 for efficient inference on consumer GPUs.
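
Reviewer note: the README summary mentions removing 20% of MoE experts by saliency but does not say how the scores are computed or how experts are selected. As a rough illustration only, the selection step could look like the sketch below. The function name `select_experts_to_prune`, the per-expert score list, and the "drop the lowest-scoring fraction" rule are all assumptions made for this example, not the REAP pipeline's actual code.

```python
def select_experts_to_prune(saliency, prune_fraction=0.2):
    """Return the sorted indices of the lowest-saliency experts.

    `saliency` is a hypothetical list with one score per expert;
    how such scores are derived from calibration data is not
    specified in the README and is out of scope here.
    """
    if not 0.0 <= prune_fraction < 1.0:
        raise ValueError("prune_fraction must be in [0, 1)")
    # Number of experts to drop, rounded down.
    n_prune = int(len(saliency) * prune_fraction)
    # Rank expert indices from least to most salient.
    ranked = sorted(range(len(saliency)), key=lambda i: saliency[i])
    return sorted(ranked[:n_prune])


# Made-up scores for a toy layer with 10 experts.
scores = [0.91, 0.05, 0.40, 0.77, 0.12, 0.63, 0.30, 0.88, 0.02, 0.55]
pruned = select_experts_to_prune(scores)
kept = [i for i in range(len(scores)) if i not in pruned]
print(pruned)  # [1, 8]
print(kept)    # [0, 2, 3, 4, 5, 6, 7, 9]
```

In a real MoE layer the router would also need to be adjusted so that it no longer dispatches tokens to the removed experts; that step is omitted from this sketch.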