AesSedai committed on
Commit 1e997fe · verified · 1 Parent(s): 8af06f0

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -12,7 +12,9 @@ This repo contains specialized MoE-quants for MiniMax-M2.5. The idea being that
  | IQ4_XS | 101.10 GiB (3.80 BPW) | Q8_0 / IQ3_S / IQ3_S / IQ4_XS | 7.513587 ± 0.122746 | +6.0549% | 0.095077 ± 0.002168 |
  | IQ3_S | 78.76 GiB (2.96 BPW) | Q8_0 / IQ2_S / IQ2_S / IQ3_S | 8.284882 ± 0.135705 | +16.9418% | 0.244096 ± 0.004148 |
 
- Provided here as well as a couple of graphs showing the Pareto frontier for KLD and PPL for my quants vs Unsloth. Full graphs of all of the quants are available in the `kld_data` directory.
+ Provided here as well as a couple of graphs showing the Pareto frontier for KLD and PPL for my quants vs Unsloth.
+
+ Full graphs of all of the quants are available in the `kld_data` directory, as well as the raw data broken down per quant as well as a CSV with the collated data.
 
  While the PPL between the quant methods is similar, I feel like the KLD of the quants provided here are slightly better and that these quants will offer better long context performance due to keeping the default type as Q8_0. This comes with a slight performance penalty in PP / TG due to the higher quality quantization but I think the tradeoff is worthwhile.
 
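
For readers unfamiliar with the term used in the README text, a Pareto frontier over (size, KLD) keeps only the quants that no other quant beats on both file size and KLD at once. The following is a minimal Python sketch of that idea, not the script used to produce the graphs in `kld_data`. The function name `pareto_frontier` and the tuple layout are hypothetical, and it assumes the last column of the table rows above is the mean KLD.

```python
from typing import List, Tuple

# (name, size in GiB, mean KLD); lower is better for both size and KLD.
# This tuple layout is an assumption made for illustration.
Quant = Tuple[str, float, float]


def pareto_frontier(quants: List[Quant]) -> List[Quant]:
    """Return the quants not dominated on (size, KLD), ordered by size."""
    frontier: List[Quant] = []
    best_kld = float("inf")
    # Walk from smallest to largest file; a quant stays on the frontier
    # only if it has a lower KLD than every smaller (or equal-size) quant.
    for quant in sorted(quants, key=lambda q: (q[1], q[2])):
        if quant[2] < best_kld:
            frontier.append(quant)
            best_kld = quant[2]
    return frontier


# The two rows visible in this diff; the last column is assumed to be mean KLD.
rows: List[Quant] = [
    ("IQ4_XS", 101.10, 0.095077),
    ("IQ3_S", 78.76, 0.244096),
]

frontier = pareto_frontier(rows)
for name, size_gib, kld in frontier:
    print(f"{name}: {size_gib:.2f} GiB, KLD {kld:.6f}")
```

With only these two rows, both quants sit on the frontier, since the smaller file has the higher KLD. A quant from another source would drop off the frontier only if some quant of equal or smaller size had a lower or equal KLD.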