Experimental global target bits‑per‑weight quantization of Octen/Octen-Embedding-0.6B
- Quantized with a non-standard (forked) llama.cpp branch.
- KLD evaluation and imatrix calibration datasets for GGUF models built with a CLI tool, sourced from eaddario/imatrix-calibration.
- Dataset sources: text_en, text_ru.
- Dataset chunks: 750.
- A small set of patches applied.
- Tensors quantized from F16 instead of BF16, which is friendly to NVIDIA Pascal-architecture GPUs such as the P100.
Many thanks to Ed Addario for his impressive work.
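The pipeline described above (imatrix calibration, quantization, KLD evaluation against the F16 baseline) broadly corresponds to the standard llama.cpp tooling. The sketch below is illustrative only: all file names, and the Q4_K_M quant type, are assumptions rather than the exact commands used for this repo.

```shell
# 1. Build an importance matrix from the calibration text
#    (calibration file name is an assumption).
./llama-imatrix -m Octen-Embedding-0.6B-F16.gguf \
    -f calibration_en_ru.txt -o imatrix.dat

# 2. Quantize using the importance matrix
#    (Q4_K_M is an illustrative quant type, not a repo-specific choice).
./llama-quantize --imatrix imatrix.dat \
    Octen-Embedding-0.6B-F16.gguf Octen-Embedding-0.6B-Q4_K_M.gguf Q4_K_M

# 3. Save the F16 baseline logits, then score the quantized model
#    against them to obtain the KLD / Δp statistics in the table below.
./llama-perplexity -m Octen-Embedding-0.6B-F16.gguf -f test.txt \
    --kl-divergence-base logits.bin
./llama-perplexity -m Octen-Embedding-0.6B-Q4_K_M.gguf \
    --kl-divergence-base logits.bin --kl-divergence
```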
Quantization comparison
| BPW/TGS | PPL correlation | PPL mean ratio | ΔPPL | Mean KLD | Maximum KLD | 99.9% KLD | Mean Δp | RMS Δp |
|---|---|---|---|---|---|---|---|---|
| 3.50 | 85.56% | 1.328184 ± 0.006198 | 113.055840 ± 2.030409 | 1.738910 ± 0.002968 | 19.720037 | 10.812436 | -3.331 ± 0.037 % | 17.941 ± 0.063 % |
| 4.00 | 93.58% | 1.360601 ± 0.004364 | 124.223093 ± 1.754530 | 0.787587 ± 0.001566 | 19.145452 | 6.742295 | -1.688 ± 0.027 % | 12.813 ± 0.052 % |
| 4.50 | 95.88% | 1.235695 ± 0.003178 | 81.194364 ± 1.267961 | 0.492418 ± 0.001110 | 16.026636 | 5.010898 | -1.135 ± 0.022 % | 10.330 ± 0.047 % |
| 5.00 | 97.12% | 1.225914 ± 0.002645 | 77.825006 ± 1.132301 | 0.323054 ± 0.000773 | 14.785572 | 3.622450 | -1.015 ± 0.018 % | 8.707 ± 0.042 % |
| 5.50 | 98.61% | 1.140659 ± 0.001722 | 48.455459 ± 0.754229 | 0.142510 ± 0.000361 | 10.251470 | 1.794577 | -0.500 ± 0.012 % | 5.834 ± 0.031 % |
| 6.00 | 99.03% | 1.095986 ± 0.001385 | 33.066115 ± 0.583789 | 0.089186 ± 0.000256 | 9.821702 | 1.338731 | -0.228 ± 0.010 % | 4.639 ± 0.028 % |
| 6.50 | 99.28% | 1.086861 ± 0.001185 | 29.922486 ± 0.515390 | 0.056230 ± 0.000156 | 9.794716 | 0.742507 | -0.200 ± 0.008 % | 3.695 ± 0.022 % |
| 7.00 | 99.47% | 1.032955 ± 0.000969 | 11.352594 ± 0.361050 | 0.033634 ± 0.000093 | 3.941931 | 0.450176 | -0.041 ± 0.006 % | 2.933 ± 0.020 % |
| 7.50 | 99.53% | 1.015463 ± 0.000891 | 5.326939 ± 0.316756 | 0.024763 ± 0.000071 | 4.088861 | 0.356147 | 0.027 ± 0.005 % | 2.558 ± 0.019 % |
| 8.00 | 99.56% | 1.012318 ± 0.000866 | 4.243431 ± 0.305956 | 0.021680 ± 0.000059 | 1.800071 | 0.296283 | 0.037 ± 0.005 % | 2.380 ± 0.015 % |
| 8.50 | 99.63% | 1.020174 ± 0.000803 | 6.949556 ± 0.292500 | 0.013173 ± 0.000038 | 3.675829 | 0.164360 | -0.008 ± 0.004 % | 1.903 ± 0.015 % |
| 9.00 | 99.64% | 1.017785 ± 0.000789 | 6.126855 ± 0.285191 | 0.011793 ± 0.000035 | 3.644340 | 0.155200 | 0.001 ± 0.004 % | 1.822 ± 0.015 % |
| 9.50 | 99.64% | 1.023754 ± 0.000790 | 8.182888 ± 0.292987 | 0.011307 ± 0.000036 | 3.703608 | 0.149881 | -0.017 ± 0.004 % | 1.799 ± 0.017 % |
| 10.00 | 99.64% | 1.026870 ± 0.000790 | 9.256477 ± 0.297126 | 0.010960 ± 0.000039 | 4.368244 | 0.146243 | -0.023 ± 0.004 % | 1.781 ± 0.017 % |
| 10.50 | 99.65% | 1.031793 ± 0.000791 | 10.952329 ± 0.305078 | 0.010631 ± 0.000033 | 2.633135 | 0.139757 | -0.033 ± 0.004 % | 1.756 ± 0.016 % |
| 11.00 | 99.65% | 1.032131 ± 0.000785 | 11.068814 ± 0.303826 | 0.010088 ± 0.000026 | 1.066854 | 0.125272 | -0.039 ± 0.004 % | 1.698 ± 0.010 % |
| 11.50 | 99.66% | 1.034359 ± 0.000785 | 11.836127 ± 0.307385 | 0.009816 ± 0.000026 | 1.311119 | 0.122771 | -0.039 ± 0.004 % | 1.685 ± 0.010 % |
| 12.00 | 99.66% | 1.033216 ± 0.000782 | 11.442555 ± 0.304820 | 0.009535 ± 0.000028 | 2.683337 | 0.118959 | -0.036 ± 0.004 % | 1.669 ± 0.015 % |
| 12.50 | 99.66% | 1.035454 ± 0.000782 | 12.213620 ± 0.308823 | 0.009296 ± 0.000026 | 1.888097 | 0.117823 | -0.036 ± 0.003 % | 1.653 ± 0.011 % |
| 13.00 | 99.66% | 1.035315 ± 0.000779 | 12.165472 ± 0.307315 | 0.009015 ± 0.000024 | 1.563708 | 0.112765 | -0.037 ± 0.003 % | 1.641 ± 0.012 % |
| 13.50 | 99.66% | 1.035511 ± 0.000777 | 12.233185 ± 0.307197 | 0.008828 ± 0.000027 | 2.020073 | 0.110392 | -0.042 ± 0.003 % | 1.634 ± 0.016 % |
| 14.00 | 99.67% | 1.034812 ± 0.000775 | 11.992309 ± 0.305653 | 0.008529 ± 0.000025 | 1.908343 | 0.105182 | -0.034 ± 0.003 % | 1.617 ± 0.017 % |
| 14.50 | 99.66% | 1.035023 ± 0.000780 | 12.064932 ± 0.307359 | 0.008970 ± 0.000023 | 0.820967 | 0.111618 | -0.034 ± 0.003 % | 1.618 ± 0.010 % |
| 15.00 | 99.66% | 1.035123 ± 0.000777 | 12.099435 ± 0.306609 | 0.008687 ± 0.000022 | 1.018949 | 0.102148 | -0.033 ± 0.003 % | 1.612 ± 0.010 % |
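The KLD and Δp columns compare the quantized model's per-token output distribution against the F16 baseline: Mean/Maximum/99.9% KLD summarize the per-token KL divergence, and Δp is the change in probability assigned to the reference token. A minimal sketch of how these per-token statistics are computed (the logit values are made up for illustration):

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) in nats; p is the baseline (F16) distribution."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits for one token position, from the F16 baseline
# and a quantized model (values are illustrative, not measured).
baseline_logits  = [2.0, 1.0, 0.5, -1.0]
quantized_logits = [1.8, 1.1, 0.4, -0.9]

p = softmax(baseline_logits)
q = softmax(quantized_logits)

# Per-token KLD; the table's Mean KLD averages this over all evaluated
# tokens, while 99.9% KLD is the corresponding upper quantile.
token_kld = kl_divergence(p, q)

# Δp for this token: change in probability of the reference token
# (index 0 here), in percentage points; Mean Δp / RMS Δp aggregate it.
delta_p = (q[0] - p[0]) * 100.0
```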
Model tree for ENOSYS/Octen-Embedding-0.6B-750-v1-GGUF
- Base model: Qwen/Qwen3-0.6B-Base
- Finetuned from it: Qwen/Qwen3-Embedding-0.6B
- Finetuned from that: Octen/Octen-Embedding-0.6B