WF-Champion: Calibrated Spectral Mixed-Precision LLM Quantization
Champion Paper: Complete benchmark with 750-sample profiling
v2 Paper: 2D Wavelet + Hessian + Golden Layer analysis
Ternary Paper: 2-bit / ternary comprehensive comparison
KV Paper: Spectral energy preservation in KV cache
Complete Benchmark (Qwen2.5-0.5B, A10G)
| Method | Bits | PPL | Δ PPL | Tok/s | TTFT | Mem | Size | R-1 vs FP16 |
|---|---|---|---|---|---|---|---|---|
| FP16 | 16.0 | 20.12 | - | 44.9 | 36ms | 1.00G | 0.99G | 1.000 |
| WF-Champion (NL) | 10.0 | 20.13 | +0.0% | 44.9 | 25ms | 1.00G | 0.72G | 0.978 |
| RTN 8-bit | 8.0 | 20.13 | +0.0% | 44.9 | 29ms | 1.00G | 0.63G | 0.858 |
| AutoRound 4-bit | 4.0 | 21.63 | +7.5% | 24.4 | 45ms | 1.04G | 0.45G | 0.434 |
| WF-Champion (Med) | 5.0 | 22.63 | +12.4% | 44.8 | 25ms | 1.00G | 0.50G | 0.501 |
| BnB NF4 | 4.0 | 22.83 | +13.4% | 22.2 | 83ms | 1.53G | 0.36G | 0.622 |
| RTN 4-bit | 4.0 | 23.95 | +19.0% | 45.3 | 25ms | 1.00G | 0.45G | 0.409 |
| RTN 3-bit | 3.0 | 72.40 | +260% | 45.1 | 25ms | 1.00G | 0.41G | 0.205 |
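The Δ PPL and Size columns can be cross-checked against the FP16 row. The sketch below recomputes the relative perplexity change and the size-based compression ratio from the rounded values in the table. Because the inputs are already rounded, a recomputed percentage can differ from the published one by about 0.1 percentage point. The helper names are illustrative and are not taken from the repository.

```python
# Values copied from the benchmark table (FP16 is the reference row).
FP16_PPL = 20.12
FP16_SIZE_GB = 0.99

ROWS = {
    # method: (perplexity, model size in GB)
    "WF-Champion (NL)": (20.13, 0.72),
    "AutoRound 4-bit": (21.63, 0.45),
    "WF-Champion (Med)": (22.63, 0.50),
    "RTN 4-bit": (23.95, 0.45),
}


def delta_ppl_percent(ppl, ref=FP16_PPL):
    """Relative perplexity change versus the reference, in percent."""
    return 100.0 * (ppl - ref) / ref


def compression_ratio(size_gb, ref=FP16_SIZE_GB):
    """How many times smaller the checkpoint is than the reference."""
    return ref / size_gb


for name, (ppl, size_gb) in ROWS.items():
    print(
        f"{name:18s} dPPL={delta_ppl_percent(ppl):+5.1f}%  "
        f"compression={compression_ratio(size_gb):.2f}x"
    )
```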
Key Findings
1. Near-Lossless: On Par with FP16
WF-Champion near-lossless (10-bit average) reaches PPL 20.13 versus 20.12 for FP16, with a ROUGE-1 score of 0.978 against FP16 outputs, at 1.4× compression (0.99G to 0.72G) and 31% lower TTFT (36 ms to 25 ms).
2. Medium Beats RTN 4-bit
At ~5 effective bits, WF-Champion medium achieves 5.5% lower PPL than RTN 4-bit (22.63 vs 23.95) while sustaining 2× the throughput of BitsAndBytes NF4 (44.8 vs 22.2 tok/s).
3. 750-Sample Profiler Finds Non-Obvious Golden Layers
Beyond the positional layers (0, 1, 22, 23), the profiler identifies layers 4-5 as golden due to high activation kurtosis (25-66×), indicating outlier token patterns that require high precision.
4. Zero Runtime Overhead
Unlike BnB NF4 (+131% TTFT vs FP16) or AutoRound (+25% TTFT vs FP16), WF-Champion adds no inference overhead: its quantized weights are standard integer tensors.
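One way to read "standard integer tensors" is plain symmetric round-to-nearest storage: integer codes plus one scale per row, restored with a single multiply. The sketch below illustrates that generic scheme in pure Python. It is not the WF-Champion quantizer, and the function names are illustrative.

```python
def quantize_row(row, bits):
    """Symmetric round-to-nearest: map floats to signed integer codes plus one scale."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8-bit, 7 for 4-bit
    max_abs = max(abs(w) for w in row)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in row]
    return codes, scale


def dequantize_row(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


row = [0.50, -0.25, 0.125, -1.00]
for bits in (8, 4):
    codes, scale = quantize_row(row, bits)
    restored = dequantize_row(codes, scale)
    max_err = max(abs(a - b) for a, b in zip(row, restored))
    print(f"{bits}-bit codes={codes} max_err={max_err:.4f}")
```

The reconstruction error of each weight is bounded by half the scale, which is why lowering the bit width (a larger scale) costs accuracy fastest in rows with large outliers.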
5. AutoRound Wins PPL-per-Bit, WF-Champion Wins Speed+Fidelity
| Metric | WF-Champion (Med) | AutoRound 4-bit |
|---|---|---|
| PPL | 22.63 | 21.63 |
| Tok/s | 44.8 (1.8×) | 24.4 |
| TTFT | 25ms (1.8×) | 45ms |
| R-1 | 0.501 | 0.434 |
Quality Tiers
| Tier | Avg Bits | Golden Layers | Compression | Use Case |
|---|---|---|---|---|
| Near-lossless | 10 | @16-bit | 1.4× | Maximum quality |
| High | 5 | @8-bit | 2.0× | Production deployment |
| Medium | 5 | @8-bit | 2.0× | Fast inference |
| Low | 3.5 | @8-bit | 2.3× | Maximum compression |
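The Avg Bits column can be related to a per-layer bit allocation. The sketch below assumes a 24-layer model in which the six golden layers named in Key Finding 3 (0, 1, 4, 5, 22, 23) get the golden precision and every other layer gets a lower base precision, with all layers weighted equally. The base widths of 8, 4, and 2 bits are guesses that happen to reproduce the tabulated averages. They are not stated in this README, and embeddings and the output head are ignored.

```python
NUM_LAYERS = 24                       # assumed: layers 0..23
GOLDEN_LAYERS = {0, 1, 4, 5, 22, 23}  # golden layers named in Key Finding 3


def average_bits(golden_bits, base_bits):
    """Mean bit width over layers, assuming every layer holds equally many weights."""
    total = sum(
        golden_bits if layer in GOLDEN_LAYERS else base_bits
        for layer in range(NUM_LAYERS)
    )
    return total / NUM_LAYERS


# The base_bits values are assumptions chosen to match the table.
print(average_bits(16, 8))  # 10.0 (Near-lossless)
print(average_bits(8, 4))   # 5.0  (Medium)
print(average_bits(8, 2))   # 3.5  (Low)
```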
Quick Start
pip install torch transformers accelerate datasets PyWavelets auto-round bitsandbytes rouge-score
python champion_benchmark.py # Full benchmark (~30 min on A10G)
Files
| File | Description |
|---|---|
| `champion_benchmark.py` | WF-Champion: profiling + mixed precision + full benchmark |
| `v2_benchmark.py` | WFIQ-SR v2 + golden layer experiments |
| `ternary_benchmark.py` | 2-bit / ternary comparison |
| `comprehensive_benchmark.py` | KV compression + energy analysis |
| `tuning_sweep.py` | WaveletFourier hyperparameter sweep |
| `results/` | All JSON results |