WF-Champion: Calibrated Spectral Mixed-Precision LLM Quantization

📄 Champion Paper — Complete benchmark with 750-sample profiling
📄 v2 Paper — 2D Wavelet + Hessian + Golden Layer analysis
📄 Ternary Paper — 2-bit / ternary comprehensive comparison
📄 KV Paper — Spectral energy preservation in KV cache

Complete Benchmark (Qwen2.5-0.5B, A10G)

Method	Bits	PPL	Δ PPL	Tok/s	TTFT	Mem	Size	R-1 vs FP16
FP16	16.0	20.12	—	44.9	36ms	1.00G	0.99G	1.000
WF-Champion (NL)	10.0	20.13	+0.0%	44.9	25ms	1.00G	0.72G	0.978
RTN 8-bit	8.0	20.13	+0.0%	44.9	29ms	1.00G	0.63G	0.858
AutoRound 4-bit	4.0	21.63	+7.5%	24.4	45ms	1.04G	0.45G	0.434
WF-Champion (Med)	5.0	22.63	+12.4%	44.8	25ms	1.00G	0.50G	0.501
BnB NF4	4.0	22.83	+13.4%	22.2	83ms	1.53G	0.36G	0.622
RTN 4-bit	4.0	23.95	+19.0%	45.3	25ms	1.00G	0.45G	0.409
RTN 3-bit	3.0	72.40	+260%	45.1	25ms	1.00G	0.41G	0.205
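
The Δ PPL column and the compression and TTFT ratios can be recomputed from the raw columns of the table. The sketch below does this for a few rows; the dictionary layout and the `derived` helper are illustrative names, not part of the benchmark scripts. Because it starts from the rounded table entries, recomputed percentages may differ from the published column in the last digit.

```python
# Raw values copied from the benchmark table (Qwen2.5-0.5B, A10G).
rows = {
    "FP16":              {"ppl": 20.12, "ttft_ms": 36, "size_gb": 0.99},
    "WF-Champion (NL)":  {"ppl": 20.13, "ttft_ms": 25, "size_gb": 0.72},
    "AutoRound 4-bit":   {"ppl": 21.63, "ttft_ms": 45, "size_gb": 0.45},
    "WF-Champion (Med)": {"ppl": 22.63, "ttft_ms": 25, "size_gb": 0.50},
    "BnB NF4":           {"ppl": 22.83, "ttft_ms": 83, "size_gb": 0.36},
    "RTN 4-bit":         {"ppl": 23.95, "ttft_ms": 25, "size_gb": 0.45},
}


def derived(name, base="FP16"):
    """Return metrics for `name` relative to the `base` row."""
    r, b = rows[name], rows[base]
    return {
        # Relative perplexity increase over the baseline, in percent.
        "delta_ppl_pct": 100 * (r["ppl"] - b["ppl"]) / b["ppl"],
        # On-disk compression ratio (baseline size / quantized size).
        "compression": b["size_gb"] / r["size_gb"],
        # Relative TTFT change in percent (negative means faster).
        "ttft_change_pct": 100 * (r["ttft_ms"] - b["ttft_ms"]) / b["ttft_ms"],
    }


for name in rows:
    d = derived(name)
    print(f"{name:18s} dPPL={d['delta_ppl_pct']:+6.1f}%  "
          f"compression={d['compression']:.2f}x  "
          f"TTFT={d['ttft_change_pct']:+6.1f}%")
```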

Key Findings

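1. Near-Lossless: Matches FP16 Within 0.01 PPL

The WF-Champion near-lossless tier (10 bits on average) reaches PPL 20.13 versus 20.12 for FP16, with a ROUGE-1 score of 0.978 against FP16 outputs. It does so at 1.4× compression (0.99G to 0.72G) and 31% lower TTFT (36ms to 25ms).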

2. Medium Beats RTN 4-bit

At ~5 effective bits, WF-Champion medium achieves 5.5% lower PPL than RTN 4-bit (22.63 vs 23.95) while delivering about 2× the throughput of BitsAndBytes NF4 (44.8 vs 22.2 tok/s).

3. 750-Sample Profiler Finds Non-Obvious Golden Layers

Beyond the positional layers (0, 1, 22, 23), the profiler flags layers 4 and 5 as golden because of their high activation kurtosis (25-66×), which indicates outlier token patterns that require high precision.
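
A kurtosis-based flagging rule of this kind can be sketched as follows. This is a minimal illustration on synthetic data, not the profiler in `champion_benchmark.py`: the function names, the input format (one array of sampled activations per layer), and the threshold of 25 are all assumptions made for the example.

```python
import numpy as np


def kurtosis(x):
    """Pearson kurtosis (fourth standardized moment); about 3 for a Gaussian."""
    x = np.asarray(x, dtype=np.float64).ravel()
    centered = x - x.mean()
    variance = np.mean(centered ** 2)
    return float(np.mean(centered ** 4) / variance ** 2)


def flag_golden_layers(activations, positional=(0, 1, 22, 23), threshold=25.0):
    """Return the sorted set of positional layers plus high-kurtosis layers.

    activations: dict mapping layer index -> array of sampled activations.
    threshold:   hypothetical cutoff, chosen here only for illustration.
    """
    golden = set(positional)
    for index, acts in activations.items():
        if kurtosis(acts) >= threshold:
            golden.add(index)
    return sorted(golden)


# Synthetic stand-in for profiled activations: 24 layers of Gaussian noise,
# with a handful of large outliers injected into layers 4 and 5.
rng = np.random.default_rng(0)
acts = {i: rng.standard_normal(20_000) for i in range(24)}
for i in (4, 5):
    acts[i][:20] = 40.0  # heavy-tailed outliers raise the kurtosis sharply

print(flag_golden_layers(acts))
```

A few extreme values are enough to push kurtosis far above the Gaussian baseline, which is why it is a convenient signal for layers whose activations contain rare outliers.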

4. Zero Runtime Overhead

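Unlike BnB NF4 (+131% TTFT vs FP16) or AutoRound (+25% TTFT vs FP16), WF-Champion adds no measured inference overhead: its throughput matches FP16 (44.8-44.9 tok/s) and its TTFT is lower (25ms vs 36ms). The quantized weights are standard integer tensors.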
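
To make "standard integer tensors" concrete, the sketch below shows plain round-to-nearest (RTN) quantization, the baseline in the benchmark table, storing weights as `int8` with one floating-point scale per row. It is not the WF-Champion method; the symmetric per-row scheme and the function names are assumptions chosen for illustration.

```python
import numpy as np


def rtn_quantize(weight, bits):
    """Symmetric per-row round-to-nearest quantization (bits <= 8)."""
    qmax = 2 ** (bits - 1) - 1
    # One scale per output row so the largest magnitude maps to qmax.
    scale = np.abs(weight).max(axis=1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)  # guard against all-zero rows
    q = np.clip(np.round(weight / scale), -qmax, qmax).astype(np.int8)
    return q, scale.astype(np.float32)


def dequantize(q, scale):
    """Recover an approximate float tensor from integer codes and scales."""
    return q.astype(np.float32) * scale


rng = np.random.default_rng(0)
w = rng.standard_normal((8, 64)).astype(np.float32)

for bits in (8, 4, 3):
    q, s = rtn_quantize(w, bits)
    err = np.abs(dequantize(q, s) - w).mean()
    print(f"{bits}-bit: dtype={q.dtype}, mean abs error={err:.4f}")
```

The reconstruction error grows as the bit width shrinks, which matches the direction of the PPL trend for the RTN rows in the table.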

5. AutoRound Wins PPL-per-Bit, WF-Champion Wins Speed+Fidelity

Metric	WF-Champion (Med)	AutoRound 4-bit
PPL	22.63	21.63
Tok/s	44.8 (1.8×)	24.4
TTFT	25ms (1.8×)	45ms
R-1	0.501	0.434

Quality Tiers

Tier	Avg Bits	Golden Layers	Compression	Use Case
Near-lossless	10	@16-bit	1.4×	Maximum quality
High	5	@8-bit	2.0×	Production deployment
Medium	5	@8-bit	2.0×	Fast inference
Low	3.5	@8-bit	2.3×	Maximum compression
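
The "Avg Bits" column is a mean over layers with different precisions. As a worked example, the sketch below computes that mean for a 24-layer model with the six golden layers named in the findings (0, 1, 4, 5, 22, 23). The bit widths for the non-golden layers are not stated in this document; the values used here (8, 4, and 2) are assumptions picked because they reproduce the table's averages, and the calculation further assumes equally sized layers and ignores embeddings.

```python
def average_bits(num_layers, golden_layers, golden_bits, base_bits):
    """Mean bit width when golden layers use golden_bits and the rest base_bits."""
    golden = set(golden_layers)
    total = sum(golden_bits if i in golden else base_bits
                for i in range(num_layers))
    return total / num_layers


NUM_LAYERS = 24                  # implied by layer indices 0..23
GOLDEN = [0, 1, 4, 5, 22, 23]    # positional layers plus layers 4 and 5

# (tier, golden-layer bits from the table, ASSUMED bits for other layers)
tiers = [
    ("Near-lossless", 16, 8),
    ("Medium", 8, 4),
    ("Low", 8, 2),
]

for name, golden_bits, base_bits in tiers:
    avg = average_bits(NUM_LAYERS, GOLDEN, golden_bits, base_bits)
    print(f"{name:14s} golden@{golden_bits}-bit, others@{base_bits}-bit "
          f"-> {avg:.1f} avg bits")
```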

Quick Start

pip install torch transformers accelerate datasets PyWavelets auto-round bitsandbytes rouge-score
python champion_benchmark.py  # Full benchmark (~30 min on A10G)

Files

File	Description
`champion_benchmark.py`	WF-Champion: profiling + mixed precision + full benchmark
`v2_benchmark.py`	WFIQ-SR v2 + golden layer experiments
`ternary_benchmark.py`	2-bit / ternary comparison
`comprehensive_benchmark.py`	KV compression + energy analysis
`tuning_sweep.py`	WaveletFourier hyperparameter sweep
`results/`	All JSON results
