
Model Comparison

All Variants

| Model | Latent | Enc | Pred | Params | f32 Size | Quantized | Cos | ESP32 predict | ESP32 encode | Best For |
|---|---|---|---|---|---|---|---|---|---|---|
| baseline | 192 | 6 | 6 | ~14M | 54.6 MB | 10.9 MB (INT8+Q4) | 0.999 | 828 ms | ~10,000 ms | Quality reference |
| slim_48d | 48 | 2 | 2 | ~3M | ~2 MB | ~1 MB | pending | ~300 ms* | ~3,000 ms* | Tiny edge |
| slim_64d | 64 | 3 | 3 | ~5M | ~3 MB | ~2 MB | pending | ~400 ms* | ~4,000 ms* | Small edge |
| slim_96d | 96 | 2 | 3 | ~8M | ~3.5 MB | ~2 MB | pending | ~400 ms* | ~4,000 ms* | Balanced |
| slim_96d | 96 | 4 | 4 | ~10M | 36.8 MB | 9.8 MB (INT8+Q4) | 0.9982 | 583 ms | 6,416 ms | Production |
| slim_128d | 128 | 4 | 4 | ~12M | ~5 MB | ~3 MB | pending | ~500 ms* | ~5,000 ms* | Quality bias |
| slim_192d | 192 | 4 | 4 | ~13M | ~40 MB | ~12 MB | pending | ~600 ms* | ~6,000 ms* | Layer depth |
| hybrid_ALAL | 64 | 4 | 4 | 3.0M | ~12 MB | 3.9 MB (LQ40) | pending | 152 ms | 922 ms* | Max compression |
| elastic | 64 | 4 | 4 | ~10M | ~4 MB | ~2 MB | pending | ~400 ms* | ~4,000 ms* | Truncatable |
| baseline Q4 pred | 192 | 6 | 6 | ~14M | 54.6 MB | 23.6 MB | 0.998 | 828 ms | ~10,000 ms | Large edge |
| WANDA 20% | 192 | 6 | 6 | ~14M | 54.6 MB | 22.0 MB | ~0.99 | ~660 ms* | ~10,000 ms | Pruning research |
| WANDA 40% | 192 | 6 | 6 | ~14M | 54.6 MB | 25.1 MB | ~0.97 | ~500 ms* | ~10,000 ms | Max sparsity |

* Projected — not yet benchmarked on hardware
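
The size columns can be sanity-checked with simple arithmetic: an f32 checkpoint needs roughly 4 bytes per parameter, and the compression ratio is the f32 size divided by the quantized size. The sketch below applies this to the two rows with measured sizes (baseline and the 4-layer slim_96d). The helper names are hypothetical and not part of this repository, and it assumes MB means 10^6 bytes.

```python
def f32_size_mb(params: float) -> float:
    """Approximate f32 checkpoint size in MB (10^6 bytes), at 4 bytes per weight."""
    return params * 4 / 1e6


def compression_ratio(f32_mb: float, quantized_mb: float) -> float:
    """How many times smaller the quantized file is than the f32 file."""
    return f32_mb / quantized_mb


# Sizes taken from the "All Variants" table (measured rows only).
baseline_ratio = compression_ratio(54.6, 10.9)  # baseline, INT8+Q4
slim_ratio = compression_ratio(36.8, 9.8)       # slim_96d (4/4), INT8+Q4

print(f"~14M params as f32: {f32_size_mb(14e6):.1f} MB")
print(f"baseline INT8+Q4:   {baseline_ratio:.2f}x")
print(f"slim_96d INT8+Q4:   {slim_ratio:.2f}x")
```

The baseline ratio comes out near 5x, in line with the INT8+Q4 entry under Quantization Formats; the slim_96d ratio is lower, at about 3.8x.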

Quantization Formats

| Format | Encoder | Predictor | Compression | Quality | Hardware |
|---|---|---|---|---|---|
| f32 | f32 | f32 | 1x | 1.000 | All |
| INT8 | INT8 per-channel | f32 | ~2x | 0.9999 | ESP32, host |
| INT8+Q4 | INT8 per-channel | Q4 block | ~5x | 0.999 | Production |
| Q4 pred only | f32 | Q4 block | ~2x | 0.998 | Large edge |
| Full Q4 | Q4 block | Q4 block | ~6x | 0.93 | Research |
| Ternary | - | {-1,0,+1} | ~8x | ~0.85 | Experimental |
| WANDA pruned | - | Q4 + sparse | ~0.8x | ~0.97 | Research |
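
To make the "INT8 per-channel" and "Q4 block" entries concrete, here is a minimal sketch of both schemes with a cosine-similarity check against the f32 values. It is an illustration under stated assumptions, not the repository's implementation: symmetric scaling, a block size of 32, random Gaussian weights, and all function names are assumptions. It also compares weights directly, whereas the Quality column presumably compares model outputs.

```python
import math
import random


def int8_per_channel_roundtrip(weights):
    """Symmetric INT8 with one scale per row (output channel); returns dequantized rows."""
    result = []
    for row in weights:
        max_abs = max(abs(w) for w in row)
        scale = max_abs / 127 if max_abs > 0 else 1.0
        result.append([max(-127, min(127, round(w / scale))) * scale for w in row])
    return result


def q4_block_roundtrip(values, block_size=32):
    """Symmetric 4-bit with one scale per block of consecutive values (ints in [-7, 7])."""
    result = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        max_abs = max(abs(v) for v in block)
        scale = max_abs / 7 if max_abs > 0 else 1.0
        result.extend(max(-7, min(7, round(v / scale))) * scale for v in block)
    return result


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Synthetic stand-in for a weight matrix: 8 output channels x 64 inputs.
random.seed(0)
weights = [[random.gauss(0.0, 1.0) for _ in range(64)] for _ in range(8)]
flat = [w for row in weights for w in row]

int8_flat = [w for row in int8_per_channel_roundtrip(weights) for w in row]
q4_flat = q4_block_roundtrip(flat)

int8_cos = cosine_similarity(flat, int8_flat)
q4_cos = cosine_similarity(flat, q4_flat)
print(f"INT8 per-channel cos: {int8_cos:.5f}")
print(f"Q4 block cos:         {q4_cos:.5f}")
```

On synthetic weights the ordering matches the table (INT8 closer to f32 than Q4), but the absolute values will differ from the measured figures above.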

Hardware Tiers

| Tier | Model | Format | Size | Use Case |
|---|---|---|---|---|
| ESP32-P4 | hybrid_ALAL, slim_96d | LQ40 | 3.9-9.8 MB | Edge robotics |
| Browser WASM | slim_96d | LQ40 | 9.8 MB | Client-side demos |
| Host CPU | any | safetensors | 2-54 MB | Development |
| FPGA | baseline, slim | Q4 → hardwired | 0 MB (gates) | Custom silicon |
| ASIC | any | Q4 shift-add | 0 MB | Mass production |
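
One way to read the tiers is as a filter over the measured variants: keep whatever fits a storage budget and a latency budget. The sketch below does this with the measured sizes and ESP32 predict latencies from the All Variants table. The `Variant` type, the `fits` helper, and the 10 MB / 600 ms budgets are hypothetical values chosen for illustration, not limits stated by the project.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    name: str
    size_mb: float      # quantized size from the "All Variants" table
    predict_ms: float   # measured ESP32 predict latency


# Rows with measured (not projected) size and predict latency.
MEASURED = [
    Variant("baseline (INT8+Q4)", 10.9, 828),
    Variant("slim_96d (INT8+Q4)", 9.8, 583),
    Variant("hybrid_ALAL (LQ40)", 3.9, 152),
    Variant("baseline Q4 pred", 23.6, 828),
]


def fits(budget_mb, max_predict_ms, variants=MEASURED):
    """Return the variants within both budgets, smallest first."""
    ok = [
        v for v in variants
        if v.size_mb <= budget_mb and v.predict_ms <= max_predict_ms
    ]
    return sorted(ok, key=lambda v: v.size_mb)


# Hypothetical budgets: 10 MB of storage, 600 ms per prediction.
candidates = [v.name for v in fits(10.0, 600)]
print(candidates)
```

With these assumed budgets the filter returns hybrid_ALAL and slim_96d, the same two models listed for the ESP32-P4 tier.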

Pareto Frontier

Size (MB)
    ^
9.8 |                                     ● slim_96d (INT8+Q4)
    |                     ● slim_96d full
  8 |
  7 |
  6 |
  5 |
  4 |    ● hybrid_ALAL (3.9 MB)
    |
  3 |
  2 |
  1 |
    +----------------------------------------> Quality (cos vs f32)
       0.85    0.90    0.95    0.99    1.00
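
A frontier like the one plotted above can be computed directly: a variant is on the frontier if no other variant is both at least as small and at least as accurate. The sketch below uses only rows from the All Variants table that have a measured size and an exact cosine value, so hybrid_ALAL (cos pending) and the WANDA rows (approximate cos) are left out. The function name is hypothetical.

```python
def pareto_frontier(points):
    """points: list of (name, size_mb, cos). Returns names of non-dominated points.

    A point is dominated when another point is no larger and no less accurate,
    and strictly better on at least one of the two axes.
    """
    frontier = []
    for name, size, cos in points:
        dominated = any(
            s <= size and c >= cos and (s < size or c > cos)
            for _, s, c in points
        )
        if not dominated:
            frontier.append(name)
    return frontier


measured = [
    ("baseline (INT8+Q4)", 10.9, 0.999),
    ("slim_96d (INT8+Q4)", 9.8, 0.9982),
    ("baseline Q4 pred", 23.6, 0.998),
]

frontier = pareto_frontier(measured)
print(frontier)  # ['baseline (INT8+Q4)', 'slim_96d (INT8+Q4)']
```

The Q4-predictor-only baseline drops out because the INT8+Q4 baseline is both smaller and closer to f32.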