# Quantization Matrix (CoreML)
This repository publishes only CoreML artifacts at 8-bit precision or higher.
4-bit variants are excluded for quality reasons.
## Naming rules
The folder name encodes the intended runtime and quantization approach:
- `coreml_*`: generic CoreML export.
- `coreml_ios18_*`: tuned for iOS 18 CoreML runtime.
- `int8`: int8 weights for one or more stages.
- `vocoder_only`: only the vocoder is quantized (per naming).
- `both`: multiple stages are quantized (per naming).
- `compressed` / `linear8`: linear 8-bit compression for a smaller memory footprint.
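
The naming rules above can be decoded mechanically. The following is a minimal sketch, not part of this repository: `describe_variant` and the keys of the dictionary it returns are hypothetical names chosen for illustration, and the function only reports what the folder name implies.

```python
def describe_variant(folder: str) -> dict:
    """Decode a variant folder name according to the naming rules.

    Hypothetical helper: it inspects the name only and never opens the package.
    """
    tokens = folder.split("_")
    if tokens[0] != "coreml":
        raise ValueError(f"not a CoreML variant folder: {folder!r}")

    # Which stages are quantized, as far as the name says.
    if "vocoder_only" in folder:
        scope = "vocoder_only"
    elif "both" in tokens:
        scope = "both"
    else:
        scope = None

    return {
        "runtime": "ios18" if "ios18" in tokens else "generic",
        "int8": "int8" in tokens,
        "linear8": "compressed" in tokens or "linear8" in tokens,
        "scope": scope,
    }


if __name__ == "__main__":
    print(describe_variant("coreml_ios18_int8_vocoder_only"))
```

Because the result is derived from the name alone, it is only as accurate as the naming convention itself.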
## Variant table
| Variant folder | Quantization (by name) | Expected tradeoff | When to use |
| --- | --- | --- | --- |
| `coreml` | full precision (mixed) | best quality, larger | baseline quality checks |
| `coreml_int8` | int8 (all stages) | faster, smaller | general fast inference |
| `coreml_compressed` | linear8 | smallest memory | low-memory devices |
| `coreml_ios18` | full precision (mlprogram) | best quality on iOS 18 | iOS 18+ devices |
| `coreml_ios18_int8_vocoder_only` | int8 (vocoder only) | balanced | iOS 18+ with minimal quality loss |
| `coreml_ios18_int8_both` | int8 (multiple stages) | faster, more loss | iOS 18+ when latency matters |
| `coreml_compressed_ios18` | linear8 (subset) | smallest memory | iOS 18+ with tight memory |
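
The table can also be read as a lookup from target runtime and priority to a folder name. The sketch below assumes four hypothetical goal labels (`quality`, `balanced`, `speed`, `memory`) that are not defined anywhere in this repository; the folder names are the ones listed in the table.

```python
# Keys are (targets_ios18, goal); the goal labels are illustrative assumptions.
VARIANTS = {
    (False, "quality"): "coreml",
    (False, "speed"): "coreml_int8",
    (False, "memory"): "coreml_compressed",
    (True, "quality"): "coreml_ios18",
    (True, "balanced"): "coreml_ios18_int8_vocoder_only",
    (True, "speed"): "coreml_ios18_int8_both",
    (True, "memory"): "coreml_compressed_ios18",
}


def pick_variant(targets_ios18: bool, goal: str) -> str:
    """Return the variant folder for a runtime/goal pair, or raise ValueError."""
    try:
        return VARIANTS[(targets_ios18, goal)]
    except KeyError:
        # The table lists no "balanced" variant for the generic export.
        raise ValueError(
            f"no variant for targets_ios18={targets_ios18}, goal={goal!r}"
        ) from None


if __name__ == "__main__":
    print(pick_variant(True, "balanced"))
```

Raising on a missing combination, rather than silently falling back to another folder, keeps the choice of tradeoff explicit.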
## Steps vs. quality
The `steps` parameter sets the number of denoiser iterations:
- Fewer steps = faster, lower fidelity.
- More steps = slower, higher fidelity.
Recommended starting points:
- **Fast preview:** 10 steps
- **Balanced:** 20 steps
- **Higher quality:** 30 steps
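
One way to keep these starting points in a single place is a small preset table. This is a sketch only: `STEP_PRESETS` and `resolve_steps` are hypothetical names, and how the resulting value is passed to the pipeline depends on the calling code, which is not shown here.

```python
from typing import Optional

# Recommended starting points from the list above.
STEP_PRESETS = {"fast": 10, "balanced": 20, "quality": 30}


def resolve_steps(preset: str = "balanced", override: Optional[int] = None) -> int:
    """Return the denoiser step count for a preset, or a validated override."""
    if override is not None:
        if override < 1:
            raise ValueError("steps must be a positive integer")
        return override
    if preset not in STEP_PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    return STEP_PRESETS[preset]


if __name__ == "__main__":
    print(resolve_steps("fast"))
```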
## Excluded variants
The following are intentionally not published:
- `coreml_ios18_int4_only`
- `coreml_ios18_int4_int8`
- any package with `int4` or `linear4` in its filename
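
The exclusion rule can be enforced with a simple name check before publishing. The sketch below assumes a case-insensitive substring match is sufficient; `is_publishable` is a hypothetical helper, not an existing script in this repository.

```python
import re

# Matches any name containing "int4" or "linear4", regardless of case.
EXCLUDED_PATTERN = re.compile(r"int4|linear4", re.IGNORECASE)


def is_publishable(name: str) -> bool:
    """Return False for any folder or package name marked as 4-bit."""
    return EXCLUDED_PATTERN.search(name) is None


if __name__ == "__main__":
    for candidate in ("coreml_ios18_int8_both", "coreml_ios18_int4_only"):
        print(candidate, is_publishable(candidate))
```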