INT8 prefill: 7.7 GB FP32 → 1.9 GB INT8 weight-only; audio_encoder/decoder stay FP32 (ai_edge_quantizer rejects Conv ops). Total bundle 10 GB → 4.3 GB, runtime working set 13 GiB → ~5-6 GiB.
9ec65a0 verified
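
As a rough sanity check on the sizes quoted in the commit message, the sketch below redoes the arithmetic. It assumes weight-only INT8 stores one byte per weight instead of four, ignores scale and metadata overhead, and assumes everything outside the prefill model (including the FP32 audio_encoder/decoder) keeps its size. The function and variable names are hypothetical and do not come from the repository.

```python
# Back-of-envelope check of the sizes quoted in the commit message.
# Assumption: weight-only INT8 shrinks each weight from 4 bytes to 1 byte;
# quantization scales and file metadata are ignored.
FP32_BYTES = 4
INT8_BYTES = 1


def weight_only_int8_size(fp32_gb: float) -> float:
    """Estimate model size in GB after FP32 -> INT8 weight-only quantization."""
    return fp32_gb * INT8_BYTES / FP32_BYTES


prefill_fp32_gb = 7.7   # from the commit message
bundle_fp32_gb = 10.0   # from the commit message

prefill_int8_gb = weight_only_int8_size(prefill_fp32_gb)

# Assumption: everything other than the prefill model is unchanged,
# since audio_encoder/decoder stay FP32.
bundle_int8_gb = bundle_fp32_gb - prefill_fp32_gb + prefill_int8_gb

print(f"prefill estimate: {prefill_int8_gb:.1f} GB")
print(f"bundle estimate:  {bundle_int8_gb:.1f} GB")
```

The estimates land close to the reported 1.9 GB and 4.3 GB; the small gap on the bundle total is plausibly rounding in the quoted figures plus per-tensor scale data, which this sketch does not model.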