DAC (Descript Audio Codec) 16 kHz - LiteRT (CompiledModel GPU)
Descript Audio Codec running on-device with the LiteRT CompiledModel GPU backend (ML Drift). The convolutional encoder and decoder run on the GPU; the residual vector quantizer (RVQ) runs on the CPU. 43:1 compression (1 s → 12×50 codes), RTF ≈ 0.82 (faster than real time) on Pixel 8a.
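The 43:1 figure can be reproduced with a little arithmetic. The sketch below assumes 16-bit PCM input and 1024-entry codebooks (10 bits per code); neither value is stated in this card, so treat both as assumptions.

```python
import math

SAMPLE_RATE_HZ = 16_000     # from the card: 1 s of audio is 16000 samples
NUM_CODEBOOKS = 12          # from the card
FRAMES_PER_SECOND = 50      # from the card: 12 x 50 codes per second
PCM_BITS_PER_SAMPLE = 16    # assumption: 16-bit PCM source audio
CODEBOOK_SIZE = 1024        # assumption: 1024 entries per codebook

bits_per_code = math.ceil(math.log2(CODEBOOK_SIZE))
pcm_bits_per_second = SAMPLE_RATE_HZ * PCM_BITS_PER_SAMPLE
code_bits_per_second = NUM_CODEBOOKS * FRAMES_PER_SECOND * bits_per_code
compression_ratio = pcm_bits_per_second / code_bits_per_second
samples_per_frame = SAMPLE_RATE_HZ // FRAMES_PER_SECOND

print(f"codes: {code_bits_per_second} bit/s, ratio {compression_ratio:.1f}:1")
print(f"each latent frame covers {samples_per_frame} samples")
```

Under these assumptions the code stream is 6000 bit/s against 256000 bit/s of PCM, which rounds to 43:1.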
Files
- dac_16khz_encoder_fp16.tflite (43 MB): audio[1,1,16000] → latent[1,1024,50], GPU.
- dac_16khz_deconly_zs_fp16.tflite (105 MB): latent[1,1024,50] → audio, GPU.
- dac_rvq.bin (1.2 MB): RVQ weights (12 codebooks) for the CPU quantizer (float32 LE).
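Since dac_rvq.bin is described as float32 LE, its raw values can be inspected with the standard library alone. This is a minimal sketch that assumes the file is a flat run of floats with no header; the tensor order and shapes inside the file are not documented in this card, so check the conversion scripts before slicing it into codebooks. The helper name `read_float32_le` is hypothetical.

```python
import array
import sys


def read_float32_le(path):
    """Read a headerless file of little-endian float32 values (assumed layout)."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) % 4 != 0:
        raise ValueError("file size is not a multiple of 4 bytes")
    values = array.array("f")
    values.frombytes(data)
    if sys.byteorder == "big":
        # The file is little-endian, so swap on big-endian hosts.
        values.byteswap()
    return values
```

Calling `read_float32_le("dac_rvq.bin")` would return the weights as one flat array; `len()` of the result gives the total parameter count.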
Pipeline
audio -> encoder.tflite (GPU) -> z -> RVQ.encode (CPU) -> codes[12,50]
-> RVQ.decode (CPU) -> z_q -> decoder.tflite (GPU) -> audio
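The two CPU steps in the pipeline can be illustrated with a plain residual vector quantizer. This is a simplified sketch, not the repository's quantizer: it works on a single latent frame, uses tiny made-up codebooks, and omits the per-codebook projections and normalization used by the actual DAC quantizer. The names `nearest`, `rvq_encode`, and `rvq_decode` are hypothetical.

```python
def nearest(codebook, vec):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def rvq_encode(codebooks, z):
    """Quantize one frame stage by stage; returns one index per codebook."""
    residual = list(z)
    codes = []
    for cb in codebooks:
        idx = nearest(cb, residual)
        codes.append(idx)
        # Each stage quantizes what the previous stages left over.
        residual = [r - c for r, c in zip(residual, cb[idx])]
    return codes


def rvq_decode(codebooks, codes):
    """Sum the selected entry from every stage to rebuild z_q."""
    z_q = [0.0] * len(codebooks[0][0])
    for cb, idx in zip(codebooks, codes):
        z_q = [a + b for a, b in zip(z_q, cb[idx])]
    return z_q


# Toy data: 2 stages, 2-dimensional latents (the model uses 12 stages).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]],
    [[0.0, 0.0], [0.25, 0.0], [0.0, -0.25]],
]
z = [1.2, 1.0]
codes = rvq_encode(codebooks, z)
z_q = rvq_decode(codebooks, codes)
print(codes, z_q)
```

In the real pipeline this loop would run once per latent frame, giving the codes[12,50] array for one second of audio.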
On-device (Pixel 8a, Tensor G3, verified)
Encoder 367/367 and decoder 398/398 nodes run on the LiteRT GPU delegate (LITERT_CL, 1 partition,
no CPU fallback); warm RTF ~0.82; reconstruction correlation 1.0 vs. PyTorch DAC.
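For readers who want to reproduce the two metrics, here is a minimal sketch. It assumes RTF means processing time divided by audio duration and that "corr" is the Pearson correlation between the reference and reconstructed waveforms; the card does not define either term, and both function names are hypothetical.

```python
import math


def real_time_factor(processing_seconds, audio_seconds):
    """RTF below 1.0 means the audio is processed faster than it plays."""
    return processing_seconds / audio_seconds


def pearson_corr(a, b):
    """Pearson correlation between two equal-length sample sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and the same length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return cov / math.sqrt(var_a * var_b)


# Example: 1 s of audio processed in 0.82 s.
print(real_time_factor(0.82, 1.0))
```

Note that Pearson correlation ignores gain and DC offset, so it is a check on waveform shape rather than on exact sample values.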
Why the split
The decoder's ConvTranspose1d layers are rewritten to a GPU-clean zero-stuff form (the original DAC's
odd stride-5 transposed conv fails converter legalization, and TRANSPOSE_CONV is rejected by Mali).
The RVQ uses EMBEDDING_LOOKUP with int64 indices (also rejected by Mali), so it runs on the CPU.
As a result, the float conv graph stays fully on the GPU.
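The idea behind a zero-stuff rewrite is that a strided transposed convolution equals an ordinary convolution applied to an input with zeros inserted between samples. The sketch below checks that equivalence for the simplest case: one channel, no padding, no bias, and an arbitrary 7-tap kernel. It illustrates the general identity only and is not the repository's conversion code; all function names are hypothetical.

```python
def conv_transpose1d(x, w, stride):
    """Direct transposed convolution: each input sample scatters a scaled kernel."""
    out = [0.0] * ((len(x) - 1) * stride + len(w))
    for i, xi in enumerate(x):
        for k, wk in enumerate(w):
            out[i * stride + k] += xi * wk
    return out


def zero_stuff(x, stride):
    """Insert stride - 1 zeros between consecutive samples."""
    out = [0.0] * ((len(x) - 1) * stride + 1)
    out[::stride] = x
    return out


def conv1d_valid(x, w):
    """Plain stride-1 cross-correlation with no padding."""
    n_out = len(x) - len(w) + 1
    return [sum(x[j + k] * w[k] for k in range(len(w))) for j in range(n_out)]


def conv_transpose1d_zero_stuffed(x, w, stride):
    """Same result using only zero insertion, padding, and a regular conv."""
    pad = [0.0] * (len(w) - 1)
    stuffed = pad + zero_stuff(x, stride) + pad
    return conv1d_valid(stuffed, w[::-1])  # kernel is flipped


x = [1.0, 2.0, 3.0]
w = [1.0, 0.5, 0.25, -1.0, 2.0, 0.0, 1.0]  # arbitrary example kernel
stride = 5
direct = conv_transpose1d(x, w, stride)
stuffed = conv_transpose1d_zero_stuffed(x, w, stride)
print(direct == stuffed)
```

The zero-stuffed form spends extra multiplies on zeros, but it needs only pad and standard convolution ops, which is the kind of graph a GPU delegate is more likely to accept.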
Android sample + conversion/validation scripts: https://github.com/john-rocky/LiteRT-Models/tree/main/dac
License: MIT (Descript DAC).