---
license: apache-2.0
base_model:
- neuphonic/neucodec
pipeline_tag: audio-to-audio
---

# NeuCodec ONNX Decoder

This is an ONNX-compiled version of the decoder from [NeuCodec](https://huggingface.co/neuphonic/neucodec).
Its main use case is to provide a low-footprint decoder for on-device TTS.

## Usage

```python
import librosa
import torch
import torchaudio
import soundfile as sf
from torchaudio import transforms as T
from neucodec import NeuCodec, NeuCodecOnnxDecoder

# The full PyTorch model is only used here to encode audio into codes;
# the ONNX decoder turns those codes back into audio.
model = NeuCodec.from_pretrained("neuphonic/neucodec")
model.eval()
compiled_model = NeuCodecOnnxDecoder.from_pretrained("neuphonic/neucodec-onnx-decoder")

# Load the example clip and resample it to 16 kHz if needed.
y, sr = torchaudio.load(librosa.ex("libri1"))
if sr != 16_000:
    y = T.Resample(sr, 16_000)(y)
y = y[None, ...]  # add a batch dimension: (B, 1, T_16)

with torch.no_grad():
    fsq_codes = model.encode_code(y)
    # fsq_codes = model.encode_code(librosa.ex("libri1"))  # or directly pass your filepath!
    print(f"Codes shape: {fsq_codes.shape}")
    recon = compiled_model.decode_code(fsq_codes)  # (B, 1, T_24)

# soundfile expects (frames,) or (frames, channels), so drop the batch and channel dims.
sf.write("reconstructed.wav", recon[0, 0, :], 24_000)
```
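
The shape comments in the example, `T_16` and `T_24`, refer to sample counts at the 16 kHz input rate and the 24 kHz output rate. As a rough sanity check on a reconstruction, the helper below estimates how many output samples to expect for a given input length. It is a hypothetical function, not part of the `neucodec` package, and it assumes the clip duration is preserved end to end. The real decoder may pad or trim to its internal frame size, so treat the result as approximate.

```python
def expected_output_samples(
    num_input_samples: int,
    input_rate: int = 16_000,
    output_rate: int = 24_000,
) -> int:
    """Estimate the output length in samples, assuming duration is preserved.

    Hypothetical helper for sanity checks only; the rates default to the
    16 kHz input and 24 kHz output used in the usage example.
    """
    return round(num_input_samples * output_rate / input_rate)


# One second of 16 kHz input corresponds to about one second of 24 kHz output.
print(expected_output_samples(16_000))  # 24000
```

Comparing this estimate with `recon.shape[-1]` is a quick way to catch a wrong sample rate being passed to `sf.write`.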