metadata
tags:
- wavtokenizer
- audio
- tokenizer
- speech
- tts
- codec
- audio-codec
license: mit
library_name: pytorch
WavTokenizer-large-speech-320-v2
WavTokenizer Large Speech Model v2 - 320 dim
Installation
pip install codecplus
Usage
from codecplus import load_codec
from codecplus.utils import load_audio, save_audio
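# Load the input waveform and its sample rate
audio, sr = load_audio('input.wav')
# Load the WavTokenizer codec
tokenizer = load_codec('wav_tokenizer')
# Encode the waveform into discrete tokens, then decode them back to audio
tokens = tokenizer.encode(audio)
output = tokenizer.decode(tokens)
# Save the reconstructed audio at the original sample rate
save_audio(output, 'output.wav', sr)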
Model Details
- Source Repository: Vyvo-Research/WavTokenizer
- Original File: wavtokenizer_large_speech_320_v2.ckpt
- Architecture: WavTokenizer
- License: MIT
Features
- High-quality audio tokenization
- Efficient compression
- Suitable for TTS and audio generation tasks
- Low-latency encoding/decoding
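To make the "efficient compression" point concrete, here is a minimal sketch of how the bitrate of a discrete audio codec can be estimated from its token rate and codebook size. The sample rate (24 kHz), hop length (320 samples), and codebook size (4096) below are illustrative assumptions, not values confirmed by this card; check the checkpoint's own config before relying on them. The helper names are hypothetical and are not part of codecplus.

```python
import math


def tokens_per_second(sample_rate_hz: int, hop_length: int) -> float:
    """Number of token frames produced per second of audio."""
    return sample_rate_hz / hop_length


def bitrate_bps(token_rate: float, codebook_size: int, num_quantizers: int = 1) -> float:
    """Bits per second needed to store the token stream.

    Each token carries log2(codebook_size) bits, and each frame
    holds one token per quantizer.
    """
    return token_rate * num_quantizers * math.log2(codebook_size)


# Illustrative values only (assumptions, not taken from this model card).
SAMPLE_RATE_HZ = 24_000
HOP_LENGTH = 320
CODEBOOK_SIZE = 4096

rate = tokens_per_second(SAMPLE_RATE_HZ, HOP_LENGTH)
bps = bitrate_bps(rate, CODEBOOK_SIZE)
pcm_bps = SAMPLE_RATE_HZ * 16  # uncompressed 16-bit mono PCM for comparison

print(f"{rate:.1f} tokens/s, {bps:.0f} bit/s, "
      f"{pcm_bps / bps:.0f}x smaller than 16-bit PCM")
```

Swapping in the real sample rate, hop length, and codebook size of a given checkpoint gives its actual token rate and bitrate.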
Citation
@misc{wavtokenizer2024,
title={WavTokenizer: Efficient Audio Tokenization},
author={Vyvo Research Team},
year={2024},
url={https://github.com/Vyvo-ai/WavTokenizer}
}