tags:
  - wavtokenizer
  - audio
  - tokenizer
  - speech
  - tts
  - codec
  - audio-codec
license: mit
library_name: pytorch

WavTokenizer-large-speech-320-v2

WavTokenizer Large Speech Model v2 - 320 dim

Installation

pip install codecplus

Usage

from codecplus import load_codec
from codecplus.utils import load_audio, save_audio

# Load audio
audio, sr = load_audio('input.wav')

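# Load the WavTokenizer codec, encode the waveform into discrete tokens,
# then decode the tokens back into a waveform
tokenizer = load_codec('wav_tokenizer')
tokens = tokenizer.encode(audio)
output = tokenizer.decode(tokens)

# Save the reconstructed audio, reusing the sample rate returned by load_audio
save_audio(output, 'output.wav', sr)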
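
After a round trip like the one above, it can be useful to sanity-check the reconstruction. The following is a minimal sketch, not part of codecplus: `snr_db` is a hypothetical helper that computes a plain signal-to-noise ratio between two sequences of samples. It assumes both signals have been flattened to 1-D sequences of floats (for example with something like `.flatten().tolist()` if they are tensors, which is an assumption about the returned types) and that they are time-aligned. Neural codecs do not aim for sample-exact reconstruction, so treat the number as a crude check only; perceptual metrics are usually more informative.

```python
import math


def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB between two 1-D sample sequences.

    Hypothetical helper for illustration; not part of codecplus.
    The longer sequence is trimmed so both have the same length.
    """
    n = min(len(reference), len(estimate))
    if n == 0:
        raise ValueError("both signals must contain at least one sample")

    ref = reference[:n]
    est = estimate[:n]

    # Energy of the reference signal and of the reconstruction error
    signal_energy = sum(x * x for x in ref)
    noise_energy = sum((x - y) ** 2 for x, y in zip(ref, est))

    if noise_energy == 0:
        return math.inf   # identical signals
    if signal_energy == 0:
        return -math.inf  # silent reference, non-silent estimate
    return 10.0 * math.log10(signal_energy / noise_energy)


if __name__ == "__main__":
    # Toy signals standing in for the original and decoded audio
    original = [1.0, -1.0, 1.0, -1.0]
    decoded = [0.9, -0.9, 0.9, -0.9]
    print(f"SNR: {snr_db(original, decoded):.2f} dB")
```
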

Model Details

  • Source Repository: Vyvo-Research/WavTokenizer
  • Original File: wavtokenizer_large_speech_320_v2.ckpt
  • Architecture: WavTokenizer
  • License: MIT

Features

  • High-quality audio tokenization
  • Efficient compression
  • Suitable for TTS and audio generation tasks
  • Low-latency encoding/decoding
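
To make the compression claim concrete, the sketch below estimates the bitrate of a single-codebook token stream and compares it with 16-bit PCM. The helper names are hypothetical, and the numbers in the example (24 kHz audio, 75 tokens per second, a 4096-entry codebook) are illustrative inputs only; this card does not state the actual token rate or codebook size of this checkpoint, so substitute the real values before drawing conclusions.

```python
import math


def token_bitrate(tokens_per_second, codebook_size):
    """Bits per second for a single stream of discrete tokens.

    Each token carries log2(codebook_size) bits.
    """
    if tokens_per_second <= 0 or codebook_size < 2:
        raise ValueError("need a positive token rate and codebook_size >= 2")
    return tokens_per_second * math.log2(codebook_size)


def pcm_bitrate(sample_rate, bits_per_sample=16, channels=1):
    """Bits per second for uncompressed PCM audio."""
    return sample_rate * bits_per_sample * channels


def compression_ratio(sample_rate, tokens_per_second, codebook_size):
    """How many times smaller the token stream is than 16-bit mono PCM."""
    return pcm_bitrate(sample_rate) / token_bitrate(tokens_per_second, codebook_size)


if __name__ == "__main__":
    # Illustrative values, NOT confirmed settings of this checkpoint
    sample_rate = 24_000
    tokens_per_second = 75
    codebook_size = 4096

    print(f"token stream: {token_bitrate(tokens_per_second, codebook_size):.0f} bit/s")
    print(f"16-bit PCM:   {pcm_bitrate(sample_rate)} bit/s")
    print(f"ratio:        {compression_ratio(sample_rate, tokens_per_second, codebook_size):.1f}x")
```
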

Citation

@misc{wavtokenizer2024,
  title={WavTokenizer: Efficient Audio Tokenization},
  author={Vyvo Research Team},
  year={2024},
  url={https://github.com/Vyvo-ai/WavTokenizer}
}