WavCochCausalV8192-vocoder-randinit
WavCoch is a causal waveform-to-cochleagram tokenizer by Greta Tuckute and Klemen Kotar.
This repository contains a freshly initialized WavCochV8192CausalConfig model with a bundled, randomly initialized vocoder. All weights are random: they have not been trained and were not loaded from a checkpoint.
Model Details
| Parameter | Value |
|---|---|
| Parameters | ~24.42M |
| Window Size | 1001 |
| Hop Length | 80 |
| Encoder Dim | 512 |
| Vocabulary Size | 8192 |
| Includes Vocoder | True |
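As a rough illustration of what the hop length and vocabulary size in the table imply, the sketch below estimates how many codes a clip would produce and how many bits each code carries. It assumes one code per hop and a 16 kHz sample rate; neither is stated in this card, and the model's actual padding or framing may shift the count slightly. The names `estimate_num_codes` and `SAMPLE_RATE` are hypothetical helpers for this example, not part of the WavCoch API.

```python
import math

HOP_LENGTH = 80        # from the table above
VOCAB_SIZE = 8192      # from the table above
SAMPLE_RATE = 16_000   # ASSUMPTION: the sample rate is not stated in this card


def estimate_num_codes(num_samples: int, hop_length: int = HOP_LENGTH) -> int:
    """Rough code count, ASSUMING one code per hop.

    A partial final hop is rounded up; the real model may pad differently.
    """
    return math.ceil(num_samples / hop_length)


# Each code indexes a vocabulary of 8192 entries, i.e. log2(8192) bits.
bits_per_code = math.log2(VOCAB_SIZE)

# Codes per second and resulting bitrate under the assumed sample rate.
codes_per_second = SAMPLE_RATE / HOP_LENGTH
bits_per_second = codes_per_second * bits_per_code

# Estimated code count for one second of audio under the assumptions above.
print(estimate_num_codes(SAMPLE_RATE))
```

To verify the estimate for a real input, compare it with the length of the tensor that quantize(...) returns.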
Usage
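```python
from transformers import AutoModel

# trust_remote_code=True is required because the model class ships with the repo.
wavcoch = AutoModel.from_pretrained(
    "TuKoResearch/WavCochCausalV8192-vocoder-randinit",
    trust_remote_code=True,
)

# waveform_tensor is an audio tensor that you supply; its expected shape
# and sample rate are not stated in this card.
codes = wavcoch.quantize(waveform_tensor)  # waveform -> discrete codes
coch = wavcoch.decode(codes)               # codes -> cochleagram
audio = wavcoch.decode_audio(codes)        # codes -> waveform via the bundled vocoder
```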
Notes
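This repository bundles a vocoder, so decode_audio(...) can synthesize a waveform directly from codes. Because the weights are randomly initialized, the output is not meaningful audio until the model is trained.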