CosyVoice3 Speech Tokenizer PT Parameter Bundle

This repository stores a PyTorch checkpoint built from the initializers (the stored weight tensors) of the CosyVoice3 speech tokenizer ONNX model.

Source

Important Notes

  • This repository supports loading with trust_remote_code=True.
  • This is a parameter-store style bundle: it holds the converted weights, not a reconstruction of the original training architecture.
  • configuration_onnx_parameter_store.py, modeling_onnx_parameter_store.py, config.json, and onnx_parameter_map.json are all required for AutoModel.from_pretrained(..., trust_remote_code=True).
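
Because loading depends on all four files listed above, it can help to check a local copy of the bundle before calling from_pretrained. The following is a minimal sketch using only the standard library. The helper name missing_bundle_files and the example directory path are hypothetical and are not part of this repository.

```python
from pathlib import Path

# File names are taken from the note above; the helper itself is hypothetical.
REQUIRED_FILES = (
    "configuration_onnx_parameter_store.py",
    "modeling_onnx_parameter_store.py",
    "config.json",
    "onnx_parameter_map.json",
)


def missing_bundle_files(bundle_dir):
    """Return the required file names that are absent from bundle_dir."""
    root = Path(bundle_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumed local path, for illustration only.
    missing = missing_bundle_files("./cosyvoice3-speech-tokenizer-pt")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required bundle files are present.")
```

An empty list means the directory contains every file that the remote-code loading path needs. It does not validate the contents of those files.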

Load Example

from transformers import AutoModel

model = AutoModel.from_pretrained(
    "wookee3/cosyvoice3-speech-tokenizer-pt",
    trust_remote_code=True,
)
print(len(model.weights))

Tokenize via Loaded HF Model

from transformers import AutoModel

model = AutoModel.from_pretrained(
    "wookee3/cosyvoice3-speech-tokenizer-pt",
    trust_remote_code=True,
)
# Downloads speech_tokenizer_v3.onnx from the source repo automatically.
indices = model.tokenize_from_file("/path/to/audio.wav")
print(indices.shape)
print(indices[0].tolist()[:30])

Audio -> Token Sequence Example (ONNX Runtime)

This repo intentionally does not include the ONNX file. The example script downloads speech_tokenizer_v3.onnx from the source repo at runtime.

pip install numpy soundfile onnxruntime huggingface_hub
python audio_to_tokens_example.py --audio /path/to/audio.wav

Or specify a local ONNX path:

python audio_to_tokens_example.py \
  --audio /path/to/audio.wav \
  --onnx /path/to/speech_tokenizer_v3.onnx
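
The two invocations above imply a command-line interface with a required --audio argument and an optional --onnx argument. The sketch below shows one way such an interface could be wired with argparse. It is not the actual audio_to_tokens_example.py, and the function names build_parser and describe_model_source are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the two flags shown in the commands above."""
    parser = argparse.ArgumentParser(
        description="Convert an audio file to speech tokens (CLI sketch)."
    )
    parser.add_argument("--audio", required=True, help="Path to the input audio file.")
    parser.add_argument(
        "--onnx",
        default=None,
        help="Optional local ONNX path; when omitted, the model is downloaded.",
    )
    return parser


def describe_model_source(args):
    """Report which ONNX source the parsed arguments select."""
    if args.onnx is None:
        return "download speech_tokenizer_v3.onnx at runtime"
    return f"use local ONNX file: {args.onnx}"


if __name__ == "__main__":
    print(describe_model_source(build_parser().parse_args()))
```

Defaulting --onnx to None keeps the download path as the fallback, so a local file is used only when one is passed explicitly.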