FluffyVoices - ONNX models

ONNX model pack for FluffyVoices, a real-time AI voice changer for Windows (zero-shot voice cloning).

Derived from the X-VC checkpoint by Jerrister Zheng, fine-tuned on Polish speech (Common Voice PL). These weights inherit the licenses of those upstream projects; the FluffyVoices app code itself is MIT.

Usage

Download the entire repository and place it in a folder named models/ next to FluffyVoices.exe (see the app's Releases page).
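The placement step can also be scripted. The following is a minimal sketch, assuming the repository has already been downloaded to a local folder. The function name `install_models` and both path arguments are hypothetical and not part of the FluffyVoices app.

```python
import shutil
from pathlib import Path


def install_models(downloaded_repo: Path, app_dir: Path) -> Path:
    """Copy a downloaded copy of this repository to <app_dir>/models.

    `app_dir` is assumed to be the folder that contains FluffyVoices.exe.
    """
    target = app_dir / "models"
    shutil.copytree(
        downloaded_repo,
        target,
        # Skip version-control metadata that a git clone would leave behind.
        ignore=shutil.ignore_patterns(".git"),
        # Allow re-running over an older download (requires Python 3.8+).
        dirs_exist_ok=True,
    )
    return target
```

For example, `install_models(Path("Downloads/fluffyvoices-onnx"), Path("C:/Apps/FluffyVoices"))` would create `C:/Apps/FluffyVoices/models`; both paths are placeholders.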

Contents

  • sac_encoder.onnx / sac_decoder.onnx - acoustic codec (dynamic time axis)
  • semantic_tokenizer*.onnx - GLM-4-Voice-based semantic tokenizer (fixed windows: 480/1200/1760/2400 ms)
  • converter*.onnx - X-VC voice converter (fixed windows, dynamic reference axis)
  • speaker_encoder.onnx - ERes2Net speaker embedding
  • pipeline_config.json + assets/ - DSP contract (mel filterbanks, windows)
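
To confirm that a models/ folder is complete, the list above can be turned into a quick check. This is a rough sketch: the helper `missing_model_files` is hypothetical, and because the exact file names behind the wildcard entries are not spelled out here, each pattern is only required to match at least one file.

```python
from pathlib import Path
from typing import List

# Patterns copied from the Contents list. Wildcard entries are treated as
# satisfied by a single match, since the per-window file names are not listed.
EXPECTED_PATTERNS = [
    "sac_encoder.onnx",
    "sac_decoder.onnx",
    "semantic_tokenizer*.onnx",
    "converter*.onnx",
    "speaker_encoder.onnx",
    "pipeline_config.json",
]


def missing_model_files(models_dir: Path) -> List[str]:
    """Return the Contents entries that have no match in models_dir."""
    missing = [
        pattern
        for pattern in EXPECTED_PATTERNS
        if not any(models_dir.glob(pattern))
    ]
    # assets/ is a directory, so it is checked separately from the file globs.
    if not (models_dir / "assets").is_dir():
        missing.append("assets/")
    return missing
```

An empty list means every entry was found. Note that this only checks for presence; it does not validate the ONNX files or the contents of pipeline_config.json.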