FluffyVoices — ONNX models

ONNX model pack for FluffyVoices, a real-time AI voice changer for Windows (zero-shot voice cloning).

Derived from the X-VC checkpoint by Jerrister Zheng, fine-tuned on Polish speech (Common Voice PL). These weights inherit the licenses of those upstream projects; the FluffyVoices app code itself is MIT.

Usage

Download the entire repository and save it as a folder named models/ next to FluffyVoices.exe (see the app's Releases page).
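After copying the files, a quick layout check can catch an incomplete download. The following is a minimal sketch, not part of FluffyVoices: it assumes the files sit directly inside the models/ folder, uses the file patterns listed in this card, and the helper name `missing_model_files` is hypothetical.

```python
from fnmatch import fnmatch
from pathlib import Path

# Patterns copied from the file list in this card; "assets" is a directory.
EXPECTED_PATTERNS = [
    "sac_encoder.onnx",
    "sac_decoder.onnx",
    "semantic_tokenizer*.onnx",
    "converter*.onnx",
    "speaker_encoder.onnx",
    "pipeline_config.json",
    "assets",
]


def missing_model_files(models_dir):
    """Return the expected patterns that match nothing directly inside models_dir."""
    root = Path(models_dir)
    # A missing folder is treated like an empty one: every pattern is reported.
    names = [p.name for p in root.iterdir()] if root.is_dir() else []
    return [
        pattern
        for pattern in EXPECTED_PATTERNS
        if not any(fnmatch(name, pattern) for name in names)
    ]


if __name__ == "__main__":
    # Assumed location: a models/ folder in the current working directory.
    missing = missing_model_files("models")
    print("missing:", missing if missing else "nothing")
```

An empty result only means that every pattern matched at least one entry; it does not verify that all window variants of the tokenizer and converter are present.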

Contents

sac_encoder.onnx / sac_decoder.onnx — acoustic codec (dynamic time axis)
semantic_tokenizer*.onnx — GLM-4-Voice-based semantic tokenizer (fixed windows: 480/1200/1760/2400 ms)
converter*.onnx — X-VC voice converter (fixed windows, dynamic reference axis)
speaker_encoder.onnx — ERes2Net speaker embedding
pipeline_config.json + assets/ — DSP contract (mel filterbanks, windows)
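Because the semantic tokenizer and converter are exported with fixed windows, a caller has to map each audio chunk onto one of the listed lengths. The sketch below shows one plausible way to do that: pick the smallest listed window that fits, then zero-pad or truncate. It is an illustration only. The function names, the padding strategy, and the 16 kHz sample rate are assumptions, not details taken from FluffyVoices; the actual DSP contract lives in pipeline_config.json.

```python
# Window lengths in milliseconds, as listed for the semantic tokenizer.
WINDOWS_MS = (480, 1200, 1760, 2400)


def pick_window_ms(chunk_ms, windows=WINDOWS_MS):
    """Return the smallest window that fits chunk_ms, or the largest if none does."""
    for window in sorted(windows):
        if chunk_ms <= window:
            return window
    return max(windows)


def fit_to_window(samples, window_ms, sample_rate):
    """Zero-pad or truncate samples to exactly window_ms at sample_rate."""
    target = window_ms * sample_rate // 1000
    trimmed = list(samples[:target])
    return trimmed + [0.0] * (target - len(trimmed))


if __name__ == "__main__":
    sample_rate = 16000   # assumed for illustration; not stated in this card
    chunk = [0.1] * 8000  # 500 ms of audio at the assumed rate
    window = pick_window_ms(1000 * len(chunk) // sample_rate)
    padded = fit_to_window(chunk, window, sample_rate)
    print(window, len(padded))  # prints "1200 19200"
```

Smaller windows mean lower latency but less context per call, which is presumably why several lengths are shipped; which one the app selects at runtime is not stated here.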
