Add 23 pre-converted voice GGUFs (DE/EN/FR/IN/IT/JP/KR/NL/PL/PT/SP)

by lokegud - opened May 5

base: refs/heads/main ← from: refs/pr/1

Files changed: +92 -0

Adds the 23 voice GGUFs not currently in this repo, pre-converted from the upstream Microsoft VibeVoice demo voices in microsoft/VibeVoice/demo/voices/streaming_model/.

The repo previously had only voice-en-Carter_man.gguf and voice-en-Emma.gguf. After this PR, all 25 voices ship pre-converted: anyone using vibevoice.cpp can run hf download mudler/vibevoice.cpp-models --local-dir models and have every voice ready, with no torch install or conversion script run required.
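
A quick post-download sanity check, as a minimal sketch: every GGUF file starts with the four ASCII bytes GGUF, so reading the first four bytes of each voice file catches truncated or mis-saved downloads (for example an HTML error page saved under a .gguf name). The helper name find_non_gguf is hypothetical and not part of vibevoice.cpp; the models directory is the --local-dir from the command above.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # the first four bytes of every GGUF file


def find_non_gguf(model_dir):
    """Return names of voice-*.gguf files in model_dir that lack the GGUF magic.

    Hypothetical helper for illustration; it only inspects the header and
    does not validate the rest of the file.
    """
    bad = []
    for path in sorted(Path(model_dir).glob("voice-*.gguf")):
        with path.open("rb") as fh:
            if fh.read(4) != GGUF_MAGIC:
                bad.append(path.name)
    return bad
```

Calling find_non_gguf("models") after the download should return an empty list when every voice file has a valid header.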

Voices added (m = man, w = woman)

EN (4): Davis (m), Frank (m), Grace (w), Mike (m)
DE: Spk0 (m), Spk1 (w)
FR: Spk0 (m), Spk1 (w)
IT: Spk0 (w), Spk1 (m)
SP: Spk0 (w), Spk1 (m)
PT: Spk0 (w), Spk1 (m)
NL: Spk0 (m), Spk1 (w)
PL: Spk0 (m), Spk1 (w)
JP: Spk0 (m), Spk1 (w)
KR: Spk0 (w), Spk1 (m)
IN: Samuel (m)

Total ~176 MB across 23 files.
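
To confirm a local copy contains everything this PR adds, the list above can be turned into expected file names and compared against a directory. This is a sketch under two assumptions that are not stated outright in the PR: language codes are lowercase in file names (as in the existing voice-en-Carter_man.gguf), and (m)/(w) correspond to the _man/_woman suffixes. The names ADDED_VOICES, expected_filenames, and missing_voices are hypothetical.

```python
from pathlib import Path

# Voices added by this PR, copied from the list above: language -> [(name, gender)].
ADDED_VOICES = {
    "en": [("Davis", "m"), ("Frank", "m"), ("Grace", "w"), ("Mike", "m")],
    "de": [("Spk0", "m"), ("Spk1", "w")],
    "fr": [("Spk0", "m"), ("Spk1", "w")],
    "it": [("Spk0", "w"), ("Spk1", "m")],
    "sp": [("Spk0", "w"), ("Spk1", "m")],
    "pt": [("Spk0", "w"), ("Spk1", "m")],
    "nl": [("Spk0", "m"), ("Spk1", "w")],
    "pl": [("Spk0", "m"), ("Spk1", "w")],
    "jp": [("Spk0", "m"), ("Spk1", "w")],
    "kr": [("Spk0", "w"), ("Spk1", "m")],
    "in": [("Samuel", "m")],
}

# Assumption: (m)/(w) map to the upstream "_man"/"_woman" suffixes.
SUFFIX = {"m": "man", "w": "woman"}


def expected_filenames():
    """Build voice-<lang>-<name>_<gender>.gguf for every voice listed above."""
    return [
        f"voice-{lang}-{name}_{SUFFIX[gender]}.gguf"
        for lang, voices in ADDED_VOICES.items()
        for name, gender in voices
    ]


def missing_voices(model_dir):
    """Return the expected file names that are not present in model_dir."""
    present = {p.name for p in Path(model_dir).glob("*.gguf")}
    return [name for name in expected_filenames() if name not in present]
```

The expected list has 23 entries (4 EN, 2 each for nine languages, 1 IN), matching the count in the PR title.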

Conversion details

Script: scripts/convert_voice_to_gguf.py from mudler/vibevoice.cpp (this repo's tooling project)
Source: https://github.com/microsoft/VibeVoice/raw/main/demo/voices/streaming_model/<name>.pt
Smoke test: ran four voices (PL-Spk1, FR-Spk1, JP-Spk1, IN-Samuel) end-to-end with English text and the realtime-0.5B-q8_0 model; all produce valid 24 kHz mono PCM WAVs. The non-English voices speak English with their native accent, as expected: the voice GGUF carries timbre and prosody, while the model handles language.
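
The "valid 24 kHz mono PCM WAV" check from the smoke test can be reproduced with Python's standard wave module. This is a minimal sketch of one way to check the output, not necessarily how the PR author verified it; the function name is_expected_wav is hypothetical.

```python
import wave


def is_expected_wav(path, rate=24000, channels=1):
    """Return True if path is a non-empty PCM WAV with the given rate and channels.

    The wave module only reads PCM data, so a compressed or malformed file
    raises wave.Error (or EOFError for an empty file) and is reported as False.
    """
    try:
        with wave.open(str(path), "rb") as wav:
            return (
                wav.getframerate() == rate
                and wav.getnchannels() == channels
                and wav.getnframes() > 0
            )
    except (wave.Error, EOFError):
        return False
```

This only inspects the header and frame count; listening to the output is still needed to judge voice quality.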

Naming convention

Kept upstream's naming, mapping <lang>-<name>_<gender>.pt to voice-<lang>-<name>_<gender>.gguf. Note that the existing voice-en-Emma.gguf in this repo drops the _woman suffix; if you'd like all files normalized one way (with or without the gender suffix), I'm happy to follow up.
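
The naming rule can be sketched as a small mapping function. The helper names gguf_name and upstream_url are hypothetical and are not part of scripts/convert_voice_to_gguf.py; the keep_gender flag only illustrates the two normalization options mentioned above, and the example file names are inferred from the convention rather than copied from the upstream listing.

```python
import re
from pathlib import PurePosixPath

# Base URL taken from the "Source" line in the conversion details.
UPSTREAM_BASE = (
    "https://github.com/microsoft/VibeVoice/raw/main/demo/voices/streaming_model/"
)


def gguf_name(pt_name, keep_gender=True):
    """Map an upstream <lang>-<name>_<gender>.pt name to its voice-*.gguf name."""
    if not pt_name.endswith(".pt"):
        raise ValueError(f"expected a .pt file name, got {pt_name!r}")
    stem = PurePosixPath(pt_name).stem
    if not keep_gender:
        # Drop a trailing _man/_woman, as the existing voice-en-Emma.gguf does.
        stem = re.sub(r"_(man|woman)$", "", stem)
    return f"voice-{stem}.gguf"


def upstream_url(pt_name):
    """Build the download URL for an upstream voice file."""
    return UPSTREAM_BASE + pt_name
```

For example, gguf_name("en-Emma_woman.pt") gives voice-en-Emma_woman.gguf, while gguf_name("en-Emma_woman.pt", keep_gender=False) gives the name currently in the repo, voice-en-Emma.gguf.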

Licensing

The voices are part of the official Microsoft VibeVoice demo distribution (MIT), and the conversion tooling here is also MIT. The conversion is a straightforward derivative work, so it introduces no new license obligations.

Commit a72a041c: Add 23 pre-converted voice GGUFs (DE/EN/FR/IN/IT/JP/KR/NL/PL/PT/SP)

Ready to merge

This branch is ready to get merged automatically.