Sad-tts-v1 (Vocence PromptTTS)
Sad-tts-v1 is a Vocence SN78 miner bundle built on the OmniVoice backbone and published on the Hub as KGSS/Sad-tts-v1. It takes a natural-language instruction plus text and returns a mono WAV via miner.py.
Base weights align with k2-fsa/OmniVoice · Apache-2.0 · omnivoice runtime. Pin commits in miner_deploy_sad_tts_v1.py and VOCENCE_HF.md.
Vocence contract
| Method | Role |
|---|---|
| Miner(path_hf_repo) | Load checkpoint from a directory (or HF snapshot) containing config.json, model.safetensors, tokenizers, and audio_tokenizer/. |
| warmup() | One short synthesis to prime the stack. |
| generate_wav(instruction, text) | Returns (float32 mono ndarray, sample_rate); typically 24 kHz. |
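The contract above can be sketched as a structural interface plus a small output check. This is an illustration under stated assumptions, not the code in miner.py: `MinerLike` and `check_output` are hypothetical names, and only the method names and the `(float32 mono ndarray, sample_rate)` return shape come from the table.

```python
from typing import Any, Protocol, Tuple, runtime_checkable


@runtime_checkable
class MinerLike(Protocol):
    """Hypothetical structural type mirroring the instance methods of the contract.

    The constructor Miner(path_hf_repo) is not part of the protocol because
    protocols describe instances, not how they are built.
    """

    def warmup(self) -> None: ...

    def generate_wav(self, instruction: str, text: str) -> Tuple[Any, int]: ...


def check_output(result: Tuple[Any, int]) -> None:
    """Raise ValueError if result does not look like (float32 mono array, sample_rate)."""
    wav, sample_rate = result
    if not isinstance(sample_rate, int) or sample_rate <= 0:
        raise ValueError(f"sample_rate must be a positive int, got {sample_rate!r}")
    # Duck-typed so the check works on a NumPy array without importing NumPy here.
    if getattr(wav, "ndim", None) != 1:
        raise ValueError("waveform must be one-dimensional (mono)")
    if str(getattr(wav, "dtype", "")) != "float32":
        raise ValueError("waveform must have dtype float32")
```

A local smoke test could call `check_output(m.generate_wav(instruction, text))` after `m.warmup()` to catch shape or dtype regressions before deploying.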
Validators send a free-form instruction. OmniVoice voice-design accepts only whitelisted attribute tags, so miner.py maps keywords (gender, age, pitch, whisper, accent, Chinese dialects) to those tags. Unmatched instructions fall back to runtime.default_instruct in vocence_config.yaml.
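The keyword-to-tag mapping with a default fallback might look roughly like the sketch below. Everything in it is an assumption for illustration: the patterns, the tag strings, the `KEYWORD_TAGS` table, and the `DEFAULT_INSTRUCT` placeholder are hypothetical and do not reproduce the actual OmniVoice whitelist or the logic in miner.py.

```python
import re
from typing import Dict, List

# Hypothetical keyword -> tag table; the real whitelist is defined by OmniVoice
# and the real mapping lives in miner.py.
KEYWORD_TAGS: Dict[str, str] = {
    r"\b(female|woman)\b": "female",
    r"\b(male|man)\b": "male",
    r"\bwhisper(s|ed|ing)?\b": "whisper",
    r"\bbritish\b": "british accent",
}

# Placeholder standing in for runtime.default_instruct from vocence_config.yaml.
DEFAULT_INSTRUCT = "female"


def map_instruction(instruction: str, default: str = DEFAULT_INSTRUCT) -> str:
    """Map a free-form instruction to a comma-separated tag string.

    Returns `default` when no keyword pattern matches.
    """
    text = instruction.lower()
    tags: List[str] = []
    for pattern, tag in KEYWORD_TAGS.items():
        if re.search(pattern, text) and tag not in tags:
            tags.append(tag)
    return ", ".join(tags) if tags else default
```

Word-boundary patterns keep "male" from matching inside "female"; an instruction with no recognized keyword returns the default unchanged.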
Repo layout
| File | Role |
|---|---|
| miner.py | Engine + NL → instruct mapping |
| chute_config.yml | Chutes image (PyTorch cu128 + omnivoice) |
| vocence_config.yaml | Limits, default voice tags, num_step / guidance_scale |
| Weight files | Shipped in this Hub repo (model.safetensors, audio_tokenizer/); see VOCENCE_HF.md |
Local quick check (GPU)
```bash
pip install omnivoice torch torchaudio  # match CUDA index from chute_config.yml
# Copy snapshot: huggingface-cli download k2-fsa/OmniVoice --local-dir ./OmniVoice_weights
python -c "
from pathlib import Path
from miner import Miner
m = Miner(Path('./OmniVoice_weights'))
m.warmup()
w, sr = m.generate_wav('Calm female voice, British accent.', 'Hello from OmniVoice on Vocence.')
print(w.shape, sr)
"
```
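To listen to the result of the quick check, the returned samples can be written to disk. The helper below is a minimal sketch using only the Python standard library; `save_wav` is a hypothetical name and is not part of miner.py. It assumes the samples are mono floats in the range [-1.0, 1.0] and stores them as 16-bit PCM.

```python
import wave
from array import array
from typing import Iterable


def save_wav(path: str, samples: Iterable[float], sample_rate: int) -> None:
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    pcm = array("h")  # signed 16-bit integers
    for s in samples:
        # Clip out-of-range values before scaling to the int16 range.
        clipped = max(-1.0, min(1.0, float(s)))
        pcm.append(int(round(clipped * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())
```

With `w, sr` from the quick check, `save_wav('out.wav', w, sr)` would produce a playable file. The per-sample loop favors clarity over speed; for long clips a vectorized conversion would be faster.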
License
Apache-2.0 for this packaging layout and miner glue; OmniVoice weights and upstream code remain under their stated licenses on the model card.