Mad-tts-v1 (Vocence PromptTTS, English Chatterbox)

Vocence SN78 miner for ResembleAI/chatterbox (English Chatterbox). Validators send English PromptTTS tasks (LibriVox English source audio). The instruction field uses the canonical pipe-separated trait spec from Vocence scoring: gender, pitch, speed, age_group, emotion, tone, and accent (with accent values like us / uk / au / in / neutral / other). The text field is what gets spoken.

Builtin voice only (conds.pt, no audio_prompt_path): prosody follows the instruction via mapped exaggeration / CFG / temperature / repetition_penalty. Ordinals on gender, accent, and age_group are hard to satisfy without a reference clip, so expect the strongest alignment on emotion, tone, speed, and pitch.
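
As a rough illustration of what "mapped" could mean here, the sketch below parses a pipe-separated instruction and nudges a few generation parameters. It is not this miner's actual mapping: the "key: value" layout inside each pipe segment, the trait values (calm, excited, slow, fast), the helper names parse_traits and traits_to_params, and every numeric value are assumptions made for the example. The defaults are assumed to match the keyword defaults of ChatterboxTTS.generate in recent chatterbox-tts releases; check your installed version.

```python
from typing import Dict

# Assumed defaults (believed to match ChatterboxTTS.generate; verify locally).
DEFAULTS: Dict[str, float] = {
    "exaggeration": 0.5,
    "cfg_weight": 0.5,
    "temperature": 0.8,
    "repetition_penalty": 1.2,
}


def parse_traits(instruction: str) -> Dict[str, str]:
    """Split 'key: value | key: value' into a lowercase dict.

    Segments without a ':' are ignored, so a free-form instruction
    simply yields an empty dict (and therefore default parameters).
    """
    traits: Dict[str, str] = {}
    for part in instruction.split("|"):
        key, sep, value = part.partition(":")
        if sep and key.strip():
            traits[key.strip().lower()] = value.strip().lower()
    return traits


def traits_to_params(traits: Dict[str, str]) -> Dict[str, float]:
    """Map a few traits to generation kwargs (illustrative values only)."""
    params = dict(DEFAULTS)

    emotion = traits.get("emotion", "neutral")
    if emotion in {"excited", "angry", "happy"}:
        params["exaggeration"] = 0.8   # more expressive delivery
    elif emotion in {"calm", "sad"}:
        params["exaggeration"] = 0.35  # flatter delivery

    speed = traits.get("speed", "normal")
    if speed == "slow":
        params["cfg_weight"] = 0.3     # lower CFG tends to slow pacing
    elif speed == "fast":
        params["cfg_weight"] = 0.6

    return params


if __name__ == "__main__":
    spec = "gender: female | speed: slow | emotion: calm | accent: uk"
    print(traits_to_params(parse_traits(spec)))
```

Traits that have no entry in the mapping (gender, accent, age_group in this sketch) are parsed but leave the parameters untouched, which mirrors the limitation of running without a reference clip.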

Contract

Method                            Role
Miner(path_hf_repo)               Loads ChatterboxTTS.from_local(repo)
warmup()                          Short synthesis
generate_wav(instruction, text)   Returns (float32 mono, sample_rate); 24 kHz

Required files (checkpoint root)

Aligned with chatterbox.tts.ChatterboxTTS.from_local:

  • ve.safetensors, t3_cfg.safetensors, s3gen.safetensors, tokenizer.json, conds.pt
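
A quick pre-flight check for the files listed above might look like the following. The helper name missing_checkpoint_files is hypothetical and not part of the miner; the file list is copied from this section.

```python
from pathlib import Path
from typing import List, Union

# File names taken from the "Required files" list in this README.
REQUIRED_FILES = (
    "ve.safetensors",
    "t3_cfg.safetensors",
    "s3gen.safetensors",
    "tokenizer.json",
    "conds.pt",
)


def missing_checkpoint_files(root: Union[str, Path]) -> List[str]:
    """Return the required file names that are absent under `root`."""
    root = Path(root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoint_files("./Chatterbox-tts-v1")
    print("missing:", missing if missing else "none")
```

Running this before constructing the miner gives a clearer error than a failed load partway through.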

Before pushing your own Hub repo, repack the three .safetensors files so their LFS blobs differ from an unmodified ResembleAI snapshot (subnet duplicate detection). Tensors stay identical; only __metadata__ / header changes:

python patch_chatterbox_safetensors_nonce.py
# or per file: python ../../scripts/patch_safetensors_hf_metadata.py --in-place --nonce ve.safetensors
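
For readers curious why the tensors stay identical, here is a minimal sketch of a header-only repack. It is not the content of the scripts named above. It relies on the documented safetensors layout: an 8-byte little-endian header length, a JSON header, then the raw tensor buffer, with data_offsets measured from the start of that buffer. The function name add_header_nonce and the metadata key "nonce" are assumptions made for the example.

```python
import json
import struct
from pathlib import Path
from typing import Union


def add_header_nonce(path: Union[str, Path], nonce: str) -> None:
    """Rewrite a .safetensors header with a nonce in __metadata__.

    Only the header changes; the tensor byte buffer is copied verbatim,
    and data_offsets stay valid because they are relative to the buffer.
    """
    path = Path(path)
    raw = path.read_bytes()  # whole file in memory; fine for a sketch

    (header_len,) = struct.unpack("<Q", raw[:8])
    header = json.loads(raw[8:8 + header_len].decode("utf-8"))
    payload = raw[8 + header_len:]

    # __metadata__ is a string-to-string map; keep existing entries.
    metadata = dict(header.get("__metadata__") or {})
    metadata["nonce"] = nonce  # hypothetical key name
    header["__metadata__"] = metadata

    new_header = json.dumps(header, separators=(",", ":")).encode("utf-8")
    new_header += b" " * (-len(new_header) % 8)  # pad header to 8 bytes

    path.write_bytes(struct.pack("<Q", len(new_header)) + new_header + payload)


if __name__ == "__main__":
    import uuid

    add_header_nonce("ve.safetensors", uuid.uuid4().hex)
```

Because the header bytes differ, the file hash (and so the LFS blob) differs, while any loader that reads tensors by offset sees the same data. For multi-gigabyte files a streaming copy would be preferable to read_bytes.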

Sync from Hub or your local download, e.g.:

rsync -a --delete /path/to/hf_models/chatterbox/ ./Chatterbox-tts-v1/
# or: huggingface-cli download ResembleAI/chatterbox --local-dir ./Chatterbox-tts-v1

Other Hub files (t3_mtl*.safetensors, etc.) are for multilingual builds; this bundle uses the English path only.

Deploy (Chutes)

From the Vocence repo root, use miner_deploy_mad_tts_v1.py or scripts/publish_mad_tts_v1_to_hf.py to create the Hub repo and pin VOCENCE_REVISION. Chute id: vocence-mad-tts-v1. See VOCENCE_HF.md.

Local quick check (GPU)

pip install chatterbox-tts torch torchaudio  # CUDA wheel from chute_config.yml
python -c "
from pathlib import Path
from miner import Miner
m = Miner(Path('/path/to/chatterbox-weights'))
m.warmup()
w, sr = m.generate_wav('Calm, slightly slow delivery.', 'Hello from Chatterbox on Vocence.')
print(w.shape, sr)
"

License

Miner packaging is MIT-friendly per upstream; see the Chatterbox model card for Resemble AI terms and watermarking.
