vocence_miner_v8

A naturalness-first, prompt-driven TTS model, built on top of magma90909/vocence_miner_v8.

Generate

pip install qwen-tts transformers torch soundfile

from qwen_tts import Qwen3TTSModel
import soundfile as sf

# Load the merged weights from the Hub.
m = Qwen3TTSModel.from_pretrained("magma90909/vocence_miner_v8")

# text is what gets spoken; instruct describes the voice and delivery.
wavs, sr = m.generate_voice_design(
    text="The train to Edinburgh departs from platform four.",
    instruct="A man with a British English accent, calm and natural.",
    language="english",
)
# Save the first generated waveform at the returned sample rate.
sf.write("out.wav", wavs[0], sr)

demo.py walks through three preset prompts.
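For running several prompts in one go, a loop along the following lines may help. This is a minimal sketch and not the contents of demo.py: the preset names and the `render_presets` helper are hypothetical, the instruct strings are examples from this card, and the only model call used is the `generate_voice_design` call shown above.

```python
from typing import Callable, Dict, List

# Hypothetical preset names; the instruct strings are examples from this card.
PRESETS: Dict[str, str] = {
    "calm_british_man": "A British man speaks calmly and naturally.",
    "scottish_woman": "A woman with a Scottish accent, in an everyday speaking tone.",
    "news_anchor": "A British news anchor, professional and neutral, at a brisk steady pace.",
}


def render_presets(model, text: str, write: Callable,
                   presets: Dict[str, str] = PRESETS) -> List[str]:
    """Generate one WAV per preset and return the file names that were written.

    `model` is anything exposing generate_voice_design(text=, instruct=, language=);
    `write` is a callable taking (path, samples, sample_rate), e.g. soundfile.write.
    """
    paths = []
    for name, instruct in presets.items():
        wavs, sr = model.generate_voice_design(
            text=text, instruct=instruct, language="english"
        )
        path = f"{name}.wav"
        write(path, wavs[0], sr)  # keep only the first waveform, as in the snippet above
        paths.append(path)
    return paths
```

With `m` and `sf` from the snippet above, this would be called as `render_presets(m, "The train to Edinburgh departs from platform four.", sf.write)`.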

How to write instruct

The model responds best to subtle, conversational language, not intensifiers like "intensely sad" or "nearly shouting". Stack these elements freely:

Layer            Phrasings
Accent / region  British English, Scottish, Welsh, Northern Irish, Irish, unspecified
Gender           a man, a woman, a British woman
Mood             speaking warmly, softly sad, quietly pleased, with a touch of anger
Persona          bedtime storyteller, soft and warm; news anchor, professional and neutral; meditation guide, soft and serene
Pace             unhurried, brisk steady, naturally measured

Some example prompts that work well:

A British man speaks calmly and naturally.
A woman with a Scottish accent, in an everyday speaking tone.
A man, softly sad, calm and unhurried.
A British news anchor, professional and neutral, at a brisk steady pace.
A clear, neutral voice reading the sentence.
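If instruct strings are assembled in code, the layers can be stacked with a small helper like the one below. This is only a sketch: `build_instruct` and its parameter names are hypothetical, and the ordering and comma-joined phrasing are one reasonable choice modeled on the examples above, not a format the model requires.

```python
from typing import Optional


def build_instruct(speaker: str = "A man",
                   accent: Optional[str] = None,
                   persona: Optional[str] = None,
                   mood: Optional[str] = None,
                   pace: Optional[str] = None) -> str:
    """Join the optional layers into one comma-separated instruct sentence."""
    head = speaker
    if accent:
        # Pick "a" or "an" so that "Irish" reads correctly.
        article = "an" if accent[0].lower() in "aeiou" else "a"
        head = f"{speaker} with {article} {accent} accent"
    # Skip any layer that was left unset.
    parts = [head] + [layer for layer in (persona, mood, pace) if layer]
    return ", ".join(parts) + "."


print(build_instruct("A woman", accent="Scottish", mood="speaking warmly", pace="unhurried"))
```

The result is passed as the `instruct` argument of `generate_voice_design`.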

Best-fit and not-fit

Best at:

  • Natural, everyday English, both US and UK
  • Bedtime storyteller / news anchor / meditation guide style reads
  • Conversational sadness, warmth, mild anger, gentle pleasure

Less suited for:

  • Theatrical / caricatured delivery (loud anger, shouted joy, dramatic sadness)
  • Extreme intensifier prompts ("nearly shouting", "intensely sad"); the model intentionally tones these down
  • Languages other than English
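Since intensifier-heavy prompts get toned down, it can be useful to flag them before generation. The check below is a rough sketch: `find_intensifiers` is a hypothetical helper, and apart from "intensely" and "shouting"/"shouted" (taken from the examples in this card) the word list is an illustrative guess, not something the model defines.

```python
import re
from typing import List

# "intensely", "shouting" and "shouted" echo this card's examples;
# "extremely" is an assumed addition for illustration.
INTENSIFIERS = ("intensely", "shouting", "shouted", "extremely")


def find_intensifiers(instruct: str) -> List[str]:
    """Return the listed intensifier words that occur in the instruct string."""
    lowered = instruct.lower()
    return [
        word for word in INTENSIFIERS
        if re.search(rf"\b{re.escape(word)}\b", lowered)
    ]


print(find_intensifiers("A man, intensely sad, nearly shouting."))
print(find_intensifiers("A man, softly sad, calm and unhurried."))
```

A non-empty result suggests rewording toward the milder phrasings listed under "How to write instruct".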

License: CC BY-NC-SA 4.0 (research and non-commercial use only).

Files

model.safetensors            # merged Talker weights (3.6 GB)
speech_tokenizer/            # Qwen3 12 Hz audio codec (~650 MB)
tokenizer.json + ...         # text tokenizer
config.json + ...            # model configs
miner.py                     # Vocence engine
chute_config.yml             # Chutes build (TEE / pro_6000)
vocence_config.yaml          # runtime knobs
demo.py                      # quick smoke test

The Vocence files make this repo deployable on Bittensor SN78 (Vocence) via the canonical Vocence/Chutes wrapper without modification.
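Before deploying, a local checkout can be checked against the listing above. The sketch below assumes the files sit at the top level of the checkout directory; `missing_files` is a hypothetical helper, and it checks only the names spelled out in the listing, not the entries elided with "...".

```python
from pathlib import Path
from typing import List

# Names taken from the Files listing; elided entries ("+ ...") are not checked.
EXPECTED = [
    "model.safetensors",
    "speech_tokenizer",
    "tokenizer.json",
    "config.json",
    "miner.py",
    "chute_config.yml",
    "vocence_config.yaml",
    "demo.py",
]


def missing_files(repo_dir: str) -> List[str]:
    """Return the expected entries that do not exist under repo_dir."""
    root = Path(repo_dir)
    return [name for name in EXPECTED if not (root / name).exists()]
```

Calling `missing_files(".")` from inside the checkout returns an empty list when every named entry is present.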

Model size: 2B params (Safetensors, BF16)