Model repository: `aiseosae/minerTTS`

---
license: cc-by-nc-sa-4.0
base_model: Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign
pipeline_tag: text-to-speech
library_name: transformers
language:
- en
tags:
- tts
- qwen3-tts
- voice-design
- prompttts
- vocence
- bittensor
---

Inference uses **`qwen_tts.Qwen3TTSModel`**, loaded from the repo root via `from_pretrained(this_folder)`.

## Layout

| Path | Role |
|------|------|
| `config.json`, weights, tokenizer, codec dirs | Qwen3-TTS snapshot (as shipped by the upstream model card) |
| `miner.py` | Vocence engine: `Miner`, `warmup()`, `generate_wav(instruction, text)` |
| `vocence_config.yaml` | Device, dtype, caps, language |
| `chute_config.yml` | Chutes image / GPU / scaling / TEE |
| `demo.py` | Optional local smoke test (if present) |

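
Before pushing, it can help to confirm that a local snapshot contains the files named in the table. The sketch below is only an illustration: `check_layout` is a hypothetical helper, not part of this repo, and it deliberately skips the weights, tokenizer, and codec directories because their exact names are not listed above.

```python
from pathlib import Path

# File names taken from the layout table. Weights, tokenizer, and codec
# directories are left out because their exact names are not listed there.
REQUIRED_FILES = ("config.json", "miner.py", "vocence_config.yaml", "chute_config.yml")
OPTIONAL_FILES = ("demo.py",)


def check_layout(repo_root: Path) -> dict[str, list[str]]:
    """Report which expected files are absent from a local snapshot (hypothetical helper)."""
    missing = [name for name in REQUIRED_FILES if not (repo_root / name).is_file()]
    absent_optional = [name for name in OPTIONAL_FILES if not (repo_root / name).is_file()]
    return {"missing": missing, "absent_optional": absent_optional}
```

Calling `check_layout(Path("."))` from the repo root returns empty lists when every named file is present.
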
## Vocence API

Validators call your deployed chute with a JSON body shaped like:

```json
{
  "text": "Words to speak.",
  "instruction": "gender: male | pitch: mid | speed: normal | age_group: adult | emotion: neutral | tone: casual | accent: us"
}
```

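
The `instruction` value in the example is a flat string of `key: value` pairs separated by `|`. For local experiments, a pair of small helpers can build and inspect such strings. This is a sketch only: `build_instruction` and `parse_instruction` are hypothetical names, and the miner may simply pass the string to the model verbatim without parsing it.

```python
def build_instruction(attributes: dict[str, str]) -> str:
    """Join attributes into the 'key: value | key: value' form shown in the example."""
    return " | ".join(f"{key}: {value}" for key, value in attributes.items())


def parse_instruction(instruction: str) -> dict[str, str]:
    """Split a 'key: value | key: value' string back into a dict.

    Segments without a colon or with an empty key are skipped.
    """
    parsed: dict[str, str] = {}
    for segment in instruction.split("|"):
        key, sep, value = segment.partition(":")
        if sep and key.strip():
            parsed[key.strip()] = value.strip()
    return parsed
```
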
The miner forwards **`text`** to `generate_voice_design(..., text=...)` and **`instruction`** to `instruct=...`, and takes **`language`** from the config (default: English).

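
The mapping described above can be sketched as a small function that turns a request payload into keyword arguments for `generate_voice_design`. The function name `build_generation_kwargs` and its validation rules are assumptions made for illustration; the actual `miner.py` may handle errors and limits differently.

```python
from typing import Any


def build_generation_kwargs(
    payload: dict[str, Any],
    default_language: str = "english",  # assumed to come from the config
) -> dict[str, str]:
    """Map a validator payload onto generate_voice_design keyword arguments."""
    text = payload.get("text")
    instruction = payload.get("instruction")
    if not isinstance(text, str) or not text.strip():
        raise ValueError("payload needs a non-empty 'text' string")
    if not isinstance(instruction, str):
        raise ValueError("payload needs an 'instruction' string")
    # 'text' is passed through as text=..., 'instruction' becomes instruct=...
    return {"text": text, "instruct": instruction, "language": default_language}
```

With a loaded model, the result could be used as `model.generate_voice_design(**build_generation_kwargs(payload))`.
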
## Configure (`vocence_config.yaml`)

| Area | Keys |
|------|------|
| Runtime | `device_preference` (`cuda` / `cpu`), `dtype` (`bfloat16` / `float32`), `use_flash_attention_2`, `default_language` |
| Generation | `sample_rate` (e.g. 24000), `max_seconds` |
| Limits | `max_text_chars`, `max_instruction_chars`, `default_language` |

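
One way to keep these keys in check is a typed config object with basic validation. The sketch below assumes a flat mapping of keys (the real YAML file may nest them by area), and every default except the allowed `device_preference` and `dtype` values and the 24000 sample rate is a placeholder, not a value taken from this repo. `VocenceConfig` is a hypothetical class.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping


@dataclass(frozen=True)
class VocenceConfig:
    """Hypothetical typed view of vocence_config.yaml (flat layout assumed)."""

    # Runtime
    device_preference: str = "cuda"
    dtype: str = "bfloat16"
    use_flash_attention_2: bool = False
    default_language: str = "english"
    # Generation
    sample_rate: int = 24000
    max_seconds: float = 30.0  # placeholder default
    # Limits
    max_text_chars: int = 500  # placeholder default
    max_instruction_chars: int = 500  # placeholder default

    def __post_init__(self) -> None:
        if self.device_preference not in ("cuda", "cpu"):
            raise ValueError(f"unsupported device_preference: {self.device_preference!r}")
        if self.dtype not in ("bfloat16", "float32"):
            raise ValueError(f"unsupported dtype: {self.dtype!r}")
        for name in ("sample_rate", "max_seconds", "max_text_chars", "max_instruction_chars"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")

    @classmethod
    def from_mapping(cls, raw: Mapping[str, Any]) -> "VocenceConfig":
        """Build a config from a parsed mapping, rejecting unknown keys."""
        known = {f.name for f in fields(cls)}
        unknown = sorted(set(raw) - known)
        if unknown:
            raise ValueError(f"unknown config keys: {unknown}")
        return cls(**raw)
```

The mapping would typically come from `yaml.safe_load` applied to the file contents, using the `pyyaml` package.
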
Warmup runs one short `generate_voice_design` call with a **180 s** timeout.

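
How `miner.py` enforces that timeout is not shown here. One possible approach, sketched below under that caveat, is to run the warmup call in a worker thread and stop waiting after the deadline. `run_with_timeout` is a hypothetical helper; note that a Python thread cannot be killed, so a timed-out call keeps running in the background.

```python
import concurrent.futures
from typing import Callable

WARMUP_TIMEOUT_S = 180.0  # matches the 180 s warmup timeout stated above


def run_with_timeout(fn: Callable[[], object], timeout_s: float = WARMUP_TIMEOUT_S) -> object:
    """Run fn in a worker thread and raise TimeoutError if it exceeds timeout_s."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        raise TimeoutError(f"call did not finish within {timeout_s:g} s") from None
    finally:
        # Do not block on a still-running call; the worker thread is left to finish.
        executor.shutdown(wait=False)
```

For example, `run_with_timeout(miner.warmup)` would wait at most 180 s for the warmup generation.
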
## Local quick test

Install PyTorch (with CUDA if available), then:

```bash
pip install "qwen-tts" pyyaml soundfile numpy
```

```python
from pathlib import Path

import soundfile as sf
from miner import Miner

miner = Miner(Path("."))  # path to the root of this repo
miner.warmup()  # runs one short generate_voice_design call
wave, sr = miner.generate_wav(
    instruction="A calm, clear narrator, neutral US accent.",
    text="Hello, this is a short synthesis check.",
)
# Assumes `wave` is a NumPy array of samples and `sr` is the sample rate in Hz.
sf.write("quick_test.wav", wave, sr)
```

Or load the model class directly from the transformers-style layout:

```python
import soundfile as sf
from qwen_tts import Qwen3TTSModel

model = Qwen3TTSModel.from_pretrained(".")  # or your HF repo id
wavs, sr = model.generate_voice_design(
    text="Hello fellas.",
    instruct="Cute voice.",
    language="english",
)
# Assumes `wavs` holds one waveform per input text, as in the upstream Qwen3-TTS examples.
sf.write("voice_design.wav", wavs[0], sr)
```

Replace `"."` with your HF repo id after upload, e.g. `"your-org/your-repo"`.

## Chutes / Vocence deploy

1. Push this layout to a Hugging Face **model** repo; pin a **commit SHA** for `VOCENCE_REVISION`.
2. Render the canonical Vocence chute script with `VOCENCE_REPO`, `VOCENCE_REVISION`, `VOCENCE_CHUTES_USER`, `VOCENCE_CHUTE_ID`.
3. Run `chutes build … --wait`, then `chutes deploy … --accept-fee`.
4. Commit on chain: `model_name`, `model_revision` (HF SHA), `chute_id` (UUID from Chutes).

The chute **name** must contain **`vocence`** (case-insensitive). See **`miner_sample/MINER_GUIDE.md`** in the Vocence repo.

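
The values committed in step 4 and the naming rule can be sanity-checked locally before anything goes on chain. The sketch below is not part of the Vocence tooling: `validate_commitment` is a hypothetical helper, and the `org/repo` shape for `model_name` is an assumption based on the usual Hugging Face repo id format. It relies on a pinned Hugging Face revision being a full 40-character hexadecimal Git commit SHA.

```python
import re
import uuid

_COMMIT_SHA = re.compile(r"[0-9a-f]{40}")  # full Git commit SHA, lowercase hex


def validate_commitment(
    model_name: str, model_revision: str, chute_id: str, chute_name: str
) -> list[str]:
    """Return a list of problems; an empty list means the values look plausible."""
    problems: list[str] = []
    # Assumption: model_name is a Hugging Face repo id of the form 'org/repo'.
    if model_name.count("/") != 1 or not all(model_name.split("/")):
        problems.append("model_name should look like 'org/repo'")
    if not _COMMIT_SHA.fullmatch(model_revision):
        problems.append("model_revision should be a full 40-character commit SHA")
    try:
        uuid.UUID(chute_id)
    except ValueError:
        problems.append("chute_id should be a UUID")
    if "vocence" not in chute_name.lower():
        problems.append("chute name must contain 'vocence' (case-insensitive)")
    return problems
```

A branch name such as `main` fails the revision check, which is the point of pinning a commit SHA in step 1.
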
## Training / fine-tuning

Fine-tuning is done **outside** Chutes on your own GPU; export a full snapshot compatible with **`Qwen3TTSModel.from_pretrained(...)`**, then replace the weights in this repo layout and push a new revision.

## License

**CC BY-NC-SA 4.0**; see the license file in this repo. Respect upstream Qwen / Alibaba terms for the base checkpoint.