MODUS TTS — Fallout 76 Voice Model

A custom Piper TTS voice model trained on MODUS dialogue from Fallout 76. MODUS is the hyper-intelligent Enclave AI that manages the Whitespring Bunker — cold, precise, and slightly sardonic.

"Far better men than you failed to kill us."

Model Details

Property	Value
Base model	`en_US-lessac-high`
Training epochs	10,000
Training samples	421 voice lines
Sample rate	22,050 Hz
Format	ONNX
Language	English

Audio Samples

Sample 1 — Designation

"Designation: MODUS. My primary function is the preservation of the Enclave's strategic assets."

Sample 2 — Iconic Line

"Far better men than you failed to kill us."

Sample 3 — Assistant Style

"I'm sorry, I didn't quite catch that. Could you please repeat your query? We are listening."

Sample 4 — Connected

"We were once connected to Enclave hubs across the United States. Raven Rock, the Presidential Rig. It is likely we will never see those places again."

Quick Start

1. Install Piper

Piper requires Python 3.9+. Install it with pip:

Linux (Ubuntu/Debian):

pip install piper-tts

Linux (Arch):

pip install piper-tts --break-system-packages

macOS / Windows:

pip install piper-tts

You also need ffmpeg installed on your system:

# Ubuntu/Debian
sudo apt install ffmpeg

# Arch Linux
sudo pacman -S ffmpeg

# macOS (Homebrew)
brew install ffmpeg

# Windows (Chocolatey)
choco install ffmpeg

2. Download the model files

Both files must be in the same folder:

# Linux / macOS
wget https://huggingface.co/petrusilius/modus-tts/resolve/main/modus_10000.onnx
wget https://huggingface.co/petrusilius/modus-tts/resolve/main/modus_10000.onnx.json

Or download them manually from the Files and versions tab above.
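
On systems without wget (Windows, for example), the same two downloads can be sketched with Python's standard library. The helper names `model_urls` and `download_model` are illustrative and not part of Piper; the URLs are the ones from the wget commands above.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base URL and file name taken from the wget commands above.
BASE_URL = "https://huggingface.co/petrusilius/modus-tts/resolve/main"
MODEL_NAME = "modus_10000.onnx"


def model_urls(base_url: str = BASE_URL, model_name: str = MODEL_NAME) -> list[str]:
    """Return the URLs of the model and its JSON config (hypothetical helper)."""
    return [f"{base_url}/{model_name}", f"{base_url}/{model_name}.json"]


def download_model(target_dir: str = ".") -> list[Path]:
    """Download both files into the same folder and return their paths."""
    folder = Path(target_dir)
    folder.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in model_urls():
        destination = folder / url.rsplit("/", 1)[-1]
        urlretrieve(url, destination)  # network access happens here
        saved.append(destination)
    return saved
```

Calling `download_model("models")` would place both files side by side in a `models` folder, which is the layout Piper expects.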

3. Generate speech

echo "We have you now, General." | \
  piper --model "$HOME/Downloads/modus_10000.onnx" \
  --output_file "$HOME/Downloads/output.wav"

Breaking this down:

echo "...": the text you want spoken, piped into piper via |
--model: path to the .onnx file. If you run the command from the same folder as the model, --model modus_10000.onnx is enough. Otherwise specify the full path.
--output_file: where to save the generated .wav audio file

Speed control:

# Slower (more dramatic)
echo "We have you now, General." | piper --model modus_10000.onnx --length_scale 1.3 --output_file output.wav

# Faster
echo "We have you now, General." | piper --model modus_10000.onnx --length_scale 0.8 --output_file output.wav

Pause between sentences:

# Longer pause between sentences (default is 0.2 seconds)
echo "Sentence one. Sentence two." | piper --model modus_10000.onnx --sentence_silence 0.5 --output_file output.wav
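
If you would rather call Piper from a script than from the shell, the commands above can be wrapped roughly as follows. This is a minimal sketch that assumes the `piper` executable is on your PATH; `build_piper_command` and `speak` are hypothetical helper names, and only the flags shown above are used.

```python
import subprocess
from typing import Optional


def build_piper_command(
    model: str,
    output_file: str,
    length_scale: Optional[float] = None,
    sentence_silence: Optional[float] = None,
) -> list[str]:
    """Assemble the piper argument list from the flags shown above."""
    command = ["piper", "--model", model, "--output_file", output_file]
    if length_scale is not None:
        command += ["--length_scale", str(length_scale)]
    if sentence_silence is not None:
        command += ["--sentence_silence", str(sentence_silence)]
    return command


def speak(text: str, model: str, output_file: str, **options) -> None:
    """Run piper with the text on stdin, mirroring `echo "..." | piper ...`."""
    command = build_piper_command(model, output_file, **options)
    # check=True raises CalledProcessError if piper exits with an error.
    subprocess.run(command, input=text.encode("utf-8"), check=True)
```

For example, `speak("We have you now, General.", "modus_10000.onnx", "output.wav", length_scale=1.3)` corresponds to the slower variant above.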

4. Play the audio

Install VLC:

# Ubuntu/Debian
sudo apt install vlc

# Arch Linux
sudo pacman -S vlc

# macOS (Homebrew)
brew install --cask vlc

# Windows
# Download from https://www.videolan.org/vlc/

Play:

# VLC (all platforms)
vlc output.wav

# Linux (ALSA)
aplay output.wav

# macOS
afplay output.wav
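
If playback sounds wrong (too fast, too slow, or silent), it can help to inspect the file header first. The sketch below uses Python's built-in `wave` module; `describe_wav` is a hypothetical helper name. For this model you would expect a sample rate of 22,050 Hz, as listed under Model Details.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read basic header fields from a WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "duration_seconds": frames / rate,
        }
```

Running `describe_wav("output.wav")` returns a dictionary you can print or compare against the expected sample rate.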

Text Input Tips

Piper works with plain text — no SSML support. A few things to keep in mind for best results:

✅ Do:

Write out numbers: "forty two" instead of "42"
Spell out abbreviations: "General" instead of "Gen."
Use ... for natural mid-sentence pauses
Use commas and periods for natural rhythm
Keep sentences reasonably short for best prosody

❌ Avoid:

Sentence-final punctuation, but only if you notice static artifacts (a known Piper issue); otherwise keep it, since periods help the rhythm
Very long unbroken sentences without punctuation
Special characters, emojis, or markdown formatting

For a deeper dive into Piper's text handling, see the official Piper documentation.
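
The tips above can be partly automated before text is handed to Piper. The following is a rough sketch, not part of Piper: the abbreviation table is a made-up example, numbers are only spelled out for standalone integers from 0 to 999, and the ASCII filter is blunt (it removes emojis but also accented letters and typographic dashes).

```python
import re

# Hypothetical abbreviation table; extend it for your own texts.
ABBREVIATIONS = {"Gen.": "General", "Dr.": "Doctor", "Col.": "Colonel"}

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def spell_number(n: int) -> str:
    """Spell out an integer from 0 to 999 in words."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + (" " + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    return ONES[hundreds] + " hundred" + (" " + spell_number(rest) if rest else "")


def prepare_text(text: str) -> str:
    """Apply the tips above: expand abbreviations, spell numbers, strip markup."""
    for short, full in ABBREVIATIONS.items():
        text = text.replace(short, full)
    # Spell out standalone integers of up to three digits. Longer numbers,
    # decimals, and digit groups such as 1,000 are left for manual editing.
    text = re.sub(
        r"(?<!\d)(?<!\d[.,])\d{1,3}(?![.,]?\d)",
        lambda match: spell_number(int(match.group())),
        text,
    )
    # Drop common markdown markers, then anything outside basic ASCII.
    text = re.sub(r"[*_`#~\[\]<>|]", "", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Collapse the whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()
```

Periods, commas, and `...` pass through unchanged, so the pause and rhythm tips still apply to the cleaned text.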

Docker — Wyoming Protocol

When run as a server, as in the Docker image below, Piper communicates over TCP using the Wyoming protocol, not HTTP. It can be run as a persistent service:

services:
  piper-tts:
    image: lscr.io/linuxserver/piper:latest
    container_name: piper-tts
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=Europe/Berlin
      - PIPER_VOICE=modus_10000
    volumes:
      - /opt/piper/model:/config
    ports:
      - "10200:10200"

Expected folder structure:

/opt/piper/model/
├── modus_10000.onnx
└── modus_10000.onnx.json

This is compatible with Home Assistant via the Wyoming integration.
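
Before wiring the service into anything else, two quick checks can save some debugging: that both model files are in the mounted folder, and that the TCP port accepts connections. The sketch below uses only the standard library; both function names are hypothetical, and the defaults assume the host path and port from the compose file above.

```python
import socket
from pathlib import Path


def missing_model_files(model_dir: str, voice: str = "modus_10000") -> list[str]:
    """Return the names of required files that are absent from model_dir."""
    folder = Path(model_dir)
    required = [f"{voice}.onnx", f"{voice}.onnx.json"]
    return [name for name in required if not (folder / name).is_file()]


def wyoming_port_open(host: str = "localhost", port: int = 10200,
                      timeout: float = 2.0) -> bool:
    """Check that something accepts TCP connections on the Wyoming port.

    This only proves the port is reachable; it does not speak the protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

An empty list from `missing_model_files("/opt/piper/model")` together with `True` from `wyoming_port_open()` suggests the container is up and has found its voice files.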

HTTP Integration (n8n / REST APIs)

Since Piper uses TCP (Wyoming protocol), it can't be called directly via HTTP from tools like n8n or other REST-based workflows. To bridge this, you can use a small FastAPI wrapper that accepts HTTP POST requests and forwards them to Piper over TCP.

app.py — drop this next to your Docker setup:

import io
import os
import wave

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
from wyoming.audio import AudioChunk, AudioStop
from wyoming.client import AsyncTcpClient
from wyoming.tts import Synthesize

app = FastAPI()

# Host and port of the Piper container (defaults match the compose file above)
PIPER_HOST = os.getenv("PIPER_HOST", "piper-tts")
PIPER_PORT = int(os.getenv("PIPER_PORT", "10200"))

class TTSRequest(BaseModel):
    text: str

@app.post("/tts")
async def tts(request: TTSRequest):
    # Collect the audio in memory as a mono, 16-bit, 22,050 Hz WAV
    buf = io.BytesIO()
    wav = wave.open(buf, "wb")
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(22050)
    async with AsyncTcpClient(PIPER_HOST, PIPER_PORT) as client:
        # Send the text, then read events until Piper signals the end of audio
        await client.write_event(Synthesize(text=request.text).event())
        while True:
            event = await client.read_event()
            if event is None:
                break  # connection closed
            if AudioChunk.is_type(event.type):
                wav.writeframes(AudioChunk.from_event(event).audio)
            elif AudioStop.is_type(event.type):
                break
    wav.close()  # finalizes the WAV header; buf itself stays open
    buf.seek(0)
    return StreamingResponse(buf, media_type="audio/wav")

@app.get("/health")
def health():
    return {"status": "ok"}

Dockerfile:

FROM python:3.11-slim
WORKDIR /app
RUN pip install fastapi uvicorn wyoming
COPY app.py .
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "5050"]

Once running, you can call it from n8n or any HTTP client:

curl -X POST http://localhost:5050/tts \
  -H "Content-Type: application/json" \
  -d '{"text": "We have you now, General."}' \
  --output output.wav
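
The same request can be made from Python without extra dependencies. This is a minimal sketch assuming the wrapper is reachable at `http://localhost:5050`, as in the curl example; `build_tts_request` and `synthesize_to_file` are hypothetical helper names.

```python
import json
from urllib.request import Request, urlopen


def build_tts_request(text: str, base_url: str = "http://localhost:5050") -> Request:
    """Build the POST request that the /tts endpoint above expects."""
    body = json.dumps({"text": text}).encode("utf-8")
    return Request(
        f"{base_url}/tts",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text: str, path: str = "output.wav") -> None:
    """Send the text to the wrapper and save the returned WAV bytes."""
    with urlopen(build_tts_request(text), timeout=60) as response:
        audio = response.read()
    with open(path, "wb") as handle:
        handle.write(audio)
```

`synthesize_to_file("We have you now, General.")` would then write `output.wav` to the current directory, just like the curl call.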

LLM Integration

Piper works well as the voice layer in a local LLM pipeline. If you are routing text through a workflow tool like n8n with an LLM backend like Ollama, you can pipe the LLM output directly into the HTTP wrapper above.

Heads up: If you are controlling Piper with the MODUS model via n8n and an LLM, you could use a system prompt along the lines of: "You are MODUS from Fallout 76. Speak in a cold, precise manner. Refer to yourself as 'we'." Adapt it to your use case — this is just a starting point.
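
Outside of n8n, the same chain can be sketched in a few lines of Python. Several things here are assumptions rather than part of this project: that Ollama is listening on its default port 11434 and exposes `/api/generate`, that a model named `llama3` has been pulled, and that the HTTP wrapper from the previous section is running on port 5050. The function names are hypothetical.

```python
import json
from urllib.request import Request, urlopen

# System prompt taken from the suggestion above.
SYSTEM_PROMPT = (
    "You are MODUS from Fallout 76. Speak in a cold, precise manner. "
    "Refer to yourself as 'we'."
)


def ollama_payload(user_text: str, model: str = "llama3") -> dict:
    """Build a non-streaming generate request (model name is an assumption)."""
    return {
        "model": model,
        "system": SYSTEM_PROMPT,
        "prompt": user_text,
        "stream": False,
    }


def post_json(url: str, payload: dict) -> bytes:
    """POST a JSON body and return the raw response bytes."""
    request = Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request, timeout=120) as response:
        return response.read()


def ask_modus(user_text: str, wav_path: str = "reply.wav") -> str:
    """Get a reply from the LLM, voice it through the wrapper, return the text."""
    raw = post_json("http://localhost:11434/api/generate", ollama_payload(user_text))
    reply = json.loads(raw)["response"]
    audio = post_json("http://localhost:5050/tts", {"text": reply})
    with open(wav_path, "wb") as handle:
        handle.write(audio)
    return reply
```

LLM output often contains markdown or digits, so it is worth cleaning the reply according to the Text Input Tips before sending it to the wrapper.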

Known Limitations

Occasional mispronunciation on less common words or complex proper nouns
Very long sentences without punctuation may sound rushed
The model was trained on only 421 samples; a larger dataset would likely improve consistency
Numbers and abbreviations should be written out manually for best results

Training Details

This model was fine-tuned on MODUS dialogue from Fallout 76 (Bethesda Softworks) using the en_US-lessac-high checkpoint as a base.

Training was performed using ifansnek/piper-train-docker on an NVIDIA RTX A2000 12GB.

Setting	Value
Batch size	8
Precision	16-bit AMP
Quality	high
Checkpoint interval	every 50 epochs
Total training time	~5 days

Disclaimer

This is a non-commercial fan project. Fallout 76 and all related assets are property of Bethesda Softworks.

About this project

This model was built without prior knowledge of machine learning or Python scripting. The entire pipeline, from gathering voice lines and training the model to deploying it as a live TTS service, was developed with the help of Claude by Anthropic.
