| license: apache-2.0 | |
| pipeline_tag: text-to-speech | |
| tags: | |
| - audio | |
| - speech | |
| - speech-language-models | |
| datasets: | |
| - amphion/Emilia-Dataset | |
| - neuphonic/emilia-yodas-english-neucodec | |
| # NeuTTS Air ☁️ | |
| [Watch the NeuTTS Air video on YouTube](https://www.youtube.com/watch?v=YAB3hCtu5wE) | |
| [🚀 Spaces Demo](https://huggingface.co/spaces/neuphonic/neutts-air), [🔧 GitHub](https://github.com/neuphonic/neutts-air) | |
| [Q8 GGUF version](https://huggingface.co/neuphonic/neutts-air-q8-gguf), [Q4 GGUF version](https://huggingface.co/neuphonic/neutts-air-q4-gguf) | |
| *Created by [Neuphonic](http://neuphonic.com/) - building faster, smaller, on-device voice AI* | |
| State-of-the-art Voice AI has been locked behind web APIs for too long. NeuTTS Air is the world’s first super-realistic, on-device, TTS speech language model with instant voice cloning. Built off a 0.5B LLM backbone, NeuTTS Air brings natural-sounding speech, real-time performance, built-in security and speaker cloning to your local device - unlocking a new category of embedded voice agents, assistants, toys, and compliance-safe apps. | |
| ## Key Features | |
| - 🗣 Best-in-class realism for its size - produces natural, ultra-realistic voices that sound human | |
| - 📱 Optimised for on-device deployment - provided in GGUF format, ready to run on phones, laptops, or even Raspberry Pis | |
| - 👫 Instant voice cloning - create your own speaker with as little as 3 seconds of audio | |
| - 🚄 Simple LM + codec architecture built off a 0.5B backbone - the sweet spot between speed, size, and quality for real-world applications | |
| > [!CAUTION] | |
| > Websites like neutts.com are popping up, and they're not affiliated with Neuphonic, our GitHub or this repo. | |
| > | |
| > We are on neuphonic.com only. Please be careful out there! 🙏 | |
| ## Model Details | |
| NeuTTS Air is built off Qwen 0.5B, a lightweight yet capable language model optimised for text understanding and generation, combined with the following components designed for efficiency and quality: | |
| - **Audio Codec**: [NeuCodec](https://huggingface.co/neuphonic/neucodec) - our proprietary neural audio codec that achieves exceptional audio quality at low bitrates using a single codebook | |
| - **Format**: Available in GGUF format for efficient on-device inference | |
| - **Responsibility**: Watermarked outputs | |
| - **Inference Speed**: Real-time generation on mid-range devices | |
| - **Power Consumption**: Optimised for mobile and embedded devices | |
| ## Get Started with NeuTTS | |
| 1. **Install System Dependencies (required): `espeak-ng`** | |
| > [!NOTE] | |
| > With `brew` on macOS Ventura and later, `apt` in Ubuntu version 25 or Debian version 13, and `choco`/`winget` on Windows, install the latest version of `espeak-ng` with the commands below. If you have a different or older operating system, you may need to install from source: see the following link https://github.com/espeak-ng/espeak-ng/blob/master/docs/building.md | |
| Please refer to the following link for instructions on how to install `espeak-ng`: | |
| https://github.com/espeak-ng/espeak-ng/blob/master/docs/guide.md | |
| ```bash | |
| # macOS | |
| brew install espeak-ng | |
| # Ubuntu/Debian | |
| sudo apt install espeak-ng | |
| # Windows install | |
| # via chocolatey (https://community.chocolatey.org/packages?page=1&prerelease=False&moderatorQueue=False&tags=espeak) | |
| choco install espeak-ng | |
| # via winget | |
| winget install -e --id eSpeak-NG.eSpeak-NG | |
| # via msi (need to add to PATH or follow the "Windows users who installed via msi" step below) | |
| # find the msi at https://github.com/espeak-ng/espeak-ng/releases | |
| ``` | |
| Windows users who installed via msi, or whose install is not on PATH, need to run the following (see https://github.com/bootphon/phonemizer/issues/163): | |
| ```pwsh | |
| $env:PHONEMIZER_ESPEAK_LIBRARY = "c:\Program Files\eSpeak NG\libespeak-ng.dll" | |
| $env:PHONEMIZER_ESPEAK_PATH = "c:\Program Files\eSpeak NG" | |
| setx PHONEMIZER_ESPEAK_LIBRARY "c:\Program Files\eSpeak NG\libespeak-ng.dll" | |
| setx PHONEMIZER_ESPEAK_PATH "c:\Program Files\eSpeak NG" | |
| ``` | |
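To confirm that the setup above took effect, a small check can be run from Python. This is a minimal sketch rather than part of NeuTTS: `espeak_status` is a hypothetical helper, and it only looks at whether an `espeak-ng` executable is on `PATH` and whether `PHONEMIZER_ESPEAK_LIBRARY` points at an existing file.

```python
import os
import shutil


def espeak_status(env=None):
    """Report how espeak-ng could be located: on PATH or via the phonemizer variable."""
    env = os.environ if env is None else env
    executable = shutil.which("espeak-ng", path=env.get("PATH", ""))
    library = env.get("PHONEMIZER_ESPEAK_LIBRARY")
    return {
        "executable": executable,  # None if espeak-ng is not on PATH
        "library": library,  # None if the variable is unset
        "library_exists": bool(library) and os.path.isfile(library),
    }


if __name__ == "__main__":
    print(espeak_status())
```

If `executable` is `None` and `library_exists` is `False`, the install is probably not visible to the phonemizer yet; note that `setx` only affects newly opened shells.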
| 2. **Install NeuTTS** | |
| ```bash | |
| pip install neutts | |
| ``` | |
| Or for a local editable install, clone the [neutts repository](https://github.com/neuphonic/neutts) and run in the base folder: | |
| ```bash | |
| pip install -e . | |
| ``` | |
| Alternatively, to install all dependencies, including `onnxruntime` and `llama-cpp-python` (equivalent to steps 3 and 4 below): | |
| ```bash | |
| pip install "neutts[all]" | |
| ``` | |
| or for an editable install: | |
| ```bash | |
| pip install -e ".[all]" | |
| ``` | |
| 3. **(Optional) Install `llama-cpp-python` to use `.gguf` models.** | |
| ```bash | |
| pip install "neutts[llama]" | |
| ``` | |
| Note that this installs `llama-cpp-python` without GPU support. To install with GPU support (e.g., CUDA, MPS) please refer to: | |
| https://pypi.org/project/llama-cpp-python/ | |
| 4. **(Optional) Install `onnxruntime` to use the `.onnx` decoder.** | |
| ```bash | |
| pip install "neutts[onnx]" | |
| ``` | |
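Because steps 3 and 4 are optional, it can help to check which extras are present in the current environment. The sketch below uses only the standard library; `optional_support` and `OPTIONAL_MODULES` are hypothetical names, and the import names `llama_cpp` and `onnxruntime` are the modules those two packages provide.

```python
from importlib.util import find_spec

# Import names of the optional packages from steps 3 and 4.
OPTIONAL_MODULES = {
    "gguf backbone (llama-cpp-python)": "llama_cpp",
    "onnx decoder (onnxruntime)": "onnxruntime",
}


def optional_support(modules=OPTIONAL_MODULES):
    """Map each optional feature to True if its module can be found for import."""
    return {feature: find_spec(name) is not None for feature, name in modules.items()}


if __name__ == "__main__":
    for feature, available in optional_support().items():
        print(f"{feature}: {'installed' if available else 'missing'}")
```

This only tells you whether the module is discoverable, not whether it was built with GPU support.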
| ## **Basic Example** | |
| Run the basic example script to synthesize speech: | |
| ```bash | |
| python -m examples.basic_example \ | |
| --input_text "My name is Dave, and um, I'm from London" \ | |
| --ref_audio samples/dave.wav \ | |
| --ref_text samples/dave.txt | |
| ``` | |
| To specify a particular model repo for the backbone or codec, add the `--backbone` argument. Available backbones are listed in the [NeuTTS-Air Hugging Face collection](https://huggingface.co/collections/neuphonic/neutts-air-68cc14b7033b4c56197ef350). | |
| Several examples are available, including a Jupyter notebook in the `examples` folder. | |
| ### **Simple One-Code Block Usage** | |
| ```python | |
| from pathlib import Path | |
| import soundfile as sf | |
| from neutts import NeuTTS | |
| # Load the Q4 GGUF backbone and the NeuCodec codec, both on CPU | |
| tts = NeuTTS(backbone_repo="neuphonic/neutts-air-q4-gguf", backbone_device="cpu", codec_repo="neuphonic/neucodec", codec_device="cpu") | |
| input_text = "My name is Dave, and um, I'm from London." | |
| ref_audio_path = "samples/dave.wav" | |
| # Transcript of the reference audio | |
| ref_text = Path("samples/dave.txt").read_text().strip() | |
| # Encode the reference audio, then synthesise the input text in that voice | |
| ref_codes = tts.encode_reference(ref_audio_path) | |
| wav = tts.infer(input_text, ref_codes, ref_text) | |
| # Write the waveform at a 24 kHz sample rate | |
| sf.write("test.wav", wav, 24000) | |
| ``` | |
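Since the reference is encoded separately from inference in the example above, one plausible pattern is to encode it once and reuse the codes for several sentences. The sketch below assumes `infer` can be called repeatedly with the same `ref_codes`, which the README does not state explicitly; `synthesise_many` and the `prefix` naming scheme are hypothetical.

```python
def synthesise_many(tts, texts, ref_codes, ref_text, prefix="line"):
    """Yield (filename, waveform) pairs, reusing one encoded reference for every text."""
    for index, text in enumerate(texts, start=1):
        # Same call shape as the example above: tts.infer(input_text, ref_codes, ref_text)
        wav = tts.infer(text, ref_codes, ref_text)
        yield f"{prefix}_{index:02d}.wav", wav
```

Each yielded pair could then be written with `sf.write(filename, wav, 24000)`, as in the example above.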
| # Tips | |
| NeuTTS Air requires two inputs: | |
| 1. A reference audio sample (`.wav` file) | |
| 2. A text string | |
| The model then synthesises the text as speech in the style of the reference audio. This is what enables NeuTTS Air’s instant voice cloning capability. | |
| ### Example Reference Files | |
| You can find some ready-to-use samples in the `samples` folder: | |
| - `samples/dave.wav` | |
| - `samples/jo.wav` | |
| ### Guidelines for Best Results | |
| For optimal performance, reference audio samples should be: | |
| 1. **Mono channel** | |
| 2. **16–44 kHz sample rate** | |
| 3. **3–15 seconds in length** | |
| 4. **Saved as a `.wav` file** | |
| 5. **Clean** — minimal to no background noise | |
| 6. **Natural, continuous speech** — like a monologue or conversation, with few pauses, so the model can capture tone effectively | |
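The measurable guidelines above (channels, sample rate, length) can be checked before cloning. This is a minimal sketch using the standard-library `wave` module, which reads uncompressed PCM `.wav` files only; `check_reference_wav` is a hypothetical helper, and reading "16-44 kHz" as 16,000 to 44,100 Hz is an assumption. Noise level and speech continuity are not checked.

```python
import wave


def check_reference_wav(path):
    """Return a list of guideline violations for a reference .wav (empty if none)."""
    with wave.open(path, "rb") as wav_file:
        channels = wav_file.getnchannels()
        sample_rate = wav_file.getframerate()
        duration = wav_file.getnframes() / sample_rate

    problems = []
    if channels != 1:
        problems.append(f"expected mono, found {channels} channels")
    # Assumption: "16-44 kHz" is read as 16,000 to 44,100 Hz inclusive.
    if not 16_000 <= sample_rate <= 44_100:
        problems.append(f"sample rate {sample_rate} Hz is outside 16-44 kHz")
    if not 3.0 <= duration <= 15.0:
        problems.append(f"duration {duration:.1f} s is outside 3-15 s")
    return problems
```

For example, `check_reference_wav("samples/dave.wav")` returns an empty list when the file meets all three measurable guidelines.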
| # **Responsibility** | |
| Every audio file generated by NeuTTS Air includes a watermark from the **[Perth (Perceptual Threshold) Watermarker](https://github.com/resemble-ai/perth)**. | |
| # **Disclaimer** | |
| Don't use this model to do bad things… please. |