Instructions to use KeythSullivan/neutts-air with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use KeythSullivan/neutts-air with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="KeythSullivan/neutts-air",
	filename="neutss-air-BF16.gguf",
)

llm.create_chat_completion(
	messages = "\"The answer to the universe is 42\""
)

Notebooks
Google Colab
Kaggle
Local Apps

llama.cpp

How to use KeythSullivan/neutts-air with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf KeythSullivan/neutts-air:BF16
# Run inference directly in the terminal:
llama-cli -hf KeythSullivan/neutts-air:BF16

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf KeythSullivan/neutts-air:BF16
# Run inference directly in the terminal:
llama-cli -hf KeythSullivan/neutts-air:BF16

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf KeythSullivan/neutts-air:BF16
# Run inference directly in the terminal:
./llama-cli -hf KeythSullivan/neutts-air:BF16

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf KeythSullivan/neutts-air:BF16
# Run inference directly in the terminal:
./build/bin/llama-cli -hf KeythSullivan/neutts-air:BF16

Use Docker

docker model run hf.co/KeythSullivan/neutts-air:BF16

LM Studio
Jan
Ollama
How to use KeythSullivan/neutts-air with Ollama:
```
ollama run hf.co/KeythSullivan/neutts-air:BF16
```

Unsloth Studio new

How to use KeythSullivan/neutts-air with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for KeythSullivan/neutts-air to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for KeythSullivan/neutts-air to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for KeythSullivan/neutts-air to start chatting

Pi new

How to use KeythSullivan/neutts-air with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf KeythSullivan/neutts-air:BF16

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "KeythSullivan/neutts-air:BF16"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent new

How to use KeythSullivan/neutts-air with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf KeythSullivan/neutts-air:BF16

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default KeythSullivan/neutts-air:BF16

Run Hermes

hermes

Docker Model Runner
How to use KeythSullivan/neutts-air with Docker Model Runner:
```
docker model run hf.co/KeythSullivan/neutts-air:BF16
```

Lemonade

How to use KeythSullivan/neutts-air with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull KeythSullivan/neutts-air:BF16

Run and chat with the model

lemonade run user.neutts-air-BF16

List all available models

lemonade list

neutts-air / README.md

KeythSullivan

Duplicate from neuphonic/neutts-air

b279ce5 21 days ago

preview code

raw

history blame contribute delete

7.05 kB

	---
	license: apache-2.0
	pipeline_tag: text-to-speech
	tags:
	- audio
	- speech
	- speech-language-models
	datasets:
	- amphion/Emilia-Dataset
	- neuphonic/emilia-yodas-english-neucodec
	---

	# NeuTTS Air ☁️

	[![NeuTTSAir_Intro](neutts-air.png)](https://www.youtube.com/watch?v=YAB3hCtu5wE)

	[🚀 Spaces Demo](https://huggingface.co/spaces/neuphonic/neutts-air), [🔧 Github](https://github.com/neuphonic/neutts-air)

	[Q8 GGUF version](https://huggingface.co/neuphonic/neutts-air-q8-gguf), [Q4 GGUF version](https://huggingface.co/neuphonic/neutts-air-q4-gguf)

	Created by [Neuphonic](http://neuphonic.com/) - building faster, smaller, on-device voice AI

	State-of-the-art Voice AI has been locked behind web APIs for too long. NeuTTS Air is the world’s first super-realistic, on-device, TTS speech language model with instant voice cloning. Built off a 0.5B LLM backbone, NeuTTS Air brings natural-sounding speech, real-time performance, built-in security and speaker cloning to your local device - unlocking a new category of embedded voice agents, assistants, toys, and compliance-safe apps.

	## Key Features

	- 🗣Best-in-class realism for its size - produces natural, ultra-realistic voices that sound human
	- 📱Optimised for on-device deployment - provided in GGML format, ready to run on phones, laptops, or even Raspberry Pis
	- 👫Instant voice cloning - create your own speaker with as little as 3 seconds of audio
	- 🚄Simple LM + codec architecture built off a 0.5B backbone - the sweet spot between speed, size, and quality for real-world applications


	> [!CAUTION]
	> Websites like neutts.com are popping up and they're not affliated with Neuphonic, our github or this repo.
	>
	> We are on neuphonic.com only. Please be careful out there! 🙏


	## Model Details

	NeuTTS Air is built off Qwen 0.5B - a lightweight yet capable language model optimised for text understanding and generation - as well as a powerful combination of technologies designed for efficiency and quality:

	- Audio Codec: [NeuCodec](https://huggingface.co/neuphonic/neucodec) - our proprietary neural audio codec that achieves exceptional audio quality at low bitrates using a single codebook
	- Format: Available in GGML format for efficient on-device inference
	- Responsibility: Watermarked outputs
	- Inference Speed: Real-time generation on mid-range devices
	- Power Consumption: Optimised for mobile and embedded devices

	## Get Started with NeuTTS

	1. Install System Dependencies (required): `espeak-ng`

	> [!NOTE]
	> With `brew` on macOS Ventura and later, `apt` in Ubuntu version 25 or Debian version 13, and `choco`/`winget` on Windows, install the latest version of `espeak-ng` with the commands below. If you have a different or older operating system, you may need to install from source: see the following link https://github.com/espeak-ng/espeak-ng/blob/master/docs/building.md

	Please refer to the following link for instructions on how to install `espeak-ng`:

	https://github.com/espeak-ng/espeak-ng/blob/master/docs/guide.md

	```bash
	# Mac OS
	brew install espeak-ng

	# Ubuntu/Debian
	sudo apt install espeak-ng

	# Windows install
	# via chocolatey (https://community.chocolatey.org/packages?page=1&prerelease=False&moderatorQueue=False&tags=espeak)
	choco install espeak-ng
	# via winget
	winget install -e --id eSpeak-NG.eSpeak-NG
	# via msi (need to add to path or folow the "Windows users who installed via msi" below)
	# find the msi at https://github.com/espeak-ng/espeak-ng/releases
	```

	Windows users who installed via msi / do not have their install on path need to run the following (see https://github.com/bootphon/phonemizer/issues/163)
	```pwsh
	$env:PHONEMIZER_ESPEAK_LIBRARY = "c:\Program Files\eSpeak NG\libespeak-ng.dll"
	$env:PHONEMIZER_ESPEAK_PATH = "c:\Program Files\eSpeak NG"
	setx PHONEMIZER_ESPEAK_LIBRARY "c:\Program Files\eSpeak NG\libespeak-ng.dll"
	setx PHONEMIZER_ESPEAK_PATH "c:\Program Files\eSpeak NG"
	```

	2. Install NeuTTS
	```bash
	pip install neutts
	```

	Or for a local editable install, clone the [neutts repository](https://github.com/neuphonic/neutts) and run in the base folder:
	```bash
	pip install -e .
	```

	Alternatively to install all dependencies, including `onnxruntime` and `llama-cpp-python` (equivalent to steps 3 and 4 below):

	```bash
	pip install neutts[all]
	```

	or for an editable install:

	```bash
	pip install -e .[all]
	```

	3. (Optional) Install `llama-cpp-python` to use `.gguf` models.

	```bash
	pip install "neutts[llama]"
	```

	Note that this installs `llama-cpp-python` without GPU support. To install with GPU support (e.g., CUDA, MPS) please refer to:
	https://pypi.org/project/llama-cpp-python/

	4. (Optional) Install `onnxruntime` to use the `.onnx` decoder.
	```bash
	pip install "neutts[onnx]"
	```



	## Basic Example

	Run the basic example script to synthesize speech:

	```bash
	python -m examples.basic_example \
	--input_text "My name is Dave, and um, I'm from London" \
	--ref_audio samples/dave.wav \
	--ref_text samples/dave.txt

	```

	To specify a particular model repo for the backbone or codec, add the `--backbone` argument. Available backbones are listed in [NeuTTS-Air huggingface collection](https://huggingface.co/collections/neuphonic/neutts-air-68cc14b7033b4c56197ef350).

	Several examples are available, including a Jupyter notebook in the `examples` folder.

	### Simple One-Code Block Usage

	```python
	from neutts import NeuTTS
	import soundfile as sf

	tts = NeuTTS(backbone_repo="neuphonic/neutts-air-q4-gguf", backbone_device="cpu", codec_repo="neuphonic/neucodec", codec_device="cpu")
	input_text = "My name is Dave, and um, I'm from London."

	ref_text = "samples/dave.txt"
	ref_audio_path = "samples/dave.wav"

	ref_text = open(ref_text, "r").read().strip()
	ref_codes = tts.encode_reference(ref_audio_path)

	wav = tts.infer(input_text, ref_codes, ref_text)
	sf.write("test.wav", wav, 24000)

	```

	# Tips

	NeuTTS Air requires two inputs:

	1. A reference audio sample (`.wav` file)
	2. A text string

	The model then synthesises the text as speech in the style of the reference audio. This is what enables NeuTTS Air’s instant voice cloning capability.

	### Example Reference Files

	You can find some ready-to-use samples in the `examples` folder:

	- `samples/dave.wav`
	- `samples/jo.wav`

	### Guidelines for Best Results

	For optimal performance, reference audio samples should be:

	1. Mono channel
	2. 16-44 kHz sample rate
	3. 3–15 seconds in length
	4. Saved as a `.wav` file
	5. Clean — minimal to no background noise
	6. Natural, continuous speech — like a monologue or conversation, with few pauses, so the model can capture tone effectively

	# Responsibility

	Every audio file generated by NeuTTS Air includes [Perth (Perceptual Threshold) Watermarker](https://github.com/resemble-ai/perth).

	# Disclaimer

	Don't use this model to do bad things… please.