# maya1-mlx-8bit

MLX 8-bit quantized conversion of maya-research/maya1 for text-to-speech on Apple Silicon. Converted using mlx-lm.

Recommended variant. Best balance of quality and speed — near real-time on M4 Max. See also: bf16 (highest quality) and 4-bit (fastest).

## Benchmarks (M4 Max, 36 GB)

| Variant | Size   | Tokens/s   | Real-time factor |
|---------|--------|------------|------------------|
| bf16    | 6.2 GB | ~51 tok/s  | 0.50x            |
| 8-bit   | 3.3 GB | ~91 tok/s  | 0.82x            |
| 4-bit   | 1.8 GB | ~108 tok/s | 1.58x            |
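Real-time factor here is read as seconds of audio produced per second of wall-clock time, so values above 1.0 are faster than real-time. A minimal sketch of how you might measure it yourself (the helper names are illustrative, not part of `tts.py`):

```python
import wave

def audio_duration_seconds(wav_path):
    """Duration of a WAV file in seconds, from frame count and sample rate."""
    with wave.open(wav_path, "rb") as w:
        return w.getnframes() / w.getframerate()

def real_time_factor(audio_seconds, generation_seconds):
    """Audio duration divided by generation wall-clock time.
    Assumes the table's convention: > 1.0 means faster than real-time."""
    return audio_seconds / generation_seconds
```

For example, 10 seconds of audio generated in 12.2 seconds of wall-clock time gives a real-time factor of about 0.82x, matching the 8-bit row.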

## Quick Start

Requires macOS with Apple Silicon and uv.

```shell
# From a text file
uv run tts.py input.txt -o output.wav

# From stdin
echo "Hello world" | uv run tts.py - -o hello.wav

# With a custom voice description
uv run tts.py input.txt -o output.wav -d "Deep male voice, British accent, slow pacing."
```

## CLI Options

| Option | Default | Description |
|--------|---------|-------------|
| `input` | (required) | Input text file path, or `-` for stdin |
| `-o, --output` | `output.wav` | Output WAV file path |
| `-d, --description` | "Calm/clear female voice" | Voice description prompt |
| `-m, --model` | `.` | Path to MLX model directory |
| `--max-chars` | 200 | Maximum characters per chunk |
| `--max-tokens` | 2048 | Maximum tokens generated per chunk |
| `--temperature` | 0.4 | Sampling temperature |
| `--top-p` | 0.9 | Top-p (nucleus) sampling threshold |
| `--repetition-penalty` | 1.1 | Repetition penalty |
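`--max-chars` controls how input text is split before generation: long inputs are broken into chunks of at most that many characters, each synthesized separately. A minimal sketch of sentence-aware chunking under that constraint (hypothetical helper; the real `tts.py` may split differently):

```python
import re

def chunk_text(text, max_chars=200):
    """Greedily pack whole sentences into chunks of at most max_chars.
    A single sentence longer than max_chars still becomes its own
    (over-long) chunk; real chunkers may split such sentences further."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + 1 + len(s) > max_chars:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}" if current else s
    if current:
        chunks.append(current)
    return chunks
```

Chunking at sentence boundaries keeps prosody natural at chunk joins, which matters when the per-chunk WAV segments are concatenated into one output file.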

## Requirements

- macOS with Apple Silicon (M1 or later)
- uv (dependencies are declared inline via PEP 723)
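A PEP 723 inline-metadata header at the top of `tts.py` looks roughly like this (the exact dependency list shown is an assumption; check the script itself):

```python
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "mlx-lm",   # assumed: MLX generation backend
#     "numpy",    # assumed: audio buffer handling
# ]
# ///
```

When you run `uv run tts.py`, uv reads this block and provisions an ephemeral environment with the listed packages, so no manual `pip install` or virtualenv setup is needed.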