maya1-mlx-bf16

MLX bfloat16 conversion of maya-research/maya1 for text-to-speech on Apple Silicon. Converted using mlx-lm.

Note: This is the full-precision (bfloat16) variant. It offers the highest quality but runs below real-time on most hardware. Consider the 8-bit or 4-bit variants for faster inference.

Benchmarks (M4 Max, 36GB)

Variant   Size     Tokens/s     Real-time factor
bf16      6.2 GB   ~51 tok/s    0.50x
8-bit     3.3 GB   ~91 tok/s    0.82x
4-bit     1.8 GB   ~108 tok/s   1.58x
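Real-time factor here is generated audio duration divided by generation wall time (values above 1.0x are faster than real time). If you want to measure it on your own hardware, a minimal sketch (a hypothetical helper, not part of tts.py) is:

```python
import wave

def real_time_factor(wav_path: str, generation_seconds: float) -> float:
    """Return audio_duration / wall_time; > 1.0 means faster than real time."""
    with wave.open(wav_path, "rb") as w:
        audio_seconds = w.getnframes() / w.getframerate()
    return audio_seconds / generation_seconds
```

Time the `uv run tts.py …` invocation (e.g. with `time` or `time.perf_counter()`), then pass the elapsed seconds and the output WAV path to this helper.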

Quick Start

Requires macOS with Apple Silicon and uv.

# From a text file
uv run tts.py input.txt -o output.wav

# From stdin
echo "Hello world" | uv run tts.py - -o hello.wav

# With a custom voice description
uv run tts.py input.txt -o output.wav -d "Deep male voice, British accent, slow pacing."

CLI Options

Option                  Default                  Description
input                   (required)               Input text file path, or - for stdin
-o, --output            output.wav               Output WAV file path
-d, --description       Calm/clear female voice  Voice description prompt
-m, --model             .                        Path to MLX model directory
--max-chars             200                      Max characters per chunk
--max-tokens            2048                     Max tokens per chunk
--temperature           0.4                      Sampling temperature
--top-p                 0.9                      Top-p sampling
--repetition-penalty    1.1                      Repetition penalty
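Long inputs are split into chunks of at most --max-chars characters before synthesis. As an illustration of how such a limit typically behaves (a sketch only; tts.py's actual chunking logic may differ), chunking can be done greedily at sentence boundaries:

```python
import re

def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack sentences into chunks of at most max_chars characters.

    Illustrative sketch; a single sentence longer than max_chars
    becomes its own (oversized) chunk rather than being split mid-word.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        if not s:
            continue
        if current and len(current) + 1 + len(s) > max_chars:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}" if current else s
    if current:
        chunks.append(current)
    return chunks
```

Smaller chunks reduce the risk of hitting --max-tokens mid-sentence, at the cost of more generation calls.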

Requirements

  • macOS with Apple Silicon (M1 or later)
  • uv (dependencies are declared inline via PEP 723)
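Because the dependencies are declared inline via PEP 723, uv resolves and installs them automatically on `uv run` with no separate requirements file. A representative header looks like the following (the exact package list and pins are assumptions; check the top of tts.py for the real ones):

```python
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "mlx-lm",   # assumed dependency; see tts.py for the actual list
#     "numpy",    # assumed dependency
# ]
# ///
```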