OpenMOSS TTS Local Transformer (MLX 8-bit)

This repository contains an MLX-native int8 conversion of the OpenMOSS TTS Local Transformer for single-speaker text-to-speech on Apple Silicon.

It is intended for local speech generation with mlx-speech, with no PyTorch runtime required at inference time.

Variants

Path       Precision
mlx-int8/  int8 (quantized weights)

Model Details

  • Developed by: AppAutomaton
  • Shared by: AppAutomaton on Hugging Face
  • Upstream model: OpenMOSS-Team/MOSS-TTS-Local-Transformer
  • Task: single-speaker text-to-speech and voice cloning
  • Runtime: MLX on Apple Silicon

How to Get Started

Command-line generation with mlx-speech:

Generate speech:

python scripts/generate_moss_local.py \
  --text "Hello, this is a test." \
  --output outputs/out.wav

Clone a voice:

python scripts/generate_moss_local.py \
  --mode clone \
  --text "This is a cloned voice." \
  --reference-audio reference.wav \
  --output outputs/clone.wav

Minimal Python usage:

from mlx_speech.generation import MossTTSLocalModel

model = MossTTSLocalModel.from_path("mlx-int8")

Notes

  • This repo contains the quantized MLX runtime artifact only.
  • The conversion keeps the original OpenMOSS local TTS architecture and remaps weights explicitly for MLX inference.
  • The default runtime path uses W8Abf16 mixed precision (int8 weights, bfloat16 activations) with both global and local KV caches enabled.
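To make the W8Abf16 note concrete: int8 weight quantization stores each weight group as 8-bit integers plus a per-group scale, and dequantizes back to higher precision at matmul time. The sketch below is illustrative only, not the actual MLX kernel; the group size of 64 and the symmetric scaling scheme are assumptions.

```python
# Illustrative per-group symmetric int8 quantization (a sketch, not the
# MLX implementation; group size and scheme are assumptions).
import numpy as np

def quantize_int8(w: np.ndarray, group_size: int = 64):
    """Quantize weights into int8 groups with one float scale per group."""
    rows = w.reshape(-1, group_size)
    # Symmetric scale: map the largest magnitude in each group to 127.
    scales = np.abs(rows).max(axis=1, keepdims=True) / 127.0
    scales = np.where(scales == 0.0, 1.0, scales)  # avoid divide-by-zero
    q = np.clip(np.round(rows / scales), -127, 127).astype(np.int8)
    return q, scales.astype(np.float32)

def dequantize_int8(q: np.ndarray, scales: np.ndarray, shape):
    """Recover an approximate float weight matrix from int8 + scales."""
    return (q.astype(np.float32) * scales).reshape(shape)

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 64)).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize_int8(q, s, w.shape)
# Rounding error is bounded by half a quantization step per element.
max_err = np.abs(w - w_hat).max()
```

Storing `q` (1 byte per weight) plus one scale per group is what shrinks the checkpoint to roughly a quarter of its fp32 size, while the bf16 activation path keeps runtime accuracy close to the original model.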

License

Apache 2.0, following the upstream license published with OpenMOSS-Team/MOSS-TTS-Local-Transformer.
