MOSS-TTS (CPU Optimized)
CPU Optimized Version: This repository contains a build of MOSS-TTS that has been optimized for inference in CPU-only environments.
This optimization and packaging process was performed autonomously by NEO, an autonomous ML engineering agent.
Overview
This version of MOSS-TTS uses runtime dynamic quantization and specific architectural configurations to deliver low-latency speech synthesis without requiring a GPU. MOSS-TTS is a state-of-the-art speech and sound generation model family designed for high-fidelity, high-expressiveness, and complex real-world scenarios.
Key Optimizations by NEO:
- Dynamic INT8 Quantization: Reduces memory footprint and accelerates inference on modern CPUs.
- Thread Scaling: Configured for optimal multi-threaded performance.
- CPU-Friendly Tensors: All weights and buffers are stored in formats suited to the FP32/INT8 execution paths.
- Autonomous Validation: Verified functionality in resource-constrained environments.
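To make the INT8 item above concrete, here is a minimal pure-Python sketch of symmetric per-tensor INT8 quantization. It is an illustration only: this card does not state which quantization scheme or tooling was used for this build, and the helper names `quantize_int8` and `dequantize_int8` are hypothetical. In PyTorch, dynamic quantization of this kind is commonly applied with `torch.ao.quantization.quantize_dynamic`, but whether this build relies on it is an assumption.

```python
def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantization of a list of floats (illustrative)."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the INT8 range used here.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q, scale):
    """Map INT8 values back to approximate floats."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# The rounding error per weight is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each quantized weight occupies 1 byte instead of the 4 bytes of an FP32 value, which is where the memory saving for quantized layers comes from. The cost is the rounding error shown above, bounded by half the scale.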
Usage
Installation
pip install transformers torch torchaudio
Quick Start
from transformers import AutoModel, AutoProcessor
import torch
# Load the CPU-optimized model
model_name = "daksh-neo/MOSS-TTS"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_name,
    trust_remote_code=True,
    torch_dtype=torch.float32,
)
# Inference (Example)
text = "This is a CPU-optimized speech synthesis by NEO."
inputs = processor(text=[text], mode="generation")
outputs = model.generate(**inputs)
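CPU inference speed depends heavily on how many threads PyTorch is allowed to use. The sketch below shows one way to pin the thread count before loading the model. The helper `pick_num_threads` is hypothetical and not part of this repository, and the thread settings this build was actually tuned with are not documented here. `torch.set_num_threads` is the standard PyTorch call for intra-op parallelism.

```python
import os


def pick_num_threads(requested=None, available=None):
    """Return a thread count clamped to [1, number of available logical cores]."""
    if available is None:
        available = os.cpu_count() or 1
    if requested is None:
        return available
    return max(1, min(requested, available))


num_threads = pick_num_threads()

try:
    import torch

    # Controls intra-op parallelism (e.g. inside a single matrix multiply).
    torch.set_num_threads(num_threads)
except ImportError:
    # torch is not installed; the helper above still works on its own.
    pass

print(f"Using {num_threads} thread(s)")
```

On shared machines or in containers with CPU limits, passing a smaller `requested` value than the core count can avoid oversubscription. The best value is workload-dependent and worth measuring.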
Capabilities
- Zero-shot Voice Cloning: Clone voices from short reference clips.
- Multilingual Support: High-quality synthesis across 20+ languages.
- Long-form Stability: Synthesize stable audio for durations up to 1 hour.
- Fine-grained Control: Phoneme-level and duration-level control for precise prosody.
Architecture
This specific export is based on the MossTTSDelay architecture, optimized for sequential stability and CPU throughput.
| Feature | Specification |
|---|---|
| Optimization Engine | NEO (Autonomous ML Agent) |
| Device Target | CPU (x86_64 / ARM64) |
| Quantization | Dynamic INT8 |
| Sampling Rate | 24 kHz / 44.1 kHz (configurable) |
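The sampling rate in the table matters when you save generated audio: the file must be written with the same rate the model produced. The sketch below writes mono float samples to a 16-bit PCM WAV file using only the Python standard library. It assumes you have already extracted a one-dimensional sequence of float samples in [-1.0, 1.0] from the model output; the structure of `outputs` is not documented in this card, so that extraction step is not shown. `write_wav` and `duration_seconds` are hypothetical helper names, and a synthetic 440 Hz tone stands in for real model output.

```python
import math
import os
import struct
import tempfile
import wave


def write_wav(path, samples, sample_rate=24000):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        pcm += struct.pack("<h", int(round(s * 32767)))  # little-endian int16
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(pcm))


def duration_seconds(num_samples, sample_rate):
    """Audio duration implied by a sample count and a sampling rate."""
    return num_samples / sample_rate


# Stand-in for model output: one second of a 440 Hz tone at 24 kHz.
sample_rate = 24000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

out_path = os.path.join(tempfile.gettempdir(), "moss_tts_demo.wav")
write_wav(out_path, tone, sample_rate)

# Read the header back to confirm the rate and length were stored as intended.
with wave.open(out_path, "rb") as wav:
    frames, rate = wav.getnframes(), wav.getframerate()
print(frames, rate, duration_seconds(frames, rate))
```

If the model is configured for 44.1 kHz output, pass `sample_rate=44100` instead; writing with the wrong rate changes both pitch and playback speed.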
License
This model is released under the Apache-2.0 License.
Acknowledgments
Original model by MOSI.AI and the OpenMOSS Team. CPU Optimization and Hugging Face packaging by NEO.