---
language:
- multilingual
license: other
library_name: transformers
tags:
- text-to-speech
- tts
- voice-cloning
- multilingual
- zero-shot
- audio
- speech
datasets:
- multilingual-speech
metrics:
- mos
pipeline_tag: text-to-speech
---
# Sonus

A massively multilingual zero-shot text-to-speech synthesis system

## Overview

Sonus is a multilingual zero-shot text-to-speech (TTS) system that supports over 600 languages. Built on a diffusion language model-style architecture with 0.6B parameters, it generates high-quality speech with fast inference and offers both voice cloning and voice design.

## Key Features

- **600+ Languages Supported**: Broad language coverage for zero-shot TTS
- **Voice Cloning**: High-quality voice cloning from short reference audio
- **Voice Design**: Control voices via speaker attributes (gender, age, pitch, accent, etc.)
- **Fine-grained Control**: Support for non-verbal symbols and pronunciation correction
- **Fast Inference**: Optimized for real-time and batch processing

## Installation

```bash
pip install torch torchaudio
pip install transformers
```

## Quick Start

### Basic Usage

```python
import torch
from transformers import AutoModel, AutoTokenizer

# trust_remote_code=True executes custom model code shipped with the repository
model = AutoModel.from_pretrained("cortexsgea/sonus", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("cortexsgea/sonus", trust_remote_code=True)

# Use the GPU when one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Generate speech
text = "Hello, this is a test of voice synthesis."
# See documentation for full generation API
```

### Voice Cloning

```python
# Provide reference audio for voice cloning
# See API documentation for complete examples
```
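
Before using a clip as a cloning reference, it can help to check its basic properties. The sketch below relies only on Python's standard `wave` module. `describe_reference` is a hypothetical helper and not part of the Sonus API, and the idea that a mono clip at the model's 24 kHz rate is preferred is an assumption rather than a documented requirement.

```python
import wave

TARGET_SAMPLE_RATE = 24_000  # Sonus's listed sampling rate (24 kHz)


def describe_reference(path):
    """Return basic properties of a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.getnframes()
    return {
        "sample_rate": sample_rate,
        "channels": channels,
        "duration_s": frames / sample_rate,
        # Assumption: the model prefers mono audio at its own sampling rate.
        "needs_resample": sample_rate != TARGET_SAMPLE_RATE,
        "needs_downmix": channels != 1,
    }
```

Calling `describe_reference("reference.wav")` on a stereo 16 kHz file, for instance, would flag both `needs_resample` and `needs_downmix`, so the clip could be converted before it is handed to the model.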

## Model Specifications

- **Architecture**: Diffusion language model-style
- **Parameters**: 0.6B
- **Sampling Rate**: 24 kHz
- **Languages**: 600+
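
As a minimal sketch of what the 24 kHz sampling rate means in practice, the helper below writes a sequence of samples to a WAV file using only the standard library. `save_wav` is a hypothetical name, and it assumes the generated audio is available as mono float samples in the range [-1.0, 1.0]; the actual output format of the model may differ.

```python
import struct
import wave

SAMPLE_RATE = 24_000  # Sonus's listed sampling rate (24 kHz)


def save_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    # Clip to the valid range, then scale to signed 16-bit integers.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return len(pcm) / sample_rate      # duration in seconds
```

At this rate, one second of audio corresponds to 24,000 samples, so the returned duration is simply the sample count divided by `sample_rate`.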

## License

This project is available under a custom license.

- **Non-commercial use**: Free for personal projects, research, and educational purposes
- **Commercial use**: Requires explicit permission. Contact inquiry@sagea.space to discuss licensing

See the LICENSE file for full terms.

## Disclaimer

Users are prohibited from using this model for unauthorized voice cloning, impersonation, fraud, or any illegal activities. Ensure compliance with applicable laws and ethical standards.