---
language:
- multilingual
license: other
library_name: transformers
tags:
- text-to-speech
- tts
- voice-cloning
- multilingual
- zero-shot
- audio
- speech
datasets:
- multilingual-speech
metrics:
- mos
pipeline_tag: text-to-speech
---
# Sonus
A massively multilingual zero-shot text-to-speech synthesis system
## Overview
Sonus is a multilingual zero-shot text-to-speech (TTS) system that supports over 600 languages. It is built on a diffusion language model-style architecture with 0.6B parameters, is designed for fast inference, and supports voice cloning from reference audio as well as voice design from speaker attributes.
## Key Features
- **600+ Languages Supported**: Broad language coverage for zero-shot TTS
- **Voice Cloning**: High-quality voice cloning from short reference audio
- **Voice Design**: Control voices via speaker attributes (gender, age, pitch, accent, etc.)
- **Fine-grained Control**: Support for non-verbal symbols and pronunciation correction
- **Fast Inference**: Optimized for real-time and batch processing
## Installation
```bash
pip install torch torchaudio
pip install transformers
```
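To confirm that the packages above are importable before loading the model, a check along these lines can help. This is a minimal sketch that uses only the Python standard library; the helper name `missing_packages` is illustrative and not part of Sonus.

```python
import importlib.util

# Packages named in the installation commands above
REQUIRED = ("torch", "torchaudio", "transformers")


def missing_packages(names=REQUIRED):
    """Return the top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```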
## Quick Start
### Basic Usage
```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "cortexsgea/sonus"

# trust_remote_code=True runs the model code shipped in the repository
model = AutoModel.from_pretrained(MODEL_ID, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

# Use the GPU when one is available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Text to synthesize
text = "Hello, this is a test of voice synthesis."
# See documentation for full generation API
```
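The generation call itself is not shown in this README. Once a waveform is available, it can be written to disk at the model's 24 kHz sampling rate. The sketch below assumes the output has been converted to a flat sequence of mono float samples in the range [-1, 1]; that output format is an assumption, and `save_wav` is an illustrative helper rather than part of the Sonus API.

```python
import struct
import wave

SAMPLE_RATE = 24_000  # 24 kHz, from the model specifications


def save_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1, 1] to a 16-bit PCM WAV file."""
    pcm = bytearray()
    for sample in samples:
        # Clip out-of-range values before converting to 16-bit integers
        clipped = max(-1.0, min(1.0, float(sample)))
        pcm += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(pcm))
```

If the model returns a `torch.Tensor`, `waveform.flatten().tolist()` produces a list that this helper accepts.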
### Voice Cloning
```python
# Provide reference audio for voice cloning
# See API documentation for complete examples
```
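Before passing a reference clip to the model, it can be useful to check its basic properties. The sketch below assumes the reference is an uncompressed PCM WAV file, which is what the standard-library `wave` module reads. This README does not state the sampling rate or duration Sonus expects for reference audio, so the helper (a hypothetical `describe_reference`) only reports what it finds.

```python
import wave


def describe_reference(path):
    """Report sampling rate, channel count, and duration of a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        rate = wav_file.getframerate()
        frames = wav_file.getnframes()
        return {
            "sample_rate": rate,
            "channels": wav_file.getnchannels(),
            "duration_s": frames / rate,
        }
```

For example, `describe_reference("reference.wav")` on a two-second, 16 kHz mono clip returns a sampling rate of 16000, one channel, and a duration of 2.0 seconds; `reference.wav` is a placeholder file name.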
## Model Specifications
- **Architecture**: Diffusion language model-style
- **Parameters**: 0.6B
- **Sampling Rate**: 24 kHz
- **Languages**: 600+
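As a rough guide to what these figures imply, the arithmetic below estimates the memory needed for the weights alone and the number of audio samples produced per clip. It is a back-of-the-envelope sketch: "0.6B" is a rounded figure, and the estimate ignores activations, buffers, and framework overhead.

```python
PARAMS = 600_000_000   # 0.6B parameters (rounded)
SAMPLE_RATE = 24_000   # 24 kHz output


def weight_gigabytes(num_params, bytes_per_param):
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def num_samples(seconds, sample_rate=SAMPLE_RATE):
    """Number of audio samples in a clip of the given duration."""
    return int(round(seconds * sample_rate))


if __name__ == "__main__":
    print(f"float32 weights: ~{weight_gigabytes(PARAMS, 4):.1f} GB")
    print(f"float16 weights: ~{weight_gigabytes(PARAMS, 2):.1f} GB")
    print(f"10 s of audio: {num_samples(10)} samples")
```

Under these assumptions the weights take roughly 2.4 GB in 32-bit floats and 1.2 GB in 16-bit floats, and ten seconds of audio corresponds to 240,000 samples.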
## License
This project is available under a custom license.
- **Non-commercial use**: Free for personal projects, research, and educational purposes
- **Commercial use**: Requires explicit permission. Contact inquiry@sagea.space for licensing.

See the LICENSE file for full terms.
## Disclaimer
Do not use this model for unauthorized voice cloning, impersonation, fraud, or any other illegal activity. Users are responsible for complying with applicable laws and ethical standards.