	---
	language:
	- multilingual
	license: other
	library_name: transformers
	tags:
	- text-to-speech
	- tts
	- voice-cloning
	- multilingual
	- zero-shot
	- audio
	- speech
	datasets:
	- multilingual-speech
	metrics:
	- mos
	pipeline_tag: text-to-speech
	---
	# Sonus

	A massively multilingual zero-shot text-to-speech synthesis system

	## Overview

	Sonus is a multilingual zero-shot text-to-speech (TTS) system that supports over 600 languages. It uses a diffusion language model-style architecture with 0.6B parameters, generates 24 kHz audio, and supports both voice cloning and voice design.

	## Key Features

	- 600+ Languages Supported: Broad language coverage for zero-shot TTS
	- Voice Cloning: High-quality voice cloning from short reference audio
	- Voice Design: Control voices via speaker attributes (gender, age, pitch, accent, etc.)
	- Fine-grained Control: Support for non-verbal symbols and pronunciation correction
	- Fast Inference: Optimized for real-time and batch processing

	## Installation

	```bash
	pip install torch torchaudio
	pip install transformers
	```
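After installing, it can help to confirm which versions of the three packages are actually present in the active environment. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of Sonus or transformers.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    required = ["torch", "torchaudio", "transformers"]
    for name, version in installed_versions(required).items():
        print(f"{name}: {version or 'not installed'}")
```

This card does not state minimum versions, so the check only reports what is installed rather than enforcing a requirement.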

	## Quick Start

	### Basic Usage

	```python
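	from transformers import AutoModel, AutoTokenizer
	import torch

	# trust_remote_code=True runs Python code shipped in the model repository
	model = AutoModel.from_pretrained("cortexsgea/sonus", trust_remote_code=True)
	tokenizer = AutoTokenizer.from_pretrained("cortexsgea/sonus", trust_remote_code=True)

	# Move the model to the GPU if one is available, otherwise stay on the CPU
	device = "cuda" if torch.cuda.is_available() else "cpu"
	model = model.to(device).eval()

	# Generate speech
	text = "Hello, this is a test of voice synthesis."
	# See documentation for full generation API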
	```
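The generation call itself is not documented in this card, so the sketch below covers only the step after it: saving a waveform to disk at the model's 24 kHz rate. It assumes the output can be converted to a flat sequence of mono float samples in [-1.0, 1.0], which this card does not confirm. The helper `write_wav` is hypothetical, and a synthetic tone stands in for real model output.

```python
import math
import os
import struct
import tempfile
import wave

SAMPLE_RATE = 24_000  # sampling rate listed under Model Specifications


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1.0, 1.0] as 16-bit PCM WAV."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overflow
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


# Stand-in for model output: 0.5 s of a 440 Hz tone (not produced by Sonus)
tone = [
    0.3 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
    for n in range(SAMPLE_RATE // 2)
]
out_path = os.path.join(tempfile.gettempdir(), "sonus_demo.wav")
write_wav(out_path, tone)
```

If the model returns a `torch.Tensor`, something like `tensor.flatten().tolist()` would produce a compatible input, again assuming a single mono waveform.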

	### Voice Cloning

	```python
	# Provide reference audio for voice cloning
	# See API documentation for complete examples
	```
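Before passing a reference clip to the model, it can be useful to inspect its basic properties. The sketch below reads a PCM WAV file with the standard library and reports its sample rate, channel count, and duration. The helper `describe_reference` is hypothetical, and the two `needs_*` flags rest on an assumption that the model expects mono 24 kHz input, which this card does not state.

```python
import wave

TARGET_RATE = 24_000  # model sampling rate from Model Specifications


def describe_reference(path, target_rate=TARGET_RATE):
    """Return basic properties of a PCM WAV reference clip."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.getnframes()
    return {
        "sample_rate": rate,
        "channels": channels,
        "duration_s": frames / rate,
        # Assumption: reference audio should match the 24 kHz model rate
        "needs_resample": rate != target_rate,
        # Assumption: the model expects a single (mono) channel
        "needs_downmix": channels != 1,
    }
```

Calling `describe_reference("reference.wav")` on a local clip returns a dictionary that can guide preprocessing, for example resampling with `torchaudio` before the clip is handed to the model.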

	## Model Specifications

	- Architecture: Diffusion language model-style
	- Parameters: 0.6B
	- Sampling Rate: 24 kHz
	- Languages: 600+
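The parameter count gives a rough lower bound on the memory needed to hold the weights. The sketch below works through that arithmetic for common dtypes. The dtype of the published checkpoint is not stated in this card, so the dtypes shown are illustrative, and the figures exclude activations and any auxiliary components.

```python
PARAMS = 0.6e9  # parameter count from the list above

BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gb(num_params, dtype):
    """Weights-only footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gb(PARAMS, dtype):.1f} GB")
# float32: 2.4 GB
# float16: 1.2 GB
# bfloat16: 1.2 GB
```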

	## License

	This project is available under a custom license.

	- Non-commercial use: Free for personal projects, research, and educational purposes
	- Commercial use: Requires explicit permission. Contact inquiry@sagea.space for licensing inquiries

	See the LICENSE file for full terms.

	## Disclaimer

	Users are prohibited from using this model for unauthorized voice cloning, impersonation, fraud, or any illegal activities. Ensure compliance with applicable laws and ethical standards.