language:
  - multilingual
license: other
library_name: transformers
tags:
  - text-to-speech
  - tts
  - voice-cloning
  - multilingual
  - zero-shot
  - audio
  - speech
datasets:
  - multilingual-speech
metrics:
  - mos
pipeline_tag: text-to-speech

Sonus

A massively multilingual zero-shot text-to-speech synthesis system

Overview

Sonus is a multilingual zero-shot text-to-speech (TTS) system that covers more than 600 languages. It uses a diffusion language model-style architecture with 0.6B parameters, generates 24 kHz speech, and is optimized for fast inference. It supports both voice cloning from short reference audio and voice design through speaker attributes.

Key Features

600+ Languages Supported: Broad language coverage for zero-shot TTS
Voice Cloning: High-quality voice cloning from short reference audio
Voice Design: Control voices via speaker attributes (gender, age, pitch, accent, etc.)
Fine-grained Control: Support for non-verbal symbols and pronunciation correction
Fast Inference: Optimized for real-time and batch processing

Installation

pip install torch torchaudio
pip install transformers
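
To confirm the installation before loading the model, a short check like the one below can help. It is a minimal sketch: the helper name `check_environment` is made up for this example, and it only tests whether the three packages can be found, not whether their versions are compatible with Sonus.

```python
import importlib.util

# Packages named in the installation step above
REQUIRED = ("torch", "torchaudio", "transformers")


def check_environment(packages=REQUIRED):
    """Return a mapping of package name -> True if the package can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    status = check_environment()
    for name, found in status.items():
        print(f"{name}: {'found' if found else 'MISSING'}")
    # Only import torch if it is present, then report GPU availability
    if status["torch"]:
        import torch
        print("CUDA available:", torch.cuda.is_available())
```

If any package is reported as missing, rerun the corresponding `pip install` command.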

Quick Start

Basic Usage

from transformers import AutoModel, AutoTokenizer
import torch

model = AutoModel.from_pretrained("cortexsgea/sonus", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("cortexsgea/sonus", trust_remote_code=True)

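# Move the model to the GPU when one is available, otherwise stay on CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)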

# Generate speech
text = "Hello, this is a test of voice synthesis."
# See documentation for full generation API
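
Once a waveform has been generated, it needs to be written out at the model's 24 kHz sampling rate. The generation call itself is not documented in this README, so the sketch below uses a placeholder sine tone in place of real model output. It assumes the output has already been converted to a flat sequence of floats in the range [-1, 1]; `placeholder_tone` and `save_wav` are illustrative names, not part of the Sonus API.

```python
import math
import struct
import wave

SAMPLE_RATE = 24_000  # Sonus sampling rate, per the model specifications


def placeholder_tone(duration_s=1.0, freq_hz=440.0, sample_rate=SAMPLE_RATE):
    """Stand-in for model output: a sine tone as floats in [-1, 1]."""
    n = int(duration_s * sample_rate)
    return [0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def save_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    pcm = b"".join(
        # Clip to [-1, 1] before scaling to the signed 16-bit range
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
        for s in samples
    )
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)       # mono
        wf.setsampwidth(2)       # 2 bytes = 16-bit samples
        wf.setframerate(sample_rate)
        wf.writeframes(pcm)


if __name__ == "__main__":
    save_wav("sonus_placeholder.wav", placeholder_tone())
```

Writing the file with a different frame rate than the one the audio was generated at would change its playback speed and pitch, which is why the rate is fixed to 24 kHz here.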

Voice Cloning

# Provide reference audio for voice cloning
# See API documentation for complete examples

Model Specifications

Architecture: Diffusion language model-style
Parameters: 0.6B
Sampling Rate: 24 kHz
Languages: 600+
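
These figures allow a rough resource estimate. The sketch below computes the memory needed for the weights alone at common precisions and the number of audio samples for a given duration. It treats 0.6B as exactly 600 million parameters and ignores activations, caches, and framework overhead, so real memory use will be higher.

```python
PARAMS = 0.6e9          # parameter count from the specifications (approximate)
SAMPLE_RATE = 24_000    # output sampling rate in Hz

BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gb(dtype, params=PARAMS):
    """Memory for the weights only, in GB (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


def num_samples(duration_s, sample_rate=SAMPLE_RATE):
    """Number of audio samples in a clip of the given duration."""
    return int(round(duration_s * sample_rate))


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_memory_gb(dtype):.1f} GB for weights")
    print("10 s of audio:", num_samples(10), "samples")
```

Under these assumptions the weights take about 2.4 GB in float32 and about 1.2 GB in half precision, and ten seconds of speech is 240,000 samples.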

License

This project is available under a custom license.

Non-commercial use: Free for personal projects, research, and educational purposes
Commercial use: Requires explicit permission. Contact inquiry@sagea.space for licensing.

See LICENSE file for full terms.

Disclaimer

Users are prohibited from using this model for unauthorized voice cloning, impersonation, fraud, or any illegal activities. Ensure compliance with applicable laws and ethical standards.