metadata
language:
  - multilingual
license: other
library_name: transformers
tags:
  - text-to-speech
  - tts
  - voice-cloning
  - multilingual
  - zero-shot
  - audio
  - speech
datasets:
  - multilingual-speech
metrics:
  - mos
pipeline_tag: text-to-speech

Sonus

A massively multilingual zero-shot text-to-speech synthesis system

Overview

Sonus is a multilingual zero-shot text-to-speech (TTS) system that supports over 600 languages. It uses a diffusion language model-style architecture with 0.6B parameters, generates speech at a 24 kHz sampling rate, and is optimized for fast inference. It supports both voice cloning from short reference audio and voice design from speaker attributes.

Key Features

  • 600+ Languages Supported: Broad language coverage for zero-shot TTS
  • Voice Cloning: High-quality voice cloning from short reference audio
  • Voice Design: Control voices via speaker attributes (gender, age, pitch, accent, etc.)
  • Fine-grained Control: Support for non-verbal symbols and pronunciation correction
  • Fast Inference: Optimized for real-time and batch processing

Installation

pip install torch torchaudio
pip install transformers
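
After installing, it can be useful to confirm that the packages are importable before loading the model. The following is a minimal sketch using only the Python standard library; the helper name `check_environment` is hypothetical and not part of Sonus or transformers.

```python
import importlib.util


def check_environment(packages=("torch", "torchaudio", "transformers")):
    """Return a dict mapping each package name to True if it can be imported."""
    # find_spec locates a top-level package without importing it,
    # so this check stays fast even for large libraries such as torch.
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    for name, installed in check_environment().items():
        print(f"{name}: {'ok' if installed else 'missing'}")
```

Any package reported as missing can be installed with the `pip install` commands above.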

Quick Start

Basic Usage

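import torch
from transformers import AutoModel, AutoTokenizer

# trust_remote_code=True lets transformers run the model code shipped in the repository
model = AutoModel.from_pretrained("cortexsgea/sonus", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("cortexsgea/sonus", trust_remote_code=True)

# Move the model to the GPU when one is available; otherwise stay on the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Generate speech
text = "Hello, this is a test of voice synthesis."
# See documentation for full generation API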

Voice Cloning

# Provide reference audio for voice cloning
# See API documentation for complete examples
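
The README does not state what format the reference audio must have, so the sketch below only inspects a WAV file before it is passed to the model. It relies on the standard-library `wave` module. The helper names, the preference for mono audio, and the duration bounds are all illustrative assumptions, not documented Sonus requirements.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        duration = wav.getnframes() / sample_rate
    return channels, sample_rate, duration


def check_reference_clip(path, min_seconds=3.0, max_seconds=30.0):
    """Return a list of warnings; an empty list means the clip passed these checks.

    The mono preference and the duration bounds are assumptions for illustration.
    """
    channels, _sample_rate, duration = describe_wav(path)
    warnings = []
    if channels != 1:
        warnings.append(f"expected mono audio, found {channels} channels")
    if not min_seconds <= duration <= max_seconds:
        warnings.append(
            f"duration {duration:.1f}s is outside [{min_seconds}, {max_seconds}]s"
        )
    return warnings
```

Calling `check_reference_clip("reference.wav")` on a clip (the file name is a placeholder) returns human-readable warnings that can be logged or shown before cloning is attempted.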

Model Specifications

  • Architecture: Diffusion language model-style
  • Parameters: 0.6B
  • Sampling Rate: 24 kHz
  • Languages: 600+
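
Because the output sampling rate is 24 kHz, generated audio should be saved with that rate in the file header, or it will play back at the wrong speed. The sketch below writes mono float samples to a 16-bit PCM WAV file using only the standard library. It assumes the waveform is available as a sequence of floats in [-1.0, 1.0], which this README does not confirm; a synthetic tone stands in for model output, and the helper names are hypothetical.

```python
import array
import math
import wave

SAMPLE_RATE_HZ = 24_000  # taken from the Model Specifications list


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to 16-bit signed PCM, clipping out-of-range values."""
    pcm = array.array("h")
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))
        pcm.append(round(clipped * 32767))
    return pcm


def save_wav(path, samples, sample_rate=SAMPLE_RATE_HZ):
    """Write mono float samples to a 16-bit PCM WAV file."""
    pcm = float_to_pcm16(samples)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm.tobytes())


# Stand-in for model output: one second of a 440 Hz tone at half amplitude.
tone = [
    0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE_HZ)
    for n in range(SAMPLE_RATE_HZ)
]
```

Calling `save_wav("tone.wav", tone)` produces a one-second file; replacing `tone` with real model output would work the same way, provided the output is first converted to a flat list of floats.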

License

This project is available under a custom license.

  • Non-commercial use: Free for personal projects, research, and educational purposes
  • Commercial use: Requires explicit permission. Contact inquiry@sagea.space for licensing.

See LICENSE file for full terms.

Disclaimer

Users are prohibited from using this model for unauthorized voice cloning, impersonation, fraud, or any illegal activities. Ensure compliance with applicable laws and ethical standards.