metadata
license: agpl-3.0
tags:
  - tts
  - text-to-speech
  - gpt-sovits
  - voice-clone
  - japanese
language:
  - ja

ATRI Voice Model — GPT-SoVITS v2Pro

WARNING: This model is for personal and research use only. Do not use it for commercial purposes or to impersonate real individuals.


Overview

A fine-tuned GPT-SoVITS v2Pro voice model for ATRI (from ATRI -My Dear Moments-), capable of synthesizing speech in Japanese, Chinese, and English.

Files

  • ATR_e8_s3952.pth — Fine-tuned SoVITS model weights (8 epochs, 3952 steps)
  • ref_audio.wav — Reference audio for inference
  • api_atri.py — FastAPI-based TTS inference server

Usage

  1. Clone and set up GPT-SoVITS following its instructions.
  2. Obtain the pretrained GPT model s1v3.ckpt, which is included in the GPT-SoVITS pretrained models.
  3. Place ATR_e8_s3952.pth and ref_audio.wav in your preferred location.
  4. Update the paths in api_atri.py (replace /path/to/ placeholders with actual paths).
  5. Run the API server:
     cd /path/to/GPT-SoVITS
     python api_atri.py

API docs will be available at http://127.0.0.1:9880/docs.
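
Before starting the server, it can help to confirm that the files referenced in steps 2 to 4 actually exist at the paths you configured. The following is a minimal sketch, not part of api_atri.py: the `REQUIRED_FILES` mapping and the `find_missing` helper are hypothetical names, and the `/path/to/` locations are placeholders to replace with your own.

```python
from pathlib import Path

# ASSUMPTION: placeholder locations; substitute the paths you set in api_atri.py.
REQUIRED_FILES = {
    "SoVITS weights": Path("/path/to/ATR_e8_s3952.pth"),
    "GPT weights": Path("/path/to/s1v3.ckpt"),
    "reference audio": Path("/path/to/ref_audio.wav"),
}


def find_missing(files: dict[str, Path]) -> list[str]:
    """Return the labels of entries whose path is not an existing file."""
    return [label for label, path in files.items() if not path.is_file()]
```

Calling `find_missing(REQUIRED_FILES)` returns an empty list when every file is in place; any labels it returns point at paths that still need fixing.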

API Endpoints

| Endpoint    | Method | Description                         |
|-------------|--------|-------------------------------------|
| /health     | GET    | Health check                        |
| /tts        | POST   | Text-to-speech (returns full audio) |
| /tts/stream | POST   | Streaming text-to-speech            |
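
A client for these endpoints might look like the sketch below, which uses only the Python standard library. The request body schema is not documented here, so the JSON field names `text` and `text_lang` are assumptions made for illustration; check the generated docs at http://127.0.0.1:9880/docs for the real schema. The helper names (`health_request`, `tts_request`, `synthesize_to_file`) are likewise hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:9880"  # host and port taken from the API docs URL


def health_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET request for the /health endpoint."""
    return urllib.request.Request(f"{base_url}/health", method="GET")


def tts_request(text: str, text_lang: str = "ja", base_url: str = BASE_URL,
                stream: bool = False) -> urllib.request.Request:
    """Build a POST request for /tts, or /tts/stream when stream=True.

    ASSUMPTION: the JSON field names are hypothetical, not the confirmed schema.
    """
    path = "/tts/stream" if stream else "/tts"
    body = json.dumps({"text": text, "text_lang": text_lang}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text: str, out_path: str, text_lang: str = "ja") -> int:
    """Send text to /tts, save the returned bytes, and return the byte count.

    ASSUMPTION: the response body is the complete audio file.
    """
    with urllib.request.urlopen(tts_request(text, text_lang), timeout=120) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return len(audio)
```

With the server running, `urllib.request.urlopen(health_request(), timeout=5)` performs the health check, and `synthesize_to_file("こんにちは", "out.wav")` writes the synthesized audio to disk. Building the requests separately from sending them keeps the payload easy to inspect and adjust once the real schema is known.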

Reference Audio

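  • Text: わたしはマスターの所有物ですので。 勝手に売買するのは違法です
  • Text (English gloss): "I am my Master's property, so buying or selling me without permission is illegal."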
  • Language: Japanese

License

This project is licensed under AGPL-3.0, consistent with GPT-SoVITS.