VoidShine / atri-sovits

	---
	license: agpl-3.0
	tags:
	- tts
	- text-to-speech
	- gpt-sovits
	- voice-clone
	- japanese
	language:
	- ja
	---

	# ATRI Voice Model — GPT-SoVITS v2Pro

	> WARNING: This model is for personal and research use only. Do not use it for commercial purposes or to impersonate real individuals.

	---

	### Overview

	A fine-tuned [GPT-SoVITS](https://github.com/RVC-Boss/GPT-SoVITS) v2Pro voice model for ATRI (from ATRI -My Dear Moments-), capable of synthesizing speech in Japanese, Chinese, and English.

	### Files

	- `ATR_e8_s3952.pth` — Fine-tuned SoVITS model weights (8 epochs, 3952 steps)
	- `ref_audio.wav` — Reference audio for inference
	- `api_atri.py` — FastAPI-based TTS inference server

	### Usage

	1. Clone and set up [GPT-SoVITS](https://github.com/RVC-Boss/GPT-SoVITS) following its instructions.
	2. Download the pretrained GPT model `s1v3.ckpt`, which is included with the GPT-SoVITS pretrained models.
	3. Place `ATR_e8_s3952.pth` and `ref_audio.wav` in your preferred location.
	4. Update the paths in `api_atri.py` (replace `/path/to/` placeholders with actual paths).
	5. Run the API server:

	```bash
	cd /path/to/GPT-SoVITS
	python api_atri.py
	```

	Once the server is running, the API docs are available at `http://127.0.0.1:9880/docs`.
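
Step 4 is easy to get partly wrong, since a single leftover `/path/to/` placeholder will only surface when the server tries to load a file. The following is a minimal sketch of a pre-flight check, not part of this repository: the helper name `find_placeholders` is hypothetical, and it assumes only that unedited paths in `api_atri.py` still contain the literal string `/path/to/`.

```python
from pathlib import Path

# Literal marker used for unedited paths, per step 4 above.
PLACEHOLDER = "/path/to/"


def find_placeholders(source: str, marker: str = PLACEHOLDER) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for each line that still contains the marker.

    Hypothetical helper for illustration; line numbers are 1-based.
    """
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if marker in line
    ]


if __name__ == "__main__":
    # Assumes the script is in the current directory; adjust as needed.
    script = Path("api_atri.py")
    if script.exists():
        for number, line in find_placeholders(script.read_text(encoding="utf-8")):
            print(f"{script}:{number}: {line}")
```

Running it from the directory that contains `api_atri.py` prints every line that still needs editing; no output means no placeholder remains.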

	### API Endpoints

	| Endpoint | Method | Description |
	|---|---|---|
	| `/health` | GET | Health check |
	| `/tts` | POST | Text-to-speech (returns full audio) |
	| `/tts/stream` | POST | Streaming text-to-speech |
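
The endpoints above can be called from any HTTP client. Below is a rough sketch using only the Python standard library. The paths, methods, and default address come from this README; the JSON field names `text` and `text_lang` are assumptions, since the request schema is not documented here. Check `http://127.0.0.1:9880/docs` for the actual schema before relying on them. The function names are hypothetical as well.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:9880"  # default address from the Usage section


def health_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET request for the /health endpoint."""
    return urllib.request.Request(f"{base_url}/health", method="GET")


def tts_request(text: str, text_lang: str = "ja", stream: bool = False,
                base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for /tts, or /tts/stream when stream=True.

    ASSUMPTION: the body is JSON with "text" and "text_lang" fields.
    These names are illustrative and may not match api_atri.py.
    """
    payload = json.dumps({"text": text, "text_lang": text_lang}).encode("utf-8")
    path = "/tts/stream" if stream else "/tts"
    return urllib.request.Request(
        f"{base_url}{path}",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_audio(request: urllib.request.Request, out_path: str,
               chunk_size: int = 4096) -> int:
    """Send the request and write the response body to out_path in chunks.

    Returns the number of bytes written. Reading in chunks works for both
    the full-audio and the streaming endpoint.
    """
    written = 0
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        while chunk := response.read(chunk_size):
            out.write(chunk)
            written += len(chunk)
    return written
```

With the server running, `save_audio(tts_request("こんにちは"), "out.wav")` would write the returned audio to disk. The `.wav` extension is a guess; the actual audio format depends on what the server returns.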

	### Reference Audio

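	- Text: わたしはマスターの所有物ですので。勝手に売買するのは違法です
	- Translation (for reference only, not part of the prompt text): "I am my master's property, so buying or selling me without permission is illegal."
	- Language: Japanese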
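
Before wiring `ref_audio.wav` into the server, it can help to confirm that the file is readable and to see its basic properties. This is a small sketch using the standard-library `wave` module, which handles only uncompressed PCM WAV; the helper name `wav_info` is hypothetical, and nothing here states what properties the server requires.

```python
import wave


def wav_info(source) -> dict:
    """Return channel count, sample rate, and duration in seconds of a PCM WAV.

    `source` may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_s": frames / rate,
        }
```

For example, `wav_info("/path/to/ref_audio.wav")` returns a dictionary that can be checked by eye or logged at server start-up.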

	### License

	This project is licensed under [AGPL-3.0](LICENSE), consistent with [GPT-SoVITS](https://github.com/RVC-Boss/GPT-SoVITS).