
	---
	# Generated at 2026-02-02T09:00:22Z from templates/weights/README.md.j2
	license: mit
	language:
	- eng
	- zho
	- jpn
	tags:
	- tts
	- text-to-speech
	- speech-synthesis
	- voice-cloning
	library_name: ttsdb
	pipeline_tag: text-to-speech
	base_model:
	- lj1995/GPT-SoVITS

	---

	# GPT-SoVITS

	> This is a mirror of the original weights for use with [TTSDB](https://github.com/ttsds/ttsdb).
	>
	> Original weights: [https://huggingface.co/lj1995/GPT-SoVITS](https://huggingface.co/lj1995/GPT-SoVITS)
	>
	> Original code: [https://github.com/RVC-Boss/GPT-SoVITS](https://github.com/RVC-Boss/GPT-SoVITS)


	GPT-SoVITS is a few-shot voice conversion and text-to-speech system by [RVC-Boss](https://github.com/RVC-Boss).
	It supports zero-shot TTS from a reference clip and its transcript, few-shot voice cloning from as little as
	1 minute of training data, and cross-lingual synthesis.



	## Original Work

	This model was created by the original authors. Please cite their work if you use this model:


	```bibtex
	@misc{RVCBoss2024,
	  author = {RVC-Boss},
	  title = {GPT-SoVITS: 1 min voice data can also be used to train a good TTS model},
	  year = {2024},
	  publisher = {GitHub},
	  journal = {GitHub repository},
	  howpublished = {\url{https://github.com/RVC-Boss/GPT-SoVITS}},
	}
	```




	## Installation

	```bash
	pip install ttsdb-gpt-sovits
	```

	## Usage

	```python
	from ttsdb_gpt_sovits import GPTSoVITS

	# Load the model (downloads weights automatically)
	model = GPTSoVITS(model_id="ttsds/gpt-sovits")

	# Synthesize speech
	audio, sample_rate = model.synthesize(
	    text="Hello, this is a test of GPT-SoVITS.",
	    reference_audio="path/to/reference.wav",
	    text_reference="Transcript of the reference audio.",
	    language="eng",
	)

	# Save the output
	model.save_audio(audio, sample_rate, "output.wav")
	```
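
The reference clip is passed as a file path, so it can be useful to inspect it before synthesis. The sketch below uses only Python's standard `wave` module to report the sample rate, channel count, and duration of a PCM WAV file. The helper name `describe_wav` is hypothetical and not part of `ttsdb_gpt_sovits`. This card does not say whether the wrapper resamples or trims the reference internally, so treat this as an optional sanity check rather than a required step.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file.

    Hypothetical helper for illustration; not part of ttsdb_gpt_sovits.
    """
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        channels = wav_file.getnchannels()
        frames = wav_file.getnframes()
    # Duration is the frame count divided by frames per second.
    return sample_rate, channels, frames / sample_rate
```

Calling `describe_wav("path/to/reference.wav")` on the clip from the usage example returns the three values as a tuple, which makes it easy to spot an unexpectedly long, stereo, or low-rate reference before running the model.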

	## Model Details

	| Property | Value |
	|----------|-------|
	| Sample Rate | 32000 Hz |
	| Parameters | 167M |
	| Architecture | Autoregressive, Non-Autoregressive, GPT, VITS |
	| Languages | English, Chinese, Japanese |
	| Release Date | 2024-01-16 |
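
The 32000 Hz sample rate in the table determines how sample counts map to time and what goes into a WAV header. As a minimal sketch, assuming the synthesized audio is a one-dimensional sequence of floats in [-1, 1] (this card does not specify the return type of `synthesize`), the hypothetical helpers below compute a clip's duration and write 16-bit mono PCM using only the standard library.

```python
import struct
import wave

SAMPLE_RATE_HZ = 32000  # value taken from the Model Details table


def duration_seconds(num_samples, sample_rate=SAMPLE_RATE_HZ):
    """Length in seconds of a mono clip with the given number of samples."""
    return num_samples / sample_rate


def write_pcm16_wav(path, samples, sample_rate=SAMPLE_RATE_HZ):
    """Write float samples (assumed to lie in [-1, 1]) as 16-bit mono PCM.

    Hypothetical helper for illustration; not part of ttsdb_gpt_sovits.
    """
    values = []
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))  # clamp out-of-range values
        values.append(int(round(clipped * 32767)))
    # WAV stores 16-bit samples as little-endian signed integers.
    frames = struct.pack("<%dh" % len(values), *values)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(frames)
```

For example, `duration_seconds(64000)` gives 2.0 seconds at the default rate. The package's own `save_audio` call from the usage example remains the simpler route; this sketch only makes the sample-rate bookkeeping explicit.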


	### Training Data


	- Internal dataset (2000 hours)




	## License

	- Weights: MIT License
	- Code: MIT License

	Please refer to the original repositories for full license terms.

	## Links

	- Original Code: [https://github.com/RVC-Boss/GPT-SoVITS](https://github.com/RVC-Boss/GPT-SoVITS)
	- Original Weights: [https://huggingface.co/lj1995/GPT-SoVITS](https://huggingface.co/lj1995/GPT-SoVITS)
	- TTSDB Package: [ttsdb-gpt-sovits](https://pypi.org/project/ttsdb-gpt-sovits/)
	- TTSDB GitHub: [https://github.com/ttsds/ttsdb](https://github.com/ttsds/ttsdb)