	## Overview

	This document describes how to use the Text-to-Speech API through GET and POST requests. The API converts text into speech in the voice of a specified character and supports multiple languages and emotions.

	## Character and Emotion List

	To list the supported characters and their emotions, send a request to the following endpoint:

	- URL: `http://127.0.0.1:5000/character_list`
	- Returns: A JSON object that maps each character to its list of supported emotions
	- Method: `GET`

	```
	{
	  "Hanabi": [
	    "default",
	    "Normal",
	    "Yandere"
	  ],
	  "Hutao": [
	    "default"
	  ]
	}
	```
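
A minimal Python sketch of querying this endpoint with the standard library, assuming the server is running locally at the address above and returns a JSON object shaped like the example. The helper names `parse_character_list`, `fetch_character_list`, and `has_emotion` are illustrative and not part of the API.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:5000"  # address used throughout this document


def parse_character_list(payload: str) -> dict:
    """Parse a /character_list response body into {character: [emotions]}."""
    data = json.loads(payload)
    return {name: list(emotions) for name, emotions in data.items()}


def fetch_character_list(base_url: str = BASE_URL) -> dict:
    """GET /character_list and parse the result (needs a running server)."""
    with urlopen(f"{base_url}/character_list", timeout=10) as response:
        return parse_character_list(response.read().decode("utf-8"))


def has_emotion(characters: dict, character: str, emotion: str) -> bool:
    """Check whether a character exists and lists the given emotion."""
    return emotion in characters.get(character, [])
```

Checking an emotion on the client before sending a request makes it visible when the server would otherwise fall back to the default emotion.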

	## Regarding Aliases

	Since version 2.2.4, an alias system is available. The allowed aliases are listed in `Inference/params_config.json`.

	## Text-to-Speech

	- URL: `http://127.0.0.1:5000/tts`
	- Returns: Audio on success; an error message on failure.
	- Method: `GET`/`POST`

	### GET Method

	#### Format

	```
	http://127.0.0.1:5000/tts?character={{characterName}}&text={{text}}
	```

	- Parameter explanation:
	  - `character`: The name of the character folder. Mind case sensitivity, full-width versus half-width characters, and language (Chinese/English).
	  - `text`: The text to convert. URL encoding is recommended.
	  - Optional parameters: `text_language`, `format`, `top_k`, `top_p`, `batch_size`, `speed`, `temperature`, `emotion`, `save_temp`, and `stream`. They are explained in the POST section below.
	- Since version 2.2.4, an alias system is available. The allowed aliases are listed in `Inference/params_config.json`.
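
The GET format can be assembled in Python as follows. This is a sketch that assumes the server accepts standard percent-encoded query strings; `build_tts_url` is a hypothetical helper name, not something the API provides.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://127.0.0.1:5000"  # address used throughout this document


def build_tts_url(character: str, text: str, base_url: str = BASE_URL, **optional) -> str:
    """Build a /tts GET URL with percent-encoded query parameters.

    Extra keyword arguments (for example text_language or speed) are
    appended as optional parameters; values set to None are skipped.
    """
    params = {"character": character, "text": text}
    params.update({key: value for key, value in optional.items() if value is not None})
    # quote_via=quote encodes spaces as %20 and non-ASCII text as UTF-8 bytes.
    return f"{base_url}/tts?{urlencode(params, quote_via=quote)}"


if __name__ == "__main__":
    print(build_tts_url("Hanabi", "Hello world", text_language="en"))
```

Encoding the text this way keeps spaces, punctuation, and non-ASCII characters from breaking the query string.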

	### POST Method

	#### JSON Package Format

	##### All Parameters

	```
	{
	  "method": "POST",
	  "body": {
	    "character": "${chaName}",
	    "emotion": "${Emotion}",
	    "text": "${speakText}",
	    "text_language": "${textLanguage}",
	    "batch_size": ${batch_size},
	    "speed": ${speed},
	    "top_k": ${topK},
	    "top_p": ${topP},
	    "temperature": ${temperature},
	    "stream": "${stream}",
	    "format": "${Format}",
	    "save_temp": "${saveTemp}"
	  }
	}
	```

	Any of these items can be omitted. Since version 2.2.4, an alias system is available; the allowed aliases are listed in `Inference/params_config.json`.

	##### Minimal Data

	```
	{
	  "method": "POST",
	  "body": {
	    "text": "${speakText}"
	  }
	}
	```
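
A sketch of sending such a request from Python with the standard library. It rests on an assumption the document does not confirm: that the contents of `body` are sent as the JSON request body, while `method` only names the HTTP verb. The helper names `build_tts_request` and `synthesize` are illustrative.

```python
import json
from urllib.request import Request, urlopen

TTS_URL = "http://127.0.0.1:5000/tts"  # endpoint from this document


def build_tts_request(text: str, url: str = TTS_URL, **optional) -> Request:
    """Build (but do not send) a POST request carrying the TTS parameters.

    Assumption: the server reads the parameters from a JSON request body.
    Keyword arguments set to None are left out, so the server defaults apply.
    """
    body = {"text": text}
    body.update({key: value for key, value in optional.items() if value is not None})
    return Request(
        url,
        data=json.dumps(body, ensure_ascii=False).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, out_path: str, **optional) -> None:
    """Send the request and write the returned audio bytes to out_path.

    Needs a running server; urlopen raises an HTTPError on a non-2xx status.
    """
    with urlopen(build_tts_request(text, **optional), timeout=120) as response:
        audio = response.read()
    with open(out_path, "wb") as audio_file:
        audio_file.write(audio)
```

For example, `synthesize("Hello", "hello.wav", character="Hanabi")` would store the returned audio in `hello.wav`, provided the server is reachable.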

	##### Parameter Explanation

	- `text`: The text to convert. URL encoding is recommended.
	- `character`: The name of the character folder. Mind case sensitivity, full-width versus half-width characters, and language.
	- `emotion`: The character's emotion. It must be an emotion the character actually supports; otherwise the default emotion is used.
	- `text_language`: The language of the text (`auto` / `zh` / `en` / `ja`). The default is mixed multilingual.
	- `top_k`, `top_p`, `temperature`: GPT model parameters. Leave them unchanged if you are unfamiliar with them.
	- `batch_size`: The number of batches processed at a time. Integer, default is 1. Increase it for faster processing on a powerful machine.
	- `speed`: Speech speed. Default is 1.0.
	- `save_temp`: Whether to save temporary files. When true, the backend saves the generated audio and returns it directly for subsequent identical requests. Default is false.
	- `stream`: Whether to stream the response. When true, audio is returned sentence by sentence. Default is false.
	- `format`: The output audio format: WAV, MP3, or OGG. Default is WAV.
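
The documented values can be checked on the client before a request is sent. The following is a minimal sketch, not the server's own validation: `check_tts_params` is a hypothetical helper, and comparing `format` case-insensitively is an assumption.

```python
ALLOWED_LANGUAGES = {"auto", "zh", "en", "ja"}  # values listed for text_language
ALLOWED_FORMATS = {"wav", "mp3", "ogg"}  # values listed for format


def check_tts_params(params: dict) -> list:
    """Return a list of problems found; an empty list means nothing was flagged."""
    problems = []
    if not params.get("text"):
        problems.append("text is required")

    language = params.get("text_language")
    if language is not None and language not in ALLOWED_LANGUAGES:
        problems.append(f"unsupported text_language: {language}")

    audio_format = params.get("format")
    # Assumption: the format name is matched without regard to case.
    if audio_format is not None and str(audio_format).lower() not in ALLOWED_FORMATS:
        problems.append(f"unsupported format: {audio_format}")

    batch_size = params.get("batch_size")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if batch_size is not None and (isinstance(batch_size, bool) or not isinstance(batch_size, int)):
        problems.append("batch_size must be an integer")

    return problems


if __name__ == "__main__":
    print(check_tts_params({"text": "Hi", "format": "flac"}))
```

Parameters that are absent are not flagged, since the server falls back to the defaults listed above.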