---
|
|
license: mit |
|
|
language:
- hi
- en
- bn
- gu
- te
- mr
- hne
- kn
- mai
- bho
- mag
|
|
--- |
|
|
# SYSPIN Hackathon TTS API Documentation |
|
|
|
|
|
## Overview |
|
|
|
|
|
This API provides a Text-to-Speech (TTS) service that converts input text into speech audio. It supports multiple Indian languages and offers voice customization through a speaker reference WAV file provided by the user.
|
|
|
|
|
--- |
|
|
|
|
|
## Endpoint: `/Get_Inference` |
|
|
|
|
|
* **Method**: `GET` |
|
|
* **Description**: Generates speech audio from the provided text using the specified language and speaker reference file. |
|
|
|
|
|
### Query Parameters |
|
|
|
|
|
| Parameter | Type | Required | Description |
| --------- | ---- | -------- | ----------- |
| `text` | string | Yes | The input text to be converted into speech. |
| `lang` | string | Yes | The language of the input text. Acceptable values: `bhojpuri`, `bengali`, `english`, `gujarati`, `hindi`, `chhattisgarhi`, `kannada`, `magahi`, `maithili`, `marathi`, `telugu`. |
| `speaker_wav` | WAV file (bytes) | Yes | A reference speaker recording; must be a WAV file. |
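The call shape implied by these parameters can be sketched in Python. This is a hypothetical helper (the function name, `SUPPORTED_LANGS` set, and default `base_url` are assumptions, not part of the API); `text` and `lang` travel as query parameters while `speaker_wav` is attached as a file:

```python
SUPPORTED_LANGS = {
    'bhojpuri', 'bengali', 'english', 'gujarati', 'hindi', 'chhattisgarhi',
    'kannada', 'magahi', 'maithili', 'marathi', 'telugu',
}

def build_inference_request(text, lang, wav_path,
                            base_url='http://localhost:8080/Get_Inference'):
    """Assemble the pieces of a /Get_Inference call.

    Returns (url, params, files) ready to pass to requests.get().
    Raises ValueError for an unsupported language before touching the file.
    """
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f'unsupported lang: {lang!r}')
    params = {'text': text, 'lang': lang}
    files = {'speaker_wav': open(wav_path, 'rb')}
    return base_url, params, files
```

Rejecting an unsupported `lang` on the client side saves a round trip that would otherwise end in a 422.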
|
|
|
|
|
### Available Languages |
|
|
|
|
|
| Language | Language code |
| -------- | ------------- |
| chhattisgarhi | hne |
| kannada | kn |
| maithili | mai |
| telugu | te |
| bengali | bn |
| bhojpuri | bho |
| marathi | mr |
| gujarati | gu |
| hindi | hi |
| magahi | mag |
| english | en |
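For programmatic use, the table above can be kept as a small lookup dictionary (keys are the names accepted by the `lang` parameter; the helper name is illustrative only):

```python
# Language name (as passed in `lang`) -> code from the table above
LANG_CODES = {
    'chhattisgarhi': 'hne',
    'kannada': 'kn',
    'maithili': 'mai',
    'telugu': 'te',
    'bengali': 'bn',
    'bhojpuri': 'bho',
    'marathi': 'mr',
    'gujarati': 'gu',
    'hindi': 'hi',
    'magahi': 'mag',
    'english': 'en',
}

def code_for(language):
    """Return the code for a supported language, or raise KeyError."""
    return LANG_CODES[language.lower()]
```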
|
|
|
|
|
### Responses |
|
|
|
|
|
* **200 OK**: Returns a WAV audio file as a streaming response containing the synthesized speech. |
|
|
|
|
|
* **422 Unprocessable Entity**: Returned when:
  * Any of the required query parameters (`text`, `lang`, `speaker_wav`) is missing.
  * The specified `lang` is not supported.
  * The specified `speaker_wav` is not available.
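One way to avoid a 422 on the upload is a best-effort client-side check that the reference file really is a readable WAV, using the standard-library `wave` module. This is only a sketch: the server's exact validation rules are not documented, so this catches obviously broken files, nothing more.

```python
import wave

def is_valid_wav(path):
    """Best-effort check that `path` is a readable WAV with audio frames.

    The server's own validation is not documented; this only filters
    obviously broken uploads before they cause a 422.
    """
    try:
        with wave.open(path, 'rb') as w:
            return w.getnframes() > 0
    except (wave.Error, EOFError, OSError):
        return False
```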
|
|
|
|
|
|
|
|
|
|
|
## Running the Server |
|
|
|
|
|
To start the FastAPI server: |
|
|
|
|
|
```bash |
|
|
docker build -t your_image_name ./ |
|
|
docker run -d -v /path/to/this/code/dir/:/app/ -p 8080:8080 your_image_name API_main.py |
|
|
``` |
|
|
|
|
|
## Hosting on a GPU |
|
|
|
|
|
To run your FastAPI-based Text-to-Speech (TTS) server inside a Docker container with GPU support, follow these steps: |
|
|
|
|
|
--- |
|
|
|
|
|
## Prerequisites |
|
|
|
|
|
1. **NVIDIA GPU**: Ensure your system has an NVIDIA GPU installed. |
|
|
|
|
|
2. **NVIDIA Drivers**: Install the appropriate NVIDIA drivers for your GPU. |
|
|
|
|
|
3. **Docker**: Install Docker on your system. |
|
|
|
|
|
4. **NVIDIA Container Toolkit**: Install the NVIDIA Container Toolkit to enable GPU support in Docker containers. |
|
|
|
|
|
--- |
|
|
|
|
|
## Installation Steps |
|
|
|
|
|
### 1. Install NVIDIA Drivers |
|
|
|
|
|
Ensure that the NVIDIA drivers compatible with your GPU are installed on your system. |
|
|
|
|
|
### 2. Install Docker |
|
|
|
|
|
If Docker is not already installed, you can install it by following the official Docker installation guide for your operating system. |
|
|
|
|
|
### 3. Install NVIDIA Container Toolkit |
|
|
|
|
|
The NVIDIA Container Toolkit allows Docker containers to utilize the GPU. |
|
|
|
|
|
**For Ubuntu:** |
|
|
|
|
|
```bash |
|
|
# Add the package repositories |
|
|
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) |
|
|
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - |
|
|
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \ |
|
|
sudo tee /etc/apt/sources.list.d/nvidia-docker.list |
|
|
|
|
|
# Update the package lists |
|
|
sudo apt-get update |
|
|
|
|
|
# Install the NVIDIA Container Toolkit |
|
|
sudo apt-get install -y nvidia-container-toolkit |
|
|
|
|
|
# Restart the Docker daemon to apply changes |
|
|
sudo systemctl restart docker |
|
|
``` |
|
|
|
|
|
**For other operating systems:** Refer to the [NVIDIA Container Toolkit installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) for detailed instructions. |
|
|
|
|
|
### 4. Verify GPU Access in Docker |
|
|
|
|
|
To confirm that Docker can access your GPU, run the following command: |
|
|
|
|
|
```bash |
|
|
docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi |
|
|
``` |
|
|
|
|
|
|
|
|
## Running Your FastAPI TTS Server with GPU Support |
|
|
|
|
|
Assuming your FastAPI TTS application is containerized and ready to run: |
|
|
|
|
|
1. **Build Your Docker Image** |
|
|
|
|
|
Navigate to the directory containing your `Dockerfile` and build the Docker image: |
|
|
|
|
|
```bash |
|
|
docker build -t your_image_name . |
|
|
``` |
|
|
|
|
|
|
|
|
2. **Run the Docker Container with GPU Support** |
|
|
|
|
|
Start the container with GPU access enabled: |
|
|
|
|
|
```bash |
|
|
docker run --gpus all -p 8080:8080 -v /path/to/this/code/dir/:/app/ your_image_name API_main.py |
|
|
``` |
|
|
|
|
|
## Example API Call |
|
|
|
|
|
```python
import requests

# Define the base URL of your API
base_url = 'http://localhost:8080/Get_Inference'

# Path to the reference speaker WAV file
wav_path = 'path/to/wavfile.wav'

# Set up the query parameters
params = {
    'text': 'ಮಾದರಿಯು ಸರಿಯಾಗಿ ಕಾರ್ಯನಿರ್ವಹಿಸುತ್ತಿದೆಯೇ ಎಂದು ಖಚಿತಪಡಿಸಿಕೊಳ್ಳಲು ಬಳಸಲಾಗುವ ಪರೀಕ್ಷಾ ವಾಕ್ಯ ಇದು.',
    'lang': 'kannada',
}

# Send the GET request with the speaker reference attached as a file
with open(wav_path, 'rb') as audio_file:
    response = requests.get(base_url, params=params,
                            files={'speaker_wav': audio_file})

# Check if the request was successful
if response.status_code == 200:
    # Save the audio content to a file
    with open('output.wav', 'wb') as f:
        f.write(response.content)
    print("Audio saved as 'output.wav'")
else:
    # Print the error message
    print(f"Request failed with status code {response.status_code}")
    print("Response:", response.text)
```