---
license: mit
language:
- hi
- en
- bn
- gu
- te
- mr
---
# SYSPIN Hackathon TTS API Documentation
## Overview
This API provides a Text-to-Speech (TTS) service that converts input text into speech audio. It supports multiple Indian languages and offers voice customization through a user-supplied speaker reference recording.
---
## Endpoint: `/Get_Inference`
* **Method**: `GET`
* **Description**: Generates speech audio from the provided text using the specified language and speaker reference file.
### Query Parameters
| Parameter | Type | Required | Description |
| --------- | ------ | -------- | ----------- |
| `text` | string | Yes | The input text to be converted into speech.|
| `lang` | string | Yes | The language of the input text. Acceptable values include: `bhojpuri`, `bengali`, `english`, `gujarati`, `hindi`, `chhattisgarhi`, `kannada`, `magahi`, `maithili`, `marathi`, `telugu`. |
| `speaker_wav` | WAV file (bytes) | Yes | A reference speaker recording, uploaded as a WAV file, used to condition the voice of the synthesized speech. |
### Available Languages
| Language | Language codes |
| --------- | ---------------- |
| chhattisgarhi | hne |
| kannada | kn |
| maithili | mai |
| telugu | te |
| bengali | bn |
| bhojpuri | bho |
| marathi | mr |
| gujarati | gu |
| hindi | hi |
| magahi | mag |
| english | en |
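For programmatic use, the table above can be expressed as a small lookup. The dictionary and helper below are an illustrative sketch, not part of the API itself:

```python
# Mapping of supported language names to their codes, taken from the table above.
# The names LANG_CODES and is_supported are our own, not part of the API.
LANG_CODES = {
    'chhattisgarhi': 'hne',
    'kannada': 'kn',
    'maithili': 'mai',
    'telugu': 'te',
    'bengali': 'bn',
    'bhojpuri': 'bho',
    'marathi': 'mr',
    'gujarati': 'gu',
    'hindi': 'hi',
    'magahi': 'mag',
    'english': 'en',
}

def is_supported(lang: str) -> bool:
    """Return True if `lang` is one of the languages the API accepts."""
    return lang.lower() in LANG_CODES
```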
### Responses
* **200 OK**: Returns a WAV audio file as a streaming response containing the synthesized speech.
* **422 Unprocessable Entity**: Returned when:
* Any of the required parameters (`text`, `lang`, `speaker_wav`) is missing.
* The specified `lang` is not supported.
* The provided `speaker_wav` file is not available or not readable.
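A client can pre-check its inputs against the 422 conditions above before calling the endpoint. The function below is a hypothetical sketch of that client-side validation, not the server's actual code:

```python
# Supported language names, copied from the `lang` parameter description above.
SUPPORTED_LANGS = {
    'bhojpuri', 'bengali', 'english', 'gujarati', 'hindi', 'chhattisgarhi',
    'kannada', 'magahi', 'maithili', 'marathi', 'telugu',
}

def validate_request(text, lang, speaker_wav):
    """Return a list of problems that would likely trigger a 422 response (sketch)."""
    errors = []
    if not text:
        errors.append('text is missing')
    if not lang:
        errors.append('lang is missing')
    elif lang not in SUPPORTED_LANGS:
        errors.append(f'lang {lang!r} is not supported')
    if not speaker_wav:
        errors.append('speaker_wav is missing')
    return errors
```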
## Running the Server
To start the FastAPI server:
```bash
docker build -t your_image_name ./
docker run -d -v /path/to/this/code/dir/:/app/ -p 8080:8080 your_image_name API_main.py
```
## Hosting on a GPU
To run your FastAPI-based Text-to-Speech (TTS) server inside a Docker container with GPU support, follow these steps:
---
## Prerequisites
1. **NVIDIA GPU**: Ensure your system has an NVIDIA GPU installed.
2. **NVIDIA Drivers**: Install the appropriate NVIDIA drivers for your GPU.
3. **Docker**: Install Docker on your system.
4. **NVIDIA Container Toolkit**: Install the NVIDIA Container Toolkit to enable GPU support in Docker containers.
---
## Installation Steps
### 1. Install NVIDIA Drivers
Ensure that the NVIDIA drivers compatible with your GPU are installed on your system.
### 2. Install Docker
If Docker is not already installed, you can install it by following the official Docker installation guide for your operating system.
### 3. Install NVIDIA Container Toolkit
The NVIDIA Container Toolkit allows Docker containers to utilize the GPU.
**For Ubuntu:**
```bash
# Add the package repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
sudo tee /etc/apt/sources.list.d/nvidia-docker.list
# Update the package lists
sudo apt-get update
# Install the NVIDIA Container Toolkit
sudo apt-get install -y nvidia-container-toolkit
# Restart the Docker daemon to apply changes
sudo systemctl restart docker
```
**For other operating systems:** Refer to the [NVIDIA Container Toolkit installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) for detailed instructions.
### 4. Verify GPU Access in Docker
To confirm that Docker can access your GPU, run the following command:
```bash
docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi
```
## Running Your FastAPI TTS Server with GPU Support
Assuming your FastAPI TTS application is containerized and ready to run:
1. **Build Your Docker Image**
Navigate to the directory containing your `Dockerfile` and build the Docker image:
```bash
docker build -t your_image_name .
```
2. **Run the Docker Container with GPU Support**
Start the container with GPU access enabled:
```bash
docker run --gpus all -p 8080:8080 -v /path/to/this/code/dir/:/app/ your_image_name API_main.py
```
## Example API Call
```python
import requests

# Define the base URL of your API
base_url = 'http://localhost:8080/Get_Inference'

# Set up the query parameters
wav_path = 'path/to/wavfile.wav'
params = {
    'text': 'ಮಾದರಿಯು ಸರಿಯಾಗಿ ಕಾರ್ಯನಿರ್ವಹಿಸುತ್ತಿದೆಯೇ ಎಂದು ಖಚಿತಪಡಿಸಿಕೊಳ್ಳಲು ಬಳಸಲಾಗುವ ಪರೀಕ್ಷಾ ವಾಕ್ಯ ಇದು.',
    'lang': 'kannada',
}

# Send the GET request with the speaker reference file attached
with open(wav_path, 'rb') as audio_file:
    response = requests.get(base_url, params=params, files={'speaker_wav': audio_file.read()})

# Check if the request was successful
if response.status_code == 200:
    # Save the audio content to a file
    with open('output.wav', 'wb') as f:
        f.write(response.content)
    print("Audio saved as 'output.wav'")
else:
    # Print the error message
    print(f"Request failed with status code {response.status_code}")
    print("Response:", response.text)
```
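A 200 response streams raw WAV bytes, so a quick sanity check on the saved file is to confirm the RIFF/WAVE header before playing it. This is a minimal sketch; the function name is our own:

```python
def looks_like_wav(data: bytes) -> bool:
    """Check the canonical WAV magic bytes: 'RIFF' at offset 0 and 'WAVE' at offset 8."""
    return len(data) >= 12 and data[:4] == b'RIFF' and data[8:12] == b'WAVE'
```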