---
|
|
license: mit |
|
|
language:
- hi
- en
- bn
- gu
- te
- mr
- hne
- kn
- mai
- bho
- mag
|
|
--- |
|
|
# SYSPIN Hackathon TTS API Documentation |
|
|
|
|
|
## Overview |
|
|
|
|
|
This API provides a Text-to-Speech (TTS) service that converts input text into speech audio. It supports multiple Indian languages and offers voice customization through a speaker reference WAV file provided by the user.
|
|
|
|
|
--- |
|
|
|
|
|
## Endpoint: `/Get_Inference` |
|
|
|
|
|
* **Method**: `GET` |
|
|
* **Description**: Generates speech audio from the provided text using the specified language and speaker reference file. |
|
|
|
|
|
### Query Parameters |
|
|
|
|
|
| Parameter | Type | Required | Description |
| --------- | ---- | -------- | ----------- |
| `text` | string | Yes | The input text to be converted into speech. |
| `lang` | string | Yes | The language of the input text. Acceptable values: `bhojpuri`, `bengali`, `english`, `gujarati`, `hindi`, `chhattisgarhi`, `kannada`, `magahi`, `maithili`, `marathi`, `telugu`. |
| `speaker_wav` | WAV file (bytes) | Yes | A reference speaker recording; must be a WAV file. |
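The call shape implied by these parameters can be sketched in Python. This is a hypothetical helper (the function name, `SUPPORTED_LANGS` set, and default `base_url` are assumptions, not part of the API); `text` and `lang` travel as query parameters while `speaker_wav` is attached as a file:

```python
SUPPORTED_LANGS = {
    'bhojpuri', 'bengali', 'english', 'gujarati', 'hindi', 'chhattisgarhi',
    'kannada', 'magahi', 'maithili', 'marathi', 'telugu',
}

def build_inference_request(text, lang, wav_path,
                            base_url='http://localhost:8080/Get_Inference'):
    """Assemble the pieces of a /Get_Inference call.

    Returns (url, params, files) ready to pass to requests.get().
    Raises ValueError for an unsupported language before touching the file.
    """
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f'unsupported lang: {lang!r}')
    params = {'text': text, 'lang': lang}
    files = {'speaker_wav': open(wav_path, 'rb')}
    return base_url, params, files
```

Rejecting an unsupported `lang` on the client side saves a round trip that would otherwise end in a 422.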
|
|
|
|
|
### Available Languages |
|
|
|
|
|
| Language | Language code |
| -------- | ------------- |
| chhattisgarhi | hne |
| kannada | kn |
| maithili | mai |
| telugu | te |
| bengali | bn |
| bhojpuri | bho |
| marathi | mr |
| gujarati | gu |
| hindi | hi |
| magahi | mag |
| english | en |
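For programmatic use, the table above can be kept as a small lookup dictionary (keys are the names accepted by the `lang` parameter; the helper name is illustrative only):

```python
# Language name (as passed in `lang`) -> code from the table above
LANG_CODES = {
    'chhattisgarhi': 'hne',
    'kannada': 'kn',
    'maithili': 'mai',
    'telugu': 'te',
    'bengali': 'bn',
    'bhojpuri': 'bho',
    'marathi': 'mr',
    'gujarati': 'gu',
    'hindi': 'hi',
    'magahi': 'mag',
    'english': 'en',
}

def code_for(language):
    """Return the code for a supported language, or raise KeyError."""
    return LANG_CODES[language.lower()]
```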
|
|
|
|
|
### Responses |
|
|
|
|
|
* **200 OK**: Returns a WAV audio file as a streaming response containing the synthesized speech. |
|
|
|
|
|
* **422 Unprocessable Entity**: Returned when:
  * Any of the required query parameters (`text`, `lang`, `speaker_wav`) is missing.
  * The specified `lang` is not supported.
  * The specified `speaker_wav` is not available.
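One way to avoid a 422 on the upload is a best-effort client-side check that the reference file really is a readable WAV, using the standard-library `wave` module. This is only a sketch: the server's exact validation rules are not documented, so this catches obviously broken files, nothing more.

```python
import wave

def is_valid_wav(path):
    """Best-effort check that `path` is a readable WAV with audio frames.

    The server's own validation is not documented; this only filters
    obviously broken uploads before they cause a 422.
    """
    try:
        with wave.open(path, 'rb') as w:
            return w.getnframes() > 0
    except (wave.Error, EOFError, OSError):
        return False
```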
|
|
|
|
|
|
|
|
|
|
|
## Running the Server |
|
|
|
|
|
To start the FastAPI server: |
|
|
|
|
|
```bash |
|
|
docker build -t your_image_name ./ |
|
|
docker run -d -v /path/to/this/code/dir/:/app/ -p 8080:8080 your_image_name API_main.py |
|
|
``` |
|
|
|
|
|
## Hosting on a GPU |
|
|
|
|
|
To run your FastAPI-based Text-to-Speech (TTS) server inside a Docker container with GPU support, follow these steps: |
|
|
|
|
|
--- |
|
|
|
|
|
## Prerequisites |
|
|
|
|
|
1. **NVIDIA GPU**: Ensure your system has an NVIDIA GPU installed. |
|
|
|
|
|
2. **NVIDIA Drivers**: Install the appropriate NVIDIA drivers for your GPU. |
|
|
|
|
|
3. **Docker**: Install Docker on your system. |
|
|
|
|
|
4. **NVIDIA Container Toolkit**: Install the NVIDIA Container Toolkit to enable GPU support in Docker containers. |
|
|
|
|
|
--- |
|
|
|
|
|
## Installation Steps |
|
|
|
|
|
### 1. Install NVIDIA Drivers |
|
|
|
|
|
Ensure that the NVIDIA drivers compatible with your GPU are installed on your system. |
|
|
|
|
|
### 2. Install Docker |
|
|
|
|
|
If Docker is not already installed, you can install it by following the official Docker installation guide for your operating system. |
|
|
|
|
|
### 3. Install NVIDIA Container Toolkit |
|
|
|
|
|
The NVIDIA Container Toolkit allows Docker containers to utilize the GPU. |
|
|
|
|
|
**For Ubuntu:** |
|
|
|
|
|
```bash |
|
|
# Add the package repositories |
|
|
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) |
|
|
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - |
|
|
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \ |
|
|
sudo tee /etc/apt/sources.list.d/nvidia-docker.list |
|
|
|
|
|
# Update the package lists |
|
|
sudo apt-get update |
|
|
|
|
|
# Install the NVIDIA Container Toolkit |
|
|
sudo apt-get install -y nvidia-container-toolkit |
|
|
|
|
|
# Restart the Docker daemon to apply changes |
|
|
sudo systemctl restart docker |
|
|
``` |
|
|
|
|
|
**For other operating systems:** Refer to the [NVIDIA Container Toolkit installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) for detailed instructions. |
|
|
|
|
|
### 4. Verify GPU Access in Docker |
|
|
|
|
|
To confirm that Docker can access your GPU, run the following command: |
|
|
|
|
|
```bash |
|
|
docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi |
|
|
``` |
|
|
|
|
|
|
|
|
## Running Your FastAPI TTS Server with GPU Support |
|
|
|
|
|
Assuming your FastAPI TTS application is containerized and ready to run: |
|
|
|
|
|
1. **Build Your Docker Image** |
|
|
|
|
|
Navigate to the directory containing your `Dockerfile` and build the Docker image: |
|
|
|
|
|
```bash |
|
|
docker build -t your_image_name . |
|
|
``` |
|
|
|
|
|
|
|
|
2. **Run the Docker Container with GPU Support** |
|
|
|
|
|
Start the container with GPU access enabled: |
|
|
|
|
|
```bash |
|
|
docker run --gpus all -p 8080:8080 -v /path/to/this/code/dir/:/app/ your_image_name API_main.py |
|
|
``` |
|
|
|
|
|
## Example API Call |
|
|
|
|
|
```python
import requests

# Define the base URL of your API
base_url = 'http://localhost:8080/Get_Inference'

# Path to the reference speaker WAV file
wav_path = 'path/to/wavfile.wav'

# Set up the query parameters
params = {
    'text': 'ಮಾದರಿಯು ಸರಿಯಾಗಿ ಕಾರ್ಯನಿರ್ವಹಿಸುತ್ತಿದೆಯೇ ಎಂದು ಖಚಿತಪಡಿಸಿಕೊಳ್ಳಲು ಬಳಸಲಾಗುವ ಪರೀಕ್ಷಾ ವಾಕ್ಯ ಇದು.',
    'lang': 'kannada',
}

# Send the GET request with the speaker reference attached as a file
with open(wav_path, 'rb') as audio_file:
    response = requests.get(base_url, params=params,
                            files={'speaker_wav': audio_file})

# Check if the request was successful
if response.status_code == 200:
    # Save the audio content to a file
    with open('output.wav', 'wb') as f:
        f.write(response.content)
    print("Audio saved as 'output.wav'")
else:
    # Print the error message
    print(f"Request failed with status code {response.status_code}")
    print("Response:", response.text)
```