# SYSPIN Hackathon TTS API Documentation

## Overview

This API provides a Text-to-Speech (TTS) service that converts input text into speech audio. It supports multiple Indian languages and offers voice customization through predefined male and female speaker references.

---

## Endpoint: `/Get_Inference`

* **Method**: `GET`
* **Description**: Generates speech audio from the provided text using the specified language and speaker.

### Query Parameters

| Parameter | Type   | Required | Description |
| --------- | ------ | -------- | ----------- |
| `text`    | string | Yes      | The input text to be converted into speech. |
| `lang`    | string | Yes      | The language of the input text. Acceptable values include: `bhojpuri`, `bengali`, `english`, `gujarati`, `hindi`, `chhattisgarhi`, `kannada`, `magahi`, `maithili`, `marathi`, `telugu`. |
| `speaker` | string | Yes      | The desired speaker's voice. Format: `<language>_<gender>`. For example: `hindi_male`, `english_female`. Refer to the available speakers below. |

### Available Speakers

| Language      | Language code | Male Speaker         | Female Speaker         |
| ------------- | ------------- | -------------------- | ---------------------- |
| chhattisgarhi | hne           | `chhattisgarhi_male` | `chhattisgarhi_female` |
| kannada       | kn            | `kannada_male`       | `kannada_female`       |
| maithili      | mai           | `maithili_male`      | `maithili_female`      |
| telugu        | te            | `telugu_male`        | `telugu_female`        |
| bengali       | bn            | `bengali_male`       | `bengali_female`       |
| bhojpuri      | bho           | `bhojpuri_male`      | `bhojpuri_female`      |
| marathi       | mr            | `marathi_male`       | `marathi_female`       |
| gujarati      | gu            | `gujarati_male`      | `gujarati_female`      |
| hindi         | hi            | `hindi_male`         | `hindi_female`         |
| magahi        | mag           | `magahi_male`        | `magahi_female`        |
| english       | en            | `english_male`       | `english_female`       |

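
The parameter rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `speaker_id` and `build_params` are hypothetical helper names, lowercasing the inputs is an assumption (the documentation does not say whether the server is case-sensitive), and rejecting empty text is a client-side choice. The speaker language is passed separately from `lang` because the example call in this document pairs `kannada` text with a `bengali_female` speaker.

```python
# Language names accepted by the `lang` parameter, copied from the table above.
SUPPORTED_LANGUAGES = {
    "bengali", "bhojpuri", "chhattisgarhi", "english", "gujarati", "hindi",
    "kannada", "magahi", "maithili", "marathi", "telugu",
}
GENDERS = {"male", "female"}


def speaker_id(language: str, gender: str) -> str:
    """Return a speaker name in the documented <language>_<gender> format."""
    language, gender = language.lower(), gender.lower()
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported speaker language: {language!r}")
    if gender not in GENDERS:
        raise ValueError(f"unsupported gender: {gender!r}")
    return f"{language}_{gender}"


def build_params(text: str, lang: str, speaker_language: str, gender: str) -> dict:
    """Assemble the three required query parameters (hypothetical helper)."""
    if not text.strip():
        # Client-side choice: do not send empty text to the server.
        raise ValueError("text must not be empty")
    lang = lang.lower()
    if lang not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported lang: {lang!r}")
    return {
        "text": text,
        "lang": lang,
        "speaker": speaker_id(speaker_language, gender),
    }
```

The resulting dictionary can be passed directly as `params` to `requests.get`.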
### Responses

* **200 OK**: Returns a WAV audio file as a streaming response containing the synthesized speech.
* **422 Unprocessable Entity**: Returned when:
  * Any of the required query parameters (`text`, `lang`, `speaker`) is missing.
  * The specified `lang` is not supported.
  * The specified `speaker` is not available.

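
A client can branch on these two status codes as sketched below. This is an illustrative helper, not part of the API: the function name `interpret_response` is hypothetical, and the shape of the 422 body is an assumption (FastAPI's default validation errors are JSON objects with a `detail` field, but this server may return something else, so the code falls back to the raw text).

```python
import json


def interpret_response(status_code: int, body: bytes) -> str:
    """Classify a /Get_Inference response from its status code and raw body."""
    if status_code == 200:
        # WAV files are RIFF containers: bytes 0-3 are b"RIFF", bytes 8-11 are b"WAVE".
        if body[:4] == b"RIFF" and body[8:12] == b"WAVE":
            return "ok: WAV audio"
        return "warning: 200 response does not look like WAV data"
    if status_code == 422:
        # Assumption: the error body is JSON with a "detail" field.
        # If it is not, report the raw text instead.
        try:
            detail = json.loads(body.decode("utf-8")).get("detail")
        except (ValueError, AttributeError):
            detail = body.decode("utf-8", errors="replace")
        return f"invalid request: {detail}"
    return f"unexpected status {status_code}"
```

With `requests`, this would be called as `interpret_response(response.status_code, response.content)`.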
## Running the Server

To build the Docker image and start the FastAPI server on port 8080:

```bash
docker build -t your_image_name ./
docker run -d -v /path/to/this/code/dir/:/app/ -p 8080:8080 your_image_name API_main.py
```

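
Because the container starts in the background (`-d`), the server may not accept requests immediately. The sketch below polls the endpoint until it answers; it relies on the documented behavior that a request without parameters returns 422, so any HTTP response is treated as "server is up". The function name `wait_until_ready` and the timeout values are illustrative assumptions, and the startup time of the real server is not documented.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_until_ready(
    url: str = "http://localhost:8080/Get_Inference",
    timeout: float = 120.0,
    interval: float = 2.0,
) -> bool:
    """Poll `url` until the server sends any HTTP response or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            urllib.request.urlopen(url, timeout=5)
            return True
        except urllib.error.HTTPError:
            # The server answered with an error status (for example 422 for
            # missing parameters), which still proves it is running.
            return True
        except (urllib.error.URLError, OSError, http.client.HTTPException):
            # Connection refused, reset, or timed out: not ready yet.
            time.sleep(interval)
    return False
```

A plain TCP port check is deliberately avoided here, since a published Docker port can accept connections before the application inside the container is listening.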
## Hosting on a GPU

To run the FastAPI-based TTS server inside a Docker container with GPU support, follow these steps:

---

## Prerequisites

1. **NVIDIA GPU**: Ensure your system has an NVIDIA GPU installed.
2. **NVIDIA Drivers**: Install the appropriate NVIDIA drivers for your GPU.
3. **Docker**: Install Docker on your system.
4. **NVIDIA Container Toolkit**: Install the NVIDIA Container Toolkit to enable GPU support in Docker containers.

---

## Installation Steps

### 1. Install NVIDIA Drivers

Ensure that the NVIDIA drivers compatible with your GPU are installed on your system.

### 2. Install Docker

If Docker is not already installed, install it by following the official Docker installation guide for your operating system.

### 3. Install NVIDIA Container Toolkit

The NVIDIA Container Toolkit allows Docker containers to use the GPU.

**For Ubuntu:**

```bash
# Add the package repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list

# Update the package lists
sudo apt-get update

# Install the NVIDIA Container Toolkit
sudo apt-get install -y nvidia-container-toolkit

# Restart the Docker daemon to apply changes
sudo systemctl restart docker
```

**For other operating systems:** Refer to the [NVIDIA Container Toolkit installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) for detailed instructions.

### 4. Verify GPU Access in Docker

To confirm that Docker can access your GPU, run the following command. If the setup is correct, it prints the `nvidia-smi` table listing your GPUs:

```bash
docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi
```

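
The same check can be scripted, for example from inside a running container or on the host. The sketch below is an assumption-laden convenience, not part of this project: `list_gpus` is a hypothetical helper that shells out to `nvidia-smi` with its CSV query options and returns an empty list when the tool is missing or fails.

```python
import shutil
import subprocess


def list_gpus() -> list[str]:
    """Return one 'name, total memory' string per visible GPU, or [] if none."""
    if shutil.which("nvidia-smi") is None:
        # nvidia-smi is not on PATH: no driver or no GPU access here.
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = list_gpus()
    print(gpus if gpus else "No GPU visible")
```

An empty result inside a container usually means it was started without `--gpus all` or that the NVIDIA Container Toolkit is not set up.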
## Running Your FastAPI TTS Server with GPU Support

Assuming your FastAPI TTS application is containerized and ready to run:

1. **Build Your Docker Image**

   Navigate to the directory containing your `Dockerfile` and build the Docker image:

   ```bash
   docker build -t your_image_name .
   ```

2. **Run the Docker Container with GPU Support**

   Start the container with GPU access enabled:

   ```bash
   docker run --gpus all -p 8080:8080 -v /path/to/this/code/dir/:/app/ your_image_name API_main.py
   ```

## Example API Call

```python
import requests

# Define the base URL of your API
base_url = 'http://localhost:8080/Get_Inference'

# Set up the query parameters
params = {
    'text': 'ಮಾದರಿಯು ಸರಿಯಾಗಿ ಕಾರ್ಯನಿರ್ವಹಿಸುತ್ತಿದೆಯೇ ಎಂದು ಖಚಿತಪಡಿಸಿಕೊಳ್ಳಲು ಬಳಸಲಾಗುವ ಪರೀಕ್ಷಾ ವಾಕ್ಯ ಇದು.',
    'lang': 'kannada',
    'speaker': 'bengali_female'
}

# Send the GET request
response = requests.get(base_url, params=params)

# Check if the request was successful
if response.status_code == 200:
    # Save the audio content to a file
    with open('output.wav', 'wb') as f:
        f.write(response.content)
    print("Audio saved as 'output.wav'")
else:
    # Print the error message
    print(f"Request failed with status code {response.status_code}")
    print("Response:", response.text)
```
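
After saving the file, it can be useful to confirm that it is playable audio. The sketch below uses Python's standard `wave` module to read the header of `output.wav`; `wav_summary` is a hypothetical helper name. It assumes the server returns uncompressed PCM WAV with a correct header, which the documentation does not state, so a `wave.Error` or an implausible duration may simply mean the file uses another encoding or a streamed header.

```python
import wave


def wav_summary(path: str) -> dict:
    """Read basic properties from a PCM WAV file header."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "sample_width_bytes": wav.getsampwidth(),
            # Duration is the frame count divided by frames per second.
            "duration_s": frames / rate if rate else 0.0,
        }
```

For example, `print(wav_summary('output.wav'))` shows the channel count, sample rate, sample width, and duration of the synthesized speech.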