---
license: mit
language:
- hi
- en
- bn
- gu
- te
- mr
---

# SYSPIN Hackathon TTS API Documentation

## Overview

This API provides a Text-to-Speech (TTS) service that converts input text into speech audio. It supports multiple Indian languages and offers voice customization through a speaker reference recording provided by the user.

---

## Endpoint: `/Get_Inference`

* **Method**: `GET`
* **Description**: Generates speech audio from the provided text using the specified language and speaker reference file.

### Parameters

| Parameter | Type | Required | Description |
| --------- | ---- | -------- | ----------- |
| `text` | string | Yes | The input text to be converted into speech. |
| `lang` | string | Yes | The language of the input text. Acceptable values: `bhojpuri`, `bengali`, `english`, `gujarati`, `hindi`, `chhattisgarhi`, `kannada`, `magahi`, `maithili`, `marathi`, `telugu`. |
| `speaker_wav` | WAV file (bytes) | Yes | A speaker reference recording. Must be a WAV file, sent as a file upload in the request body rather than as a query parameter (see the example below). |

### Available Languages

| Language | Language code |
| --------- | ------------- |
| chhattisgarhi | hne |
| kannada | kn |
| maithili | mai |
| telugu | te |
| bengali | bn |
| bhojpuri | bho |
| marathi | mr |
| gujarati | gu |
| hindi | hi |
| magahi | mag |
| english | en |

### Responses

* **200 OK**: Returns a WAV audio file as a streaming response containing the synthesized speech.
* **422 Unprocessable Entity**: Returned when:
  * Any of the required parameters (`text`, `lang`, `speaker_wav`) is missing.
  * The specified `lang` is not supported.
  * The provided `speaker_wav` is not a valid WAV file.

## Running the Server

To build and start the FastAPI server in Docker:

```bash
docker build -t your_image_name ./
docker run -d -v /path/to/this/code/dir/:/app/ -p 8080:8080 your_image_name API_main.py
```

## Hosting on a GPU

To run your FastAPI-based Text-to-Speech (TTS) server inside a Docker container with GPU support, follow these steps.

---

## Prerequisites

1. **NVIDIA GPU**: Ensure your system has an NVIDIA GPU installed.
2. **NVIDIA Drivers**: Install the appropriate NVIDIA drivers for your GPU.
3. **Docker**: Install Docker on your system.
4. **NVIDIA Container Toolkit**: Install the NVIDIA Container Toolkit to enable GPU support in Docker containers.

---

## Installation Steps

### 1. Install NVIDIA Drivers

Ensure that NVIDIA drivers compatible with your GPU are installed on your system.

### 2. Install Docker

If Docker is not already installed, install it by following the official Docker installation guide for your operating system.

### 3. Install the NVIDIA Container Toolkit

The NVIDIA Container Toolkit allows Docker containers to utilize the GPU.

**For Ubuntu:**

```bash
# Add the package repositories
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list

# Update the package lists
sudo apt-get update

# Install the NVIDIA Container Toolkit
sudo apt-get install -y nvidia-container-toolkit

# Restart the Docker daemon to apply changes
sudo systemctl restart docker
```

**For other operating systems:** Refer to the [NVIDIA Container Toolkit installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) for detailed instructions.

### 4. Verify GPU Access in Docker

To confirm that Docker can access your GPU, run the following command:

```bash
docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi
```

## Running Your FastAPI TTS Server with GPU Support

Assuming your FastAPI TTS application is containerized and ready to run:

1. **Build Your Docker Image**

   Navigate to the directory containing your `Dockerfile` and build the Docker image:

   ```bash
   docker build -t your_image_name .
   ```
2. **Run the Docker Container with GPU Support**

   Start the container with GPU access enabled:

   ```bash
   docker run --gpus all -p 8080:8080 -v /path/to/this/code/dir/:/app/ your_image_name API_main.py
   ```

## Example API Call

```python
import requests

# Define the base URL of your API
base_url = 'http://localhost:8080/Get_Inference'

# Path to the speaker reference recording
wav_path = 'path/to/wavfile.wav'

# Set up the query parameters. The Kannada text below means:
# "This is a test sentence used to verify that the model is working correctly."
params = {
    'text': 'ಮಾದರಿಯು ಸರಿಯಾಗಿ ಕಾರ್ಯನಿರ್ವಹಿಸುತ್ತಿದೆಯೇ ಎಂದು ಖಚಿತಪಡಿಸಿಕೊಳ್ಳಲು ಬಳಸಲಾಗುವ ಪರೀಕ್ಷಾ ವಾಕ್ಯ ಇದು.',
    'lang': 'kannada',
}

# Send the GET request, attaching the speaker reference as a file upload
with open(wav_path, 'rb') as audio_file:
    response = requests.get(
        base_url,
        params=params,
        files={'speaker_wav': audio_file.read()},
    )

# Check if the request was successful
if response.status_code == 200:
    # Save the audio content to a file
    with open('output.wav', 'wb') as f:
        f.write(response.content)
    print("Audio saved as 'output.wav'")
else:
    # Print the error message
    print(f"Request failed with status code {response.status_code}")
    print("Response:", response.text)
```
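Since an unsupported `lang` value produces a 422 response, it can be useful to validate the language name client-side before making a request. A minimal sketch, built from the "Available Languages" table above (the `SUPPORTED_LANGS` mapping and `validate_lang` helper are illustrative conveniences, not part of the server API):

```python
# Supported language names and their codes, taken from the
# "Available Languages" table above.
SUPPORTED_LANGS = {
    "chhattisgarhi": "hne", "kannada": "kn", "maithili": "mai",
    "telugu": "te", "bengali": "bn", "bhojpuri": "bho",
    "marathi": "mr", "gujarati": "gu", "hindi": "hi",
    "magahi": "mag", "english": "en",
}

def validate_lang(lang: str) -> str:
    """Return the language code for a supported language name.

    Raises ValueError for unsupported names, mirroring the server's
    422 response for an unsupported `lang`.
    """
    key = lang.strip().lower()
    if key not in SUPPORTED_LANGS:
        raise ValueError(
            f"Unsupported language {lang!r}; choose one of: "
            + ", ".join(sorted(SUPPORTED_LANGS))
        )
    return SUPPORTED_LANGS[key]
```

Calling `validate_lang('kannada')` returns `'kn'`, while an unsupported name such as `'tamil'` raises `ValueError` locally instead of costing a round trip to the server.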