
Hugging Face Spaces Deployment Guide

This guide walks you through deploying the Puja Verification Service on a Hugging Face Docker Space.

Prerequisites

  1. A Hugging Face account.
  2. An active Groq API key (GROQ_API_KEY) for the LLM translation/matching. (Note: a Hugging Face token is no longer strictly required for the service itself, because the heavy ASR model now runs through a public Gradio API instead of local PyTorch execution.)

Deployment Steps

1. Create a New Space

  1. Log into your Hugging Face account.
  2. Go to your profile and click New Space.
  3. Fill out the details:
    • Space Name: e.g., puja-verification-api
    • License: Choose applicable (e.g., mit or openrail)
    • Select the Space SDK: Choose Docker
    • Docker Template: Choose Blank
    • Space Hardware: Since the heavy local PyTorch dependencies were removed, the free CPU Basic tier (2 vCPU / 16 GB RAM) is more than enough.
    • Visibility: Public or Private.
  4. Click Create Space.
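If you prefer to script the steps above, the Space can also be created with the huggingface_hub library. A minimal sketch (the username is a placeholder; creating a repo does require being logged in with a Hugging Face token, even though the running service no longer needs one):

```python
def space_repo_id(username: str, space_name: str) -> str:
    """Spaces are addressed as '<username>/<space-name>'."""
    return f"{username}/{space_name}"

def create_docker_space(username: str, space_name: str, private: bool = False) -> str:
    # Imported lazily because this performs a network call and needs auth
    # (e.g. via `huggingface-cli login`).
    from huggingface_hub import create_repo

    url = create_repo(
        repo_id=space_repo_id(username, space_name),
        repo_type="space",   # a Space, not a model/dataset repo
        space_sdk="docker",  # matches the "Docker" SDK choice above
        private=private,
    )
    return str(url)
```

create_repo returns the URL of the new repo, so you can open it directly after creation.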

2. Configure Secrets

Your app needs the Groq API key injected securely as an environment variable before it boots.

  1. Inside your newly created Space, navigate to the Settings tab.
  2. Scroll down to the Variables and secrets section.
  3. Under Secrets, click New secret.
  4. Create the following secret:
    • Name: GROQ_API_KEY
    • Value: (Paste your Groq API key here)
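Inside the container, the secret simply appears as an environment variable. A fail-fast check at startup catches a missing key at boot rather than on the first LLM request; the require_env helper below is illustrative, not part of the actual codebase:

```python
import os

def require_env(name: str) -> str:
    """Return the value of a required environment variable, or fail loudly.

    Spaces inject secrets as plain environment variables before the app boots.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Missing required secret {name!r}. "
            "Add it under Settings -> Variables and secrets in your Space."
        )
    return value

# At application startup:
# GROQ_API_KEY = require_env("GROQ_API_KEY")
```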

3. Upload Codebase

Hugging Face Spaces are essentially Git repositories. You can upload the code directly via the UI or push via CLI.

Using the Web UI (Easiest)

  1. Go to the Files tab of your Space.
  2. Click + Add file > Upload files.
  3. Upload the following files and folders from your local machine:
    • Dockerfile
    • requirements.txt
    • app/ (Upload this entire directory containing all sub-folders services, schemas, utils, and main.py)
  4. Once you click Commit, the Space will automatically read the Dockerfile and begin building the container.
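For reference, the Dockerfile the Space builds from might look roughly like this minimal sketch (the module path app.main:app and base image are assumptions; Docker Spaces expect the app to listen on the port declared in the Space config, 7860 by default):

```dockerfile
FROM python:3.11-slim

# ffmpeg is needed for audio decoding
RUN apt-get update && apt-get install -y --no-install-recommends ffmpeg \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /code
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY app/ ./app/

# Spaces route traffic to this port (7860 by default for Docker Spaces)
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "7860"]
```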

Using the Git CLI (Advanced)

```shell
git clone https://huggingface.co/spaces/<your-username>/<your-space-name>
cd <your-space-name>
# Copy all project files into this directory
git add .
git commit -m "Initial commit for Puja verification API"
git push
```

4. Monitor Build & Boot

  1. Go back to the App tab on your Space.
  2. You should see a "Building..." status pill. Click Logs to follow the container build output; the build installs ffmpeg and the Python requirements.
  3. Once the status changes to "Running", your API is successfully deployed.

5. Access the Service

The API is now live. Because it is a FastAPI app, you can interact with it through the interactive Swagger UI at /docs.

  • Click on the settings dropdown in your Space UI -> Embed this Space.
  • Obtain the direct URL (it usually looks like https://<org>-<space-name>.hf.space).
  • Navigate your browser to https://<org>-<space-name>.hf.space/docs to see the endpoints and upload test payloads.
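The direct-URL pattern in the steps above can be sketched as a small helper. Treat the output as a best guess (the exact hostname normalization can vary, e.g. for renamed Spaces) and confirm the real URL via Embed this Space:

```python
def space_url(owner: str, space_name: str) -> str:
    """Build the typical direct URL of a Space: https://<owner>-<space-name>.hf.space.

    The hostname is lowercased, and underscores/dots in names are assumed to
    normalize to dashes -- confirm via "Embed this Space" in doubt.
    """
    host = f"{owner}-{space_name}".lower().replace("_", "-").replace(".", "-")
    return f"https://{host}.hf.space"

def docs_url(owner: str, space_name: str) -> str:
    """The FastAPI Swagger UI lives at /docs on the deployed service."""
    return space_url(owner, space_name) + "/docs"
```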

🛠️ Performance Optimization Note

The container now builds and boots dramatically faster. By migrating the ai4bharat 600M Conformer model to an external Gradio API (rverma0631/MultilingualASR), we:

  1. Eliminated 6+ GB of Torch/ONNX weights from RAM.
  2. Slashed the Docker image size significantly, dropping install times from minutes to seconds.
  3. Kept the NLP pipeline itself unchanged.