mutisya committed
Commit d6e8bff · verified · 1 Parent(s): 62cb279

Deploy Polyglot backend with quantized models
Files changed (4):

  1. Dockerfile +13 -4
  2. README.md +52 -40
  3. README.md.bak +40 -0
  4. download_code.py +79 -0
Dockerfile CHANGED

@@ -20,10 +20,20 @@ RUN apt-get update && apt-get install -y \
 COPY requirements.txt .
 RUN pip install --no-cache-dir -r requirements.txt
 
-# Copy application code
-COPY app ./app
+# Copy download script and preload script
+COPY download_code.py .
 COPY preload_models.py .
 
+# Download application code from private code space
+# CODE_SPACE_ID should be set as a Space secret (e.g., "mutisya/polyglot-backend-code")
+ARG CODE_SPACE_ID
+ARG HUGGING_FACE_HUB_TOKEN
+RUN if [ -n "$CODE_SPACE_ID" ] && [ -n "$HUGGING_FACE_HUB_TOKEN" ]; then \
+        python download_code.py "$CODE_SPACE_ID" "$HUGGING_FACE_HUB_TOKEN" || echo "Code download failed - using local files"; \
+    else \
+        echo "WARNING: CODE_SPACE_ID or token not provided - code must be copied locally"; \
+    fi
+
 # Set environment variables for caching
 ENV HF_HOME=/app/.cache
 ENV TRANSFORMERS_CACHE=/app/.cache
@@ -41,8 +51,7 @@ RUN mkdir -p $NUMBA_CACHE_DIR && chmod -R 777 $NUMBA_CACHE_DIR
 RUN mkdir -p /app/data/learning/users && chmod -R 777 /app/data
 
 # Download models using HF token from environment
-# HuggingFace Spaces automatically provides HUGGING_FACE_HUB_TOKEN
-ARG HUGGING_FACE_HUB_TOKEN
+# HuggingFace Spaces automatically provides HUGGING_FACE_HUB_TOKEN (already defined above)
 RUN python preload_models.py $HUGGING_FACE_HUB_TOKEN || echo "Model preload skipped - will download on first use"
 
 # Expose port 7860 (HuggingFace Spaces standard)
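The new conditional RUN step only attempts the code download when both build arguments are non-empty, and otherwise falls back to locally copied files. A minimal sketch of that guard in Python (the helper name `should_download_code` is illustrative, not part of the commit):

```python
def should_download_code(code_space_id, token):
    """Mirror the Dockerfile guard: attempt the code download only when
    both CODE_SPACE_ID and HUGGING_FACE_HUB_TOKEN are present and
    non-empty; otherwise the build must rely on locally copied files."""
    return bool(code_space_id) and bool(token)
```

Note that even when both values are present, the Dockerfile still tolerates a failed download (`|| echo ...`) so the image build does not abort.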
README.md CHANGED

@@ -1,40 +1,52 @@
----
-title: Polyglot Translation Backend
-emoji: 🌍
-colorFrom: blue
-colorTo: green
-sdk: docker
-pinned: false
-license: mit
-app_port: 7860
----
-
-# Polyglot Translation Backend - Quantized Models
-
-Real-time speech transcription and translation API with Socket.IO for WebSocket communication. This version uses INT8 quantized models for improved performance and reduced memory footprint.
-
-## Features
-
-- **Real-time Speech Recognition**: Support for English, Swahili, Kikuyu, Kamba, Kimeru, Luo, and Somali
-- **Translation**: Multi-language translation using NLLB models
-- **Text-to-Speech**: Generate speech in multiple languages
-- **WebSocket Support**: Real-time communication via Socket.IO
-- **Model Quantization**: INT8 dynamic quantization for faster inference
-
-## API Endpoints
-
-- `GET /health` - Health check endpoint
-- `WebSocket /` - Socket.IO connection for real-time communication
-
-## Environment
-
-This Space requires a HuggingFace token for model access. The token is automatically provided by HuggingFace Spaces when configured as a secret.
-
-## Technical Details
-
-- **Framework**: FastAPI with Socket.IO
-- **Models**:
-  - ASR: Whisper (English) and Wav2Vec2-BERT (African languages)
-  - Translation: NLLB-600M fine-tuned model
-  - TTS: VITS models for each language
-- **Optimization**: INT8 dynamic quantization via PyTorch
+---
+title: Polyglot Translation Backend
+emoji: 🌍
+colorFrom: blue
+colorTo: green
+sdk: docker
+pinned: false
+license: mit
+app_port: 7860
+---
+
+# Polyglot Translation Backend - Quantized Models
+
+Real-time speech transcription and translation API with Socket.IO for WebSocket communication. This version uses INT8 quantized models for improved performance and reduced memory footprint.
+
+## Features
+
+- **Real-time Speech Recognition**: Support for English, Swahili, Kikuyu, Kamba, Kimeru, Luo, and Somali
+- **Translation**: Multi-language translation using NLLB models
+- **Text-to-Speech**: Generate speech in multiple languages
+- **WebSocket Support**: Real-time communication via Socket.IO
+- **Model Quantization**: INT8 dynamic quantization for faster inference
+
+## API Endpoints
+
+- `GET /health` - Health check endpoint
+- `WebSocket /` - Socket.IO connection for real-time communication
+
+## Environment
+
+This Space requires the following secrets to be configured:
+
+- `HUGGING_FACE_HUB_TOKEN` - HuggingFace token for model access
+- `CODE_SPACE_ID` - ID of the private code space (e.g., "mutisya/polyglot-backend-code")
+
+### Code Space Architecture
+
+This Docker Space downloads the application code from a separate private Space during build time. This allows the Docker Space to be public while keeping the source code private.
+
+- **Public Docker Space** (this one): Contains only the Dockerfile and deployment configuration
+- **Private Code Space**: Contains the actual application code (`app/`) and data (`data/`)
+
+During the build process, the Dockerfile downloads the code from the private space using the HuggingFace Hub API.
+
+## Technical Details
+
+- **Framework**: FastAPI with Socket.IO
+- **Models**:
+  - ASR: Whisper (English) and Wav2Vec2-BERT (African languages)
+  - Translation: NLLB-600M fine-tuned model
+  - TTS: VITS models for each language
+- **Optimization**: INT8 dynamic quantization via PyTorch
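The new Environment section documents two required secrets; on Spaces these arrive as environment variables. A hedged sketch of a startup check for them (the helper `read_space_secrets` is mine, not part of the commit):

```python
import os

def read_space_secrets(env=None):
    """Read the two secrets documented in the README from the
    environment and report which ones are missing or empty."""
    env = os.environ if env is None else env
    secrets = {
        "HUGGING_FACE_HUB_TOKEN": env.get("HUGGING_FACE_HUB_TOKEN"),
        "CODE_SPACE_ID": env.get("CODE_SPACE_ID"),
    }
    missing = [name for name, value in secrets.items() if not value]
    return secrets, missing
```

Passing a plain dict as `env` keeps the check testable outside a Space.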
README.md.bak ADDED

@@ -0,0 +1,40 @@
+---
+title: Polyglot Translation Backend
+emoji: 🌍
+colorFrom: blue
+colorTo: green
+sdk: docker
+pinned: false
+license: mit
+app_port: 7860
+---
+
+# Polyglot Translation Backend - Quantized Models
+
+Real-time speech transcription and translation API with Socket.IO for WebSocket communication. This version uses INT8 quantized models for improved performance and reduced memory footprint.
+
+## Features
+
+- **Real-time Speech Recognition**: Support for English, Swahili, Kikuyu, Kamba, Kimeru, Luo, and Somali
+- **Translation**: Multi-language translation using NLLB models
+- **Text-to-Speech**: Generate speech in multiple languages
+- **WebSocket Support**: Real-time communication via Socket.IO
+- **Model Quantization**: INT8 dynamic quantization for faster inference
+
+## API Endpoints
+
+- `GET /health` - Health check endpoint
+- `WebSocket /` - Socket.IO connection for real-time communication
+
+## Environment
+
+This Space requires a HuggingFace token for model access. The token is automatically provided by HuggingFace Spaces when configured as a secret.
+
+## Technical Details
+
+- **Framework**: FastAPI with Socket.IO
+- **Models**:
+  - ASR: Whisper (English) and Wav2Vec2-BERT (African languages)
+  - Translation: NLLB-600M fine-tuned model
+  - TTS: VITS models for each language
+- **Optimization**: INT8 dynamic quantization via PyTorch
download_code.py ADDED

@@ -0,0 +1,79 @@
+#!/usr/bin/env python3
+"""
+Download application code from private HuggingFace Space
+"""
+import os
+import sys
+from huggingface_hub import snapshot_download
+from pathlib import Path
+
+def download_code(code_space_id, token):
+    """
+    Download app and data from the private code space
+
+    Args:
+        code_space_id: Full space ID (e.g., "mutisya/polyglot-backend-code")
+        token: HuggingFace token for authentication
+    """
+    print(f"Downloading code from: {code_space_id}")
+
+    try:
+        # Download the entire space to a temporary directory
+        download_path = snapshot_download(
+            repo_id=code_space_id,
+            repo_type="space",
+            token=token,
+            local_dir="/tmp/code_download",
+            local_dir_use_symlinks=False
+        )
+
+        print(f"Code downloaded to: {download_path}")
+
+        # Move app and data to the correct locations
+        import shutil
+
+        # Move app directory
+        if Path("/tmp/code_download/app").exists():
+            if Path("/app/app").exists():
+                shutil.rmtree("/app/app")
+            shutil.move("/tmp/code_download/app", "/app/app")
+            print("OK app/ directory copied")
+        else:
+            print("WARNING: app/ directory not found in code space")
+
+        # Move data directory
+        if Path("/tmp/code_download/data").exists():
+            if Path("/app/data").exists():
+                shutil.rmtree("/app/data")
+            shutil.move("/tmp/code_download/data", "/app/data")
+            print("OK data/ directory copied")
+        else:
+            print("WARNING: data/ directory not found in code space")
+
+        # Clean up
+        if Path("/tmp/code_download").exists():
+            shutil.rmtree("/tmp/code_download")
+
+        print("OK Code download complete")
+        return True
+
+    except Exception as e:
+        print(f"ERROR downloading code: {e}")
+        import traceback
+        traceback.print_exc()
+        return False
+
+if __name__ == "__main__":
+    if len(sys.argv) < 2:
+        print("Usage: python download_code.py <code_space_id> [token]")
+        sys.exit(1)
+
+    code_space_id = sys.argv[1]
+    token = sys.argv[2] if len(sys.argv) > 2 else os.getenv("HUGGING_FACE_HUB_TOKEN")
+
+    if not token:
+        print("ERROR: No HuggingFace token provided")
+        sys.exit(1)
+
+    success = download_code(code_space_id, token)
+    sys.exit(0 if success else 1)
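The move-and-replace step in download_code.py (swap any existing `app/` and `data/` for the freshly downloaded copies) can be exercised in isolation, without a network call. A sketch against temporary directories instead of the hard-coded `/app` and `/tmp/code_download` paths (the helper `relocate_dirs` is mine, not in the commit):

```python
import shutil
from pathlib import Path

def relocate_dirs(src_root, dest_root, names=("app", "data")):
    """For each named directory present under src_root, delete any
    existing copy under dest_root and move the new one into place,
    mirroring the relocation logic in download_code.py."""
    moved = []
    for name in names:
        src = Path(src_root) / name
        if not src.exists():
            continue  # download_code.py prints a WARNING here
        dest = Path(dest_root) / name
        if dest.exists():
            shutil.rmtree(dest)  # replace, never merge, stale copies
        shutil.move(str(src), str(dest))
        moved.append(name)
    return moved
```

Deleting the destination first matters: `shutil.move` into an existing directory would nest the source inside it rather than replace it.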