title: BIAF-offASR
emoji: 🌾
colorFrom: green
colorTo: gray
sdk: docker
app_port: 7860
pinned: false

🌾 BIAF-offASR: Offline Translation Portal

CI/CD Pipeline | Offline First | Hardware Accelerated | License | Security Policy | Python 3.10

A local-first, zero-network platform for translation, subtitling, and voice dubbing of agricultural and rural development content. It supports Hindi, Marathi, and English and is designed for field officers working in areas with limited or no connectivity.


📸 Aesthetic & Vision

The portal features a refined editorial-rural UI: a unique blend of characterful serif typography (Alegreya) and high-legibility sans-serif (Hind), paired with a deep agricultural green palette. It moves away from generic "AI" aesthetics to provide a tactile, grounded experience for rural development professionals.

[Image: BIAF-offASR Preview]


💡 Key Capabilities

  1. Offline Text Translation: Instant translation between English, Hindi, and Marathi using optimized Seq2Seq models.
  2. Speech-to-Text (ASR): Transcribe audio/video with automated chunking and millisecond-accurate timestamps. Generates .srt and .vtt formats.
  3. Automated Video Dubbing: A complete pipeline that extracts audio, translates content, and burns subtitles directly into frames using FFmpeg.
  4. AI Voiceover Synthesis: Localized TTS in Hindi, Marathi, and English to replace original soundtracks with dubbed versions.
  5. Offline-First Status: Real-time monitoring of local model caches (Whisper, NLLB, MMS) ensures you know exactly when the system is ready for 100% offline use.
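As an illustration of the subtitle output mentioned in item 2, here is a minimal sketch of turning timestamped segments into .srt and .vtt text. The `Segment` tuple shape and the function names are assumptions made for this example, not the project's actual API.

```python
from typing import Iterable, Tuple

# Hypothetical segment shape: (start_seconds, end_seconds, text)
Segment = Tuple[float, float, str]


def format_timestamp(seconds: float, separator: str = ",") -> str:
    """Render seconds as HH:MM:SS<sep>mmm (SRT uses ',', WebVTT uses '.')."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{millis:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Build SRT text: a numbered cue, a timing line, then the cue text."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text.strip()}\n")
    return "\n".join(blocks)


def to_vtt(segments: Iterable[Segment]) -> str:
    """Build WebVTT text: a 'WEBVTT' header followed by unnumbered cues."""
    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        lines.append(f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}")
        lines.append(text.strip())
        lines.append("")
    return "\n".join(lines)
```

The two formats differ mainly in the header line, cue numbering, and the decimal separator in timestamps, so a single timestamp helper can serve both.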

🏗️ System Architecture

The project is structured as a decoupled monorepo designed for both local development and containerized production.

graph TD
    User([User]) -->|Uploads Media| WebUI[React Frontend]
    WebUI -->|API Request| Backend[FastAPI Server]
    
    subgraph Local AI Models
        Backend -->|Inference| Whisper[Whisper ASR]
        Backend -->|Batch| NLLB[NLLB-200 Translator]
        Backend -->|Synthesis| MMS[MMS TTS Engine]
    end
    
    subgraph Media Pipeline
        NLLB -->|Subtitles| FFmpeg[FFmpeg Native]
        MMS -->|Voiceover| FFmpeg
        FFmpeg -->|Merge| FinalVideo[Dubbed & Subtitled Video]
    end
    
    FinalVideo -->|Download| WebUI
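
The final merge step in the diagram could look roughly like the sketch below, which builds an FFmpeg argument list that burns subtitles into the video and swaps in a voiceover track. The function name, file names, and codec choices are assumptions for illustration; the project's real invocation may differ. The `subtitles` filter also requires an FFmpeg build with libass.

```python
from pathlib import Path
from typing import List


def build_ffmpeg_command(video: Path, voiceover: Path, subtitles: Path, output: Path) -> List[str]:
    """Return an FFmpeg argument list (hypothetical helper, not the project's code).

    Note: paths containing ':' or quotes need extra escaping inside the
    filter string; this sketch assumes simple relative paths.
    """
    return [
        "ffmpeg", "-y",                                 # overwrite output without prompting
        "-i", str(video),                               # input 0: original video
        "-i", str(voiceover),                           # input 1: synthesized voiceover
        "-vf", f"subtitles={subtitles.as_posix()}",     # burn subtitles into the frames
        "-map", "0:v:0",                                # take video from input 0
        "-map", "1:a:0",                                # take audio from input 1
        "-c:v", "libx264",                              # re-encode, since the filter alters frames
        "-c:a", "aac",
        "-shortest",                                    # stop at the shorter of the two streams
        str(output),
    ]
```

The list can be passed to `subprocess.run(cmd, check=True)`. Passing arguments as a list rather than a shell string avoids quoting problems with file names.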

🚀 Performance & Hardening

  • ⚡ Vectorized Batching: 2.4x speedup in translation by processing segments together in batches instead of one at a time.
  • 🔒 Thread Safety: RLock guards keep multi-user inference stable on Apple Silicon (MPS).
  • 🛡️ Repository Hygiene: Comprehensive .gitignore, .dockerignore, and .cursorignore patterns prevent ML weight bloat and accidental leaks of sensitive files.
  • 🔄 CI/CD Pipeline: Automated testing via GitHub Actions runs backend unit tests, frontend linting/build, E2E Playwright tests, and code quality checks on every push and pull request.
  • ⚖️ License Compliance: Licensed under AGPLv3 with built-in local-only warning banners for deployment transparency.
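
To make the batching and locking ideas above concrete, here is a minimal sketch of a translator wrapper that splits segments into batches and serializes model access with a re-entrant lock. The class name, the `translate_batch` callable, and the default batch size are hypothetical stand-ins; the actual backend code is not shown in this README.

```python
import threading
from typing import Callable, List, Sequence

# Hypothetical signature: takes a batch of source strings, returns translations.
BatchFn = Callable[[List[str]], List[str]]


class LockedBatchTranslator:
    """Feeds segments to a model in batches while guarding it with an RLock."""

    def __init__(self, translate_batch: BatchFn, batch_size: int = 8) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._translate_batch = translate_batch
        self._batch_size = batch_size
        # RLock lets the same thread re-enter if the model call nests.
        self._lock = threading.RLock()

    def translate(self, segments: Sequence[str]) -> List[str]:
        results: List[str] = []
        for start in range(0, len(segments), self._batch_size):
            batch = list(segments[start:start + self._batch_size])
            # Only one thread runs inference at a time.
            with self._lock:
                results.extend(self._translate_batch(batch))
        return results
```

Holding the lock per batch rather than for the whole request lets concurrent users interleave at batch boundaries, at the cost of slightly more lock traffic.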

💻 Installation & Setup

Requirements

  • Python: 3.8 - 3.11
  • Node.js: 18+
  • FFmpeg: Must be available on your system PATH.
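
A quick way to check these requirements before running the quickstart is a small preflight script along the lines below. This is a sketch only: the function names are made up for this example, and it checks that `node` is on PATH without verifying that its version is 18 or later.

```python
import shutil
import sys
from typing import List, Tuple


def python_version_ok(version: Tuple[int, int]) -> bool:
    """True if (major, minor) falls in the supported 3.8 to 3.11 range."""
    return (3, 8) <= version <= (3, 11)


def missing_tools(tools: Tuple[str, ...] = ("ffmpeg", "node")) -> List[str]:
    """Return the tools that cannot be found on the system PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    if not python_version_ok((major, minor)):
        print(f"Unsupported Python {major}.{minor}; expected 3.8 to 3.11")
    for tool in missing_tools():
        print(f"{tool} not found on PATH")
```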

Local Quickstart

🍎 macOS & Linux

chmod +x run.sh
./run.sh

🔌 Windows

start.bat

The portal will be accessible at http://localhost:8000.


🧪 Verification

Verify your local setup by running the backend regression suite:

python backend/test_pipeline.py

This checks hardware acceleration, translation quality, and TTS audio synthesis.


🧪 Testing & CI/CD

The project is tested at three levels: a local pre-push hook, a CI mock mode that skips model downloads, and Playwright end-to-end tests.

🛡️ Local Safety Net

A pre-push git hook is included to run basic checks locally before pushing code to GitHub.

  • Backend: Runs unit tests and pipeline validation in CI_MODE.
  • Frontend: Runs ESLint to maintain code quality.

To enable the pre-push hook (if not already active):

chmod +x .git/hooks/pre-push

🤖 CI Mock Mode

To avoid downloading large AI models (3GB+) in CI environments, the backend supports a CI_MODE. When CI_MODE=true is set:

  • Heavy ML inference is bypassed.
  • The system returns deterministic mock results for translation, transcription, and TTS.
  • Tests complete in seconds rather than minutes.
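
The behavior described above can be sketched as an environment check in front of the model call. The helper names, the case-insensitive comparison, and the mock output format here are assumptions for illustration; the backend's actual mock results are not documented in this README.

```python
import os
from typing import Callable, Mapping, Optional


def ci_mode_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """True when CI_MODE is set to 'true' (case-insensitive by assumption)."""
    env = os.environ if env is None else env
    return env.get("CI_MODE", "").strip().lower() == "true"


def translate(
    text: str,
    target_lang: str,
    real_translate: Callable[[str, str], str],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return a deterministic mock in CI mode, otherwise call the real model."""
    if ci_mode_enabled(env):
        # Hypothetical mock format: same input always yields the same output.
        return f"[mock:{target_lang}] {text}"
    return real_translate(text, target_lang)
```

Passing the environment as a parameter keeps the check easy to unit test without mutating `os.environ`.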

🎭 End-to-End (E2E) Tests

Comprehensive E2E tests are implemented using Playwright. These tests verify the entire application flow:

  1. Backend & Frontend initialization.
  2. Navigation and UI state changes.
  3. End-to-end translation requests (using mock results in CI).

Run E2E tests locally:

export CI_MODE=true
pytest testcase/e2e/test_app_flow.py

📄 Repository Governance

  • SECURITY.md: Vulnerability disclosure policy.
  • CODEOWNERS: Automated review assignments.
  • License: AGPLv3.