title: BIAF-offASR
emoji: 🌾
colorFrom: green
colorTo: gray
sdk: docker
app_port: 7860
pinned: false
BIAF-offASR: Offline Translation Portal
A local-first, zero-network platform built to process translation, subtitling, and voice dubbing for agricultural and rural development. Supporting Hindi, Marathi, and English, it is designed for field officers working in areas with limited or no connectivity.
Aesthetic & Vision
The portal features a refined editorial-rural UI: a unique blend of characterful serif typography (Alegreya) and high-legibility sans-serif (Hind), paired with a deep agricultural green palette. It moves away from generic "AI" aesthetics to provide a tactile, grounded experience for rural development professionals.
Key Capabilities
- Offline Text Translation: Instant translation between English, Hindi, and Marathi using optimized Seq2Seq models.
- Speech-to-Text (ASR): Transcribe audio/video with automated chunking and millisecond-accurate timestamps. Generates .srt and .vtt formats.
- Automated Video Dubbing: A complete pipeline that extracts audio, translates content, and burns subtitles directly into frames using FFmpeg.
- AI Voiceover Synthesis: Localized TTS in Hindi, Marathi, and English to replace original soundtracks with dubbed versions.
- Offline-First Status: Real-time monitoring of local model caches (Whisper, NLLB, MMS) ensures you know exactly when the system is ready for 100% offline use.
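The subtitle output mentioned above comes down to timestamp formatting. The following is a minimal sketch, not the project's actual exporter: the `Segment` tuple and the function names are assumptions made for illustration. It shows the one difference that matters between the two formats: SRT numbers its cues and uses a comma before the milliseconds, while WebVTT starts with a `WEBVTT` header and uses a period.

```python
from typing import Iterable, List, Tuple

# Hypothetical segment shape: (start_seconds, end_seconds, text).
Segment = Tuple[float, float, str]


def format_timestamp(seconds: float, sep: str) -> str:
    """Render seconds as HH:MM:SS<sep>mmm, rounded to the nearest millisecond."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{millis:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Build SRT text: numbered cues, comma as the millisecond separator."""
    blocks: List[str] = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)


def to_vtt(segments: Iterable[Segment]) -> str:
    """Build WebVTT text: WEBVTT header, period as the millisecond separator."""
    lines: List[str] = ["WEBVTT", ""]
    for start, end, text in segments:
        lines.append(f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}")
        lines.append(text)
        lines.append("")
    return "\n".join(lines)
```

Both functions take the same segment list, so a transcription result can be exported to either format without re-running ASR.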
System Architecture
The project is structured as a decoupled monorepo designed for both local development and containerized production.
- Frontend (React 19 + Vite 8): High-fidelity UI with glassmorphic cards, live processing feedback, and an interactive subtitle editor.
- Backend (FastAPI + PyTorch): High-throughput engine managing ML inference, vectorized batching, and system-level FFmpeg tasks.
graph TD
User([User]) -->|Uploads Media| WebUI[React Frontend]
WebUI -->|API Request| Backend[FastAPI Server]
subgraph Local AI Models
Backend -->|Inference| Whisper[Whisper ASR]
Backend -->|Batch| NLLB[NLLB-200 Translator]
Backend -->|Synthesis| MMS[MMS TTS Engine]
end
subgraph Media Pipeline
NLLB -->|Subtitles| FFmpeg[FFmpeg Native]
MMS -->|Voiceover| FFmpeg
FFmpeg -->|Merge| FinalVideo[Dubbed & Subtitled Video]
end
FinalVideo -->|Download| WebUI
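The final merge step in the diagram can be illustrated with a single FFmpeg invocation. This is a sketch under assumptions, not the backend's actual command: the function name, the file names, and the choice of `libx264` and `aac` are illustrative, and the `subtitles` filter requires an FFmpeg build with libass. The function only builds the argument list, so it can be inspected without FFmpeg installed.

```python
from typing import List


def build_dub_command(video: str, voiceover: str, subtitles: str, output: str) -> List[str]:
    """Build an FFmpeg argument list that burns subtitles and swaps the audio track.

    Assumption: paths contain no characters that need escaping inside an
    FFmpeg filter string (colons, quotes, backslashes).
    """
    return [
        "ffmpeg", "-y",                   # overwrite the output file if it exists
        "-i", video,                      # input 0: original video
        "-i", voiceover,                  # input 1: synthesized voiceover
        "-vf", f"subtitles={subtitles}",  # burn subtitles into the frames
        "-map", "0:v:0",                  # take video from input 0
        "-map", "1:a:0",                  # take audio from input 1
        "-c:v", "libx264",
        "-c:a", "aac",
        "-shortest",                      # stop at the shorter of the two streams
        output,
    ]
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would execute it; keeping the arguments as a list avoids shell quoting issues with user-supplied file names.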
Performance & Hardening
- Vectorized Batching: 2.4x faster translation by processing segments together in parallel batches.
- Thread Safety: RLock protections keep multi-user inference stable on Apple Silicon (MPS).
- Repository Hygiene: Comprehensive .gitignore, .dockerignore, and .cursorignore patterns prevent ML weight bloat and security leaks.
- CI/CD Pipeline: Automated testing via GitHub Actions runs backend unit tests, frontend linting/build, E2E Playwright tests, and code quality checks on every push and pull request.
- License Compliance: Licensed under AGPLv3 with built-in local-only warning banners for deployment transparency.
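The batching and locking ideas above can be combined in a few lines. The sketch below is an assumption about how such code might look, not the backend's implementation: `translate_batch` stands in for a real model call, and `_model_lock`, `batched`, and `translate_segments` are hypothetical names. Each batch goes to the model in one call, and a re-entrant lock serializes access so concurrent requests do not run inference on the same model at once.

```python
import threading
from typing import Callable, List, Sequence

# One re-entrant lock shared by every caller that touches the model.
_model_lock = threading.RLock()


def batched(items: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split items into consecutive chunks of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [list(items[i:i + batch_size]) for i in range(0, len(items), batch_size)]


def translate_segments(
    segments: Sequence[str],
    translate_batch: Callable[[List[str]], List[str]],
    batch_size: int = 8,
) -> List[str]:
    """Translate segments batch by batch, holding the lock during each model call."""
    results: List[str] = []
    for batch in batched(segments, batch_size):
        with _model_lock:  # only one thread runs inference at a time
            results.extend(translate_batch(batch))
    return results
```

An `RLock` rather than a plain `Lock` lets a function that already holds the lock call another locked helper on the same thread without deadlocking.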
Installation & Setup
Requirements
- Python: 3.8 - 3.11
- Node.js: 18+
- FFmpeg: Must be available on your system PATH.
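A quick preflight check can confirm two of these requirements before launching. This is a minimal sketch with hypothetical function names; it covers the Python version range and FFmpeg discovery only, and leaves the Node.js check out.

```python
import shutil
import sys
from typing import List, Optional, Tuple


def python_supported(version: Tuple[int, int]) -> bool:
    """True when (major, minor) falls in the 3.8 to 3.11 range listed above."""
    return (3, 8) <= version <= (3, 11)


def find_problems(version: Tuple[int, int], ffmpeg_path: Optional[str]) -> List[str]:
    """Return a human-readable message for each unmet requirement."""
    problems: List[str] = []
    if not python_supported(version):
        problems.append(f"Python {version[0]}.{version[1]} is outside the supported 3.8 - 3.11 range")
    if ffmpeg_path is None:
        problems.append("FFmpeg was not found on PATH")
    return problems
```

Calling `find_problems(sys.version_info[:2], shutil.which("ffmpeg"))` checks the current interpreter and PATH; an empty list means both requirements are met.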
Local Quickstart
macOS & Linux
chmod +x run.sh
./run.sh
Windows
start.bat
The portal will be accessible at http://localhost:8000.
Verification
Verify your local setup by running the backend regression suite:
python backend/test_pipeline.py
This checks hardware acceleration, translation quality, and TTS audio synthesis.
Testing & CI/CD
Testing is layered: a local pre-push hook, a CI mock mode that skips model downloads, and Playwright end-to-end tests.
Local Safety Net
A pre-push git hook is included to run basic checks locally before pushing code to GitHub.
- Backend: Runs unit tests and pipeline validation in CI_MODE.
- Frontend: Runs ESLint to maintain code quality.
To enable the pre-push hook (if not already active):
chmod +x .git/hooks/pre-push
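For readers curious what such a hook might contain, here is a hedged sketch. Git hooks can be any executable, so this one is in Python. It is not the hook shipped with the repository: `run_checks` and `CHECKS` are hypothetical names, and the lint command assumes a `frontend` directory with a `lint` npm script. Only `backend/test_pipeline.py` comes from this README.

```python
import os
import subprocess
import sys
from typing import Dict, Optional, Sequence


def run_checks(
    commands: Sequence[Sequence[str]],
    extra_env: Optional[Dict[str, str]] = None,
) -> int:
    """Run each command in order; return the first non-zero exit code, else 0."""
    env = {**os.environ, **(extra_env or {})}
    for command in commands:
        result = subprocess.run(list(command), env=env)
        if result.returncode != 0:
            return result.returncode  # stop at the first failing check
    return 0


# Hypothetical check list: the lint command is an assumption about the
# frontend's npm scripts and directory name.
CHECKS = [
    [sys.executable, "backend/test_pipeline.py"],
    ["npm", "--prefix", "frontend", "run", "lint"],
]
```

Ending the hook script with `sys.exit(run_checks(CHECKS, extra_env={"CI_MODE": "true"}))` would make Git abort the push whenever a check fails, since Git treats a non-zero exit from pre-push as a rejection.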
CI Mock Mode
To avoid downloading large AI models (3 GB+) in CI environments, the backend supports a CI_MODE flag. When CI_MODE=true is set:
- Heavy ML inference is bypassed.
- The system returns deterministic mock results for translation, transcription, and TTS.
- Tests complete in seconds rather than minutes.
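The mock-mode switch described above could be wired roughly as follows. This is a sketch under assumptions rather than the backend's code: `ci_mode_enabled`, `translate`, and the mock output format are invented for illustration, and accepting `True` or `TRUE` as well as `true` is a choice made here, not something the README specifies.

```python
import os
from typing import Callable, Mapping, Optional


def ci_mode_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """True when CI_MODE is set to 'true' (case-insensitive, an assumption)."""
    env = os.environ if environ is None else environ
    return env.get("CI_MODE", "").strip().lower() == "true"


def translate(
    text: str,
    target_lang: str,
    real_translate: Callable[[str, str], str],
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return a deterministic mock in CI mode; otherwise call the real model."""
    if ci_mode_enabled(environ):
        # Hypothetical mock format: same input always gives the same output.
        return f"[mock:{target_lang}] {text}"
    return real_translate(text, target_lang)
```

Because the real model is passed in as a callable and only invoked outside CI mode, a test run with `CI_MODE=true` never needs the model weights to be loaded.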
End-to-End (E2E) Tests
Comprehensive E2E tests are implemented using Playwright. These tests verify the entire application flow:
- Backend & Frontend initialization.
- Navigation and UI state changes.
- End-to-end translation requests (using mock results in CI).
Run E2E tests locally:
export CI_MODE=true
pytest testcase/e2e/test_app_flow.py
Repository Governance
- SECURITY.md: Vulnerability disclosure policy.
- CODEOWNERS: Automated review assignments.
- License: AGPLv3.
