arjun-ms

Initial commit: Subtrans Subtitle Pipeline

57bbccb 10 days ago

PRD.md — AI Subtitle Generator MVP

Goal

Build a simple web app where users can:

Upload a video
Generate English subtitles using AI speech-to-text
Translate subtitles into:
- Malayalam
- Tamil
- Hindi
Download .srt subtitle files

The MVP should be:

Extremely simple
Fast to build
Vibecoding-friendly
Localhost only

Core Features

1. Upload Video

Support:

.mp4
.mov
.mkv
.webm
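
A minimal sketch of how the upload step might reject unsupported files before any processing starts. The names `SUPPORTED_EXTENSIONS` and `is_supported_video`, and the case-insensitive matching, are illustrative assumptions rather than part of the spec.

```python
from pathlib import Path

# Extensions from the list above; the constant name is an illustrative choice.
SUPPORTED_EXTENSIONS = {".mp4", ".mov", ".mkv", ".webm"}


def is_supported_video(filename: str) -> bool:
    """Return True if the file name ends in a supported video extension."""
    # Path.suffix keeps the leading dot; lower() also accepts names like "CLIP.MP4".
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

Checking the extension only filters obvious mistakes. A file that passes can still fail later if FFmpeg cannot decode it.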

2. Extract Audio

Use FFmpeg to extract audio from the uploaded video.

Example:

ffmpeg -i input.mp4 -ar 16000 -ac 1 output.wav
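
The same command could be run from Python with the standard `subprocess` module. This is a sketch under assumptions: the function names are hypothetical, and the `-y` (overwrite) and `-vn` (no video) flags are additions that do not appear in the command above.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path: str, audio_path: str) -> list[str]:
    """Build the argument list for the ffmpeg call shown above."""
    return [
        "ffmpeg",
        "-y",              # assumption: overwrite the output if it already exists
        "-i", video_path,  # input video
        "-vn",             # assumption: ignore the video stream
        "-ar", "16000",    # resample to 16 kHz
        "-ac", "1",        # downmix to mono
        audio_path,
    ]


def extract_audio(video_path: str, audio_path: str) -> Path:
    """Run ffmpeg and return the path of the extracted WAV file."""
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(
        build_ffmpeg_command(video_path, audio_path),
        check=True,
        capture_output=True,
    )
    return Path(audio_path)
```

Passing the arguments as a list avoids shell quoting problems with uploaded file names that contain spaces.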

3. Speech to Text

Use local:

faster-whisper

Generate:

English transcript
English .srt
Timestamps

MVP Decision

The MVP will use local Faster-Whisper instead of cloud APIs.

Why?

Free
Fast enough for short videos
Better privacy
Works offline once the model has been downloaded
Easy localhost setup
Easy to vibecode

Suggested Model

Start with:

base

Upgrade later if needed:

small
medium

Example

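from faster_whisper import WhisperModel

# Load the model by its size name.
model = WhisperModel("base")
# segments is a lazy generator; transcription runs as it is iterated.
segments, info = model.transcribe("audio.wav")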
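
Because the segments are produced lazily, it may help to collect them once into plain records that later steps can reuse. The sketch below assumes each segment exposes `start`, `end`, and `text` attributes, as faster-whisper segments do. The `Cue` type and `collect_cues` function are hypothetical names.

```python
from typing import Iterable, NamedTuple


class Cue(NamedTuple):
    """One subtitle entry: start and end in seconds, plus the spoken text."""
    start: float
    end: float
    text: str


def collect_cues(segments: Iterable) -> list[Cue]:
    """Consume the segment iterator and keep only the fields an .srt needs."""
    cues = []
    for segment in segments:
        text = segment.text.strip()  # remove surrounding whitespace
        if text:                     # skip segments with no text
            cues.append(Cue(segment.start, segment.end, text))
    return cues
```

Calling `collect_cues(segments)` on the result of `model.transcribe(...)` runs the transcription to completion and returns a list that can be iterated more than once.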

4. Translate Subtitles

Use a small translation adapter layer.

The app should NOT directly depend on one translation provider.

This makes it easy to:

start simple
swap providers later
experiment with better translation models

MVP Translation Provider

Start with:

deep-translator (wraps online services such as Google Translate, so this step needs an internet connection)

Translate English subtitles into:

Malayalam (ml)
Tamil (ta)
Hindi (hi)

Future Translation Provider

Later we can swap in:

IndicTrans2
LibreTranslate
OpenAI models
Other local translation models

without changing the main application flow.

Suggested Adapter Design

services/
└── translators/
    ├── base.py
    ├── deep_translator_adapter.py
    └── indictrans_adapter.py

Example Interface

class Translator:
    """Common interface that every translation adapter implements."""

    def translate(self, text: str, target_lang: str) -> str:
        raise NotImplementedError

Example MVP Usage

translator = DeepTranslatorAdapter()
translated = translator.translate(text, "ml")
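
One possible shape for the adapter is sketched below. It assumes the adapter wraps deep-translator's `GoogleTranslator` with English as the source language. `EchoTranslator` and `translate_lines` are hypothetical helpers added for illustration, the first as an offline stand-in for tests.

```python
class Translator:
    """Interface shared by all translation adapters."""

    def translate(self, text: str, target_lang: str) -> str:
        raise NotImplementedError


class DeepTranslatorAdapter(Translator):
    """Adapter backed by deep-translator's GoogleTranslator (needs internet)."""

    def translate(self, text: str, target_lang: str) -> str:
        # Imported lazily so the rest of the app loads without this package.
        from deep_translator import GoogleTranslator

        return GoogleTranslator(source="en", target=target_lang).translate(text)


class EchoTranslator(Translator):
    """Hypothetical offline stand-in: tags the text instead of translating it."""

    def translate(self, text: str, target_lang: str) -> str:
        return f"[{target_lang}] {text}"


def translate_lines(translator: Translator, lines: list[str], target_lang: str) -> list[str]:
    """Translate each subtitle line, leaving blank lines untouched."""
    return [
        translator.translate(line, target_lang) if line.strip() else line
        for line in lines
    ]
```

Since `translate_lines` only depends on the `Translator` interface, swapping providers means passing a different adapter instance and nothing else.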

5. Generate `.srt`

Generate downloadable subtitle files.

Example:

1
00:00:01,000 --> 00:00:03,000
Hello everyone
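
The format above can be produced with plain string formatting, as in this sketch. It does not use pysrt, which the tech stack lists and which could fill the same role. The function names and the `(start, end, text)` tuple layout are assumptions.

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def build_srt(cues: list[tuple[float, float, str]]) -> str:
    """Render (start, end, text) tuples as numbered SRT blocks."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # Joining with a newline leaves one blank line between blocks.
    return "\n".join(blocks)
```

For example, `build_srt([(1.0, 3.0, "Hello everyone")])` yields the three-line block shown above.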

Tech Stack

Backend

FastAPI

Frontend

HTML
CSS
Minimal JavaScript
Jinja2 Templates

AI/Processing

Faster-Whisper
FFmpeg
deep-translator
pysrt

Simple Architecture

Upload Video
   ↓
Extract Audio
   ↓
Whisper Transcription
   ↓
Translate Text
   ↓
Generate .srt
   ↓
Download File
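
The flow above could be wired together as one function that calls each stage in order. This is only a sketch: `run_pipeline` and its parameters are hypothetical, and each stage is passed in as a callable so the order can be tried out without FFmpeg or Whisper installed.

```python
from typing import Callable

Cues = list[tuple[float, float, str]]  # (start seconds, end seconds, text)


def run_pipeline(
    video_path: str,
    target_lang: str,
    extract_audio: Callable[[str], str],
    transcribe: Callable[[str], Cues],
    translate: Callable[[str, str], str],
    write_srt: Callable[[Cues, str], str],
) -> dict[str, str]:
    """Run the stages in order and return the paths of both .srt files."""
    audio_path = extract_audio(video_path)
    cues = transcribe(audio_path)
    english_srt = write_srt(cues, "en")
    # Timings are reused; only the text of each cue is translated.
    translated = [(start, end, translate(text, target_lang)) for start, end, text in cues]
    translated_srt = write_srt(translated, target_lang)
    return {"en": english_srt, target_lang: translated_srt}
```

The English file is written before translation starts, so a translation failure would still leave a usable English .srt on disk.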

Suggested Folder Structure

app/
├── main.py
├── templates/
│   └── index.html
├── static/
│   └── styles.css
├── uploads/
├── subtitles/
└── services/
    ├── transcribe.py
    ├── translate.py
    └── srt_generator.py

Main UI

Single page with:

Upload input
Language dropdown
Generate button
Loading spinner
Download links

Main API

Generate Subtitles

POST /generate-subtitles

Inputs:

video file
target language

Outputs:

English .srt
Translated .srt
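
Before any processing, the endpoint would need to check the target language and decide what to call the two output files. The sketch below is framework-agnostic. The function name, the dictionary keys, and the `<name>.<lang>.srt` naming scheme are assumptions, not part of the spec.

```python
from pathlib import Path

# Target languages from the spec, keyed by their language codes.
TARGET_LANGUAGES = {"ml": "Malayalam", "ta": "Tamil", "hi": "Hindi"}


def plan_outputs(video_filename: str, target_lang: str) -> dict[str, str]:
    """Validate the target language and name the two .srt files to return."""
    if target_lang not in TARGET_LANGUAGES:
        raise ValueError(f"Unsupported target language: {target_lang!r}")
    # Path.stem uses only the final path component, without its extension.
    stem = Path(video_filename).stem
    return {
        "english_srt": f"{stem}.en.srt",
        "translated_srt": f"{stem}.{target_lang}.srt",
    }
```

Inside a FastAPI handler, a `ValueError` like this would typically be turned into a 400 response.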

Suggested Dependencies

fastapi
uvicorn
jinja2
python-multipart
faster-whisper
ffmpeg-python (Python wrapper only; the FFmpeg binary must be installed separately)
deep-translator
pysrt
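
A quick startup check can catch missing pieces early. The sketch below uses only the standard library. The function names are hypothetical, and the module names in the usage note are the import names I would expect for these packages, so verify them against your installed versions.

```python
import importlib.util
import shutil


def missing_modules(module_names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable is found on PATH."""
    return shutil.which("ffmpeg") is not None
```

For example, `missing_modules(["fastapi", "uvicorn", "jinja2", "faster_whisper", "deep_translator", "pysrt"])` returns an empty list when all of those are installed.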

Run Locally

uvicorn app.main:app --reload

MVP Rules

Keep everything in ONE FastAPI app
Store files locally
Use synchronous processing (no background jobs or queues)
No authentication
No database
No React
No Docker initially
No microservices
No overengineering

Build Order

Upload video
Extract audio
Generate English transcript
Generate English .srt
Add translation
Generate translated .srt
Improve UI later

Success Criteria

The MVP is successful if:

Video upload works
English subtitles are generated
Translation works
.srt download works
End-to-end pipeline works locally