Subtitle Generator
Automatically generate subtitles from audio/video files using OpenAI's Whisper model.
Features
- Speech-to-Text: Transcribe audio using Whisper
- Auto Language Detection: Automatically detects the source language
- Word-Level Timing: Shows timing for each individual word
- SRT Export: Download subtitles in standard SRT format
- Video Support: Extract audio from video files (MP4, MOV, MKV, AVI)
- Story Mode: Merge short sentences into longer segments
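As a rough illustration of the SRT export feature, the sketch below renders timed cues in the standard SRT layout (a counter, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, then the text). The `Cue` class and the function names are hypothetical and are not taken from this project's code.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """Hypothetical container for one subtitle entry."""
    start: float  # seconds from the beginning of the media
    end: float    # seconds from the beginning of the media
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm (SRT puts a comma before milliseconds)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: list[Cue]) -> str:
    """Render cues as numbered SRT blocks separated by blank lines."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(cue.start)} --> {srt_timestamp(cue.end)}\n"
            f"{cue.text.strip()}\n"
        )
    return "\n".join(blocks)
```

Calling `to_srt([Cue(0.0, 1.5, "Hello"), Cue(1.5, 3.0, "World")])` yields two numbered blocks that most video players should accept as a `.srt` file.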
How to Use
1. Upload an audio or video file
   - Audio: MP3, WAV, OGG, M4A, FLAC, AAC
   - Video: MP4, MOV, MKV, AVI
2. Select Whisper model size
   - tiny: Fastest, least accurate
   - base: Balanced (recommended)
   - small: More accurate, slower
   - medium: Most accurate, slowest
3. Choose processing mode
   - Podcast: Keep original sentence structure
   - Story: Merge short sentences (max 300 chars, 12 sec)
4. Click "Generate Subtitles" and wait for processing
5. Download the SRT file
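The Story mode limits (300 characters, 12 seconds) suggest a greedy merge of neighbouring segments. The sketch below is one possible reading of that rule, not the app's actual implementation: the `Segment` class, the function name, and the choice to measure the 12 seconds from the start of the merged segment to the end of the new one are all assumptions.

```python
from dataclasses import dataclass

MAX_CHARS = 300      # limit quoted in the Story mode description
MAX_SECONDS = 12.0   # limit quoted in the Story mode description


@dataclass
class Segment:
    """Hypothetical container for one transcribed sentence."""
    start: float  # seconds
    end: float    # seconds
    text: str


def merge_for_story(segments: list[Segment]) -> list[Segment]:
    """Greedily append each segment to the previous one while both limits hold."""
    merged: list[Segment] = []
    for seg in segments:
        if merged:
            last = merged[-1]
            combined = f"{last.text} {seg.text.strip()}"
            fits_chars = len(combined) <= MAX_CHARS
            fits_time = seg.end - last.start <= MAX_SECONDS
            if fits_chars and fits_time:
                merged[-1] = Segment(last.start, seg.end, combined)
                continue
        # Either this is the first segment or a limit would be exceeded.
        merged.append(Segment(seg.start, seg.end, seg.text.strip()))
    return merged
```

Under this reading, Podcast mode would simply skip the merge and keep the segments as transcribed.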
Model Information
| Model | Speed | Accuracy | Use Case |
|---|---|---|---|
| tiny | Fastest | Low | Quick previews |
| base | Fast | Good | General use (recommended) |
| small | Medium | Better | Podcasts, lectures |
| medium | Slow | Best | Professional transcription |
Technical Details
- Uses faster-whisper for efficient CPU inference
- Voice activity detection (VAD) filter skips silent stretches so only speech is transcribed
- Word timestamps for precise alignment
- Auto language detection for multilingual support
- Smart number formatting (e.g., "1030" → "10:30" in time context)
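The first four points above map onto options of the faster-whisper library. The following is a minimal sketch of how they could be combined, not this app's actual code: the function name is hypothetical, and `compute_type="int8"` is an assumed CPU-friendly setting that the notes here do not confirm.

```python
def transcribe_with_words(path: str, model_size: str = "base"):
    """Return (detected_language, [(start, end, word), ...]) for an audio file."""
    # Imported lazily so this module loads even where faster-whisper is absent.
    from faster_whisper import WhisperModel

    # compute_type="int8" is an assumption; it is a common choice for CPU inference.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")

    # No language is passed, so the model detects it from the audio.
    segments, info = model.transcribe(
        path,
        vad_filter=True,       # skip non-speech regions
        word_timestamps=True,  # attach per-word start/end times
    )

    words = []
    for segment in segments:  # segments is a generator; iterating runs decoding
        for word in segment.words or []:
            words.append((word.start, word.end, word.word.strip()))
    return info.language, words
```

A call such as `transcribe_with_words("talk.mp3", "small")` would return the detected language code together with the timed words, which can then be grouped into subtitle cues.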
Notes
- First transcription will download the model (~150MB for base model)
- Processing time depends on file length and model size
- Videos are processed by extracting audio first
- All processing happens on CPU (no GPU required)
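Extracting audio from a video before transcription is commonly done with the `ffmpeg` command-line tool. The sketch below assumes `ffmpeg` is installed and on the PATH; this README does not say which tool the app uses, and the function names are hypothetical. Mono 16 kHz output is chosen because Whisper models operate on 16 kHz audio.

```python
import subprocess
from pathlib import Path


def build_extract_command(video_path: str, wav_path: str) -> list[str]:
    """Build an ffmpeg command that writes the audio track as mono 16 kHz WAV."""
    return [
        "ffmpeg",
        "-y",             # overwrite the output file if it already exists
        "-i", video_path,
        "-vn",            # drop the video stream
        "-ac", "1",       # downmix to one channel
        "-ar", "16000",   # resample to 16 kHz
        wav_path,
    ]


def extract_audio(video_path: str) -> str:
    """Run ffmpeg and return the path of the extracted WAV file."""
    wav_path = str(Path(video_path).with_suffix(".wav"))
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_extract_command(video_path, wav_path), check=True)
    return wav_path
```

The resulting WAV path can then be handed to the transcription step like any uploaded audio file.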