title: SRT Processing Tool
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: false
license: mit
🎬 SRT Processing Tool
A production-ready web application for processing SRT subtitle files, powered by Gradio and ready for Hugging Face Spaces.
Resegment and translate your subtitle files easily in your browser!
✨ Features
- 🎤 Audio to SRT: Transcribe audio files using NVIDIA Parakeet TDT
- SRT Resegmentation: Optimize subtitle segments by character limits, respecting punctuation boundaries
- SRT Translation: Translate subtitle files using AI (OpenAI, Aliyun DashScope, or OpenRouter)
- ⚡ One-Stop Workflow: Transcribe, resegment, and translate in a single integrated process!
- Production Ready: Optimized for Hugging Face Spaces deployment
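As a rough illustration of the resegmentation idea, the sketch below splits text into chunks of at most `max_chars` characters, preferring to break right after punctuation and falling back to the last space. The function name `resegment_text` and the punctuation set are assumptions made for this example; the actual logic in tools/srt_processor.py may work differently (for instance, it also has to handle timestamps).

```python
# Assumed set of preferred break characters (not taken from the tool itself).
PUNCTUATION = ".,;:!?。，；：！？"


def resegment_text(text: str, max_chars: int = 125) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Prefers breaking just after punctuation, then at a space, and only
    hard-cuts when the current window contains neither.
    """
    if max_chars < 1:
        raise ValueError("max_chars must be positive")
    segments = []
    remaining = text.strip()
    while len(remaining) > max_chars:
        window = remaining[:max_chars]
        # Position just after the latest punctuation mark in the window (0 if none).
        cut = max((window.rfind(p) + 1 for p in PUNCTUATION), default=0)
        if cut == 0:
            cut = window.rfind(" ")  # fall back to the last space
        if cut <= 0:
            cut = max_chars  # no natural boundary: hard cut
        segments.append(remaining[:cut].strip())
        remaining = remaining[cut:].strip()
    if remaining:
        segments.append(remaining)
    return segments
```

Calling `resegment_text(line, max_chars=125)` returns a list of strings, each within the limit.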
Live Demo
Try it live: https://huggingface.co/spaces/BiliSakura/SRT-Processing-Tool
This app is deployed on Hugging Face Spaces! To deploy your own version:
- Fork this repository
- Go to Hugging Face Spaces
- Create a new Space
- Connect your GitHub repository
- Select Gradio as the SDK
- Set the app file to app.py
- Add your API keys as secrets (see below)
- Deploy!
API Keys Configuration
For translation features, add your API keys as secrets in Hugging Face Spaces:
- Go to your Space settings
- Navigate to "Variables and secrets"
- Add the following secrets:
Required Secrets (choose based on provider):
- Aliyun DashScope: DASHSCOPE_API_KEY
- OpenAI: OPENAI_API_KEY
- OpenRouter: OPENROUTER_API_KEY
Optional Secrets (for OpenRouter attribution):
- OPENROUTER_SITE_URL (maps to HTTP-Referer)
- OPENROUTER_APP_TITLE (maps to X-Title)
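For illustration, the mapping from these secrets to OpenRouter request headers could look like the sketch below. The helper name `openrouter_headers` is hypothetical, and the app's actual request code may differ; only the variable names and the two attribution headers come from this README.

```python
import os


def openrouter_headers(env=None) -> dict[str, str]:
    """Build OpenRouter request headers from environment-style settings."""
    env = os.environ if env is None else env
    api_key = env.get("OPENROUTER_API_KEY")
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    headers = {"Authorization": f"Bearer {api_key}"}
    # Optional attribution headers, only sent when configured.
    if env.get("OPENROUTER_SITE_URL"):
        headers["HTTP-Referer"] = env["OPENROUTER_SITE_URL"]
    if env.get("OPENROUTER_APP_TITLE"):
        headers["X-Title"] = env["OPENROUTER_APP_TITLE"]
    return headers
```

Passing a plain dict instead of relying on `os.environ` makes the helper easy to check without touching real secrets.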
📦 Local Installation
# Clone the repository
git clone https://huggingface.co/spaces/BiliSakura/SRT-Processing-Tool
cd SRT-Processing-Tool
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
Local Run
python app.py
The app will be available at http://localhost:7860
Usage
- Open the app in your browser
- Select Input Type: SRT File or Audio File
- Upload your file
- Choose operation:
  - Transcribe only (Audio only): Just transcribe audio to SRT
  - Translate only: Translate subtitles to target language
  - Resegment only: Optimize subtitle segments by character limits
- Configure settings:
  - Translation Settings: Target language, provider, model, workers
  - Resegmentation Settings: Maximum characters per segment
- Click "Process File"
- Download your processed file!
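The "workers" translation setting suggests that subtitle cues are translated concurrently. How the app does this internally is not shown here; the sketch below is one common way to do it, using a thread pool that keeps results in the original cue order. `translate_cues` and `fake_translate` are hypothetical names, and `fake_translate` stands in for a real provider call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def translate_cues(
    texts: list[str],
    translate: Callable[[str], str],
    workers: int = 5,
) -> list[str]:
    """Translate cue texts concurrently and return them in the original order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, whichever call finishes first.
        return list(pool.map(translate, texts))


def fake_translate(text: str) -> str:
    """Stand-in for a real provider request (hypothetical)."""
    return f"[zh] {text}"
```

Threads suit this workload because each call mostly waits on network I/O, so a higher worker count trades provider rate limits against throughput.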
🔧 Configuration
ASR Model
- NVIDIA Parakeet TDT: nvidia/parakeet-tdt-0.6b-v3 (default)
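Whatever the ASR model returns has to be rendered as SRT cues. The sketch below shows that formatting step only, assuming the transcription has already been reduced to `(start_seconds, end_seconds, text)` tuples; that tuple shape and both function names are assumptions for this example, not the output format of NeMo or of tools/audio_transcriber.py.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Render (start, end, text) tuples as SRT text with 1-based cue numbers."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```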
Default Models
- OpenAI: gpt-4.1 (uses Responses API)
- Aliyun DashScope: qwen-max
- OpenRouter: openai/gpt-4o
Environment Variables
You can also use a .env file for local development:
# Aliyun DashScope
DASHSCOPE_API_KEY=your_key_here
# OpenAI
OPENAI_API_KEY=your_key_here
# OpenRouter
OPENROUTER_API_KEY=your_key_here
OPENROUTER_SITE_URL=https://your-site.com
OPENROUTER_APP_TITLE=Your App Title
# Optional: override model for all providers
MODEL=your_model_name
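To make the interplay between the default models and the MODEL override concrete, here is a minimal sketch of a resolver. The defaults are the ones listed in this README; the precedence order (explicit argument, then MODEL, then provider default) and the name `resolve_model` are assumptions, so check the app's code for the real behavior.

```python
import os
from typing import Optional

# Defaults as listed in this README.
DEFAULT_MODELS = {
    "openai": "gpt-4.1",
    "dashscope": "qwen-max",
    "openrouter": "openai/gpt-4o",
}


def resolve_model(provider: str, explicit: Optional[str] = None, env=None) -> str:
    """Pick a model name: explicit value, then MODEL, then the provider default."""
    env = os.environ if env is None else env
    if explicit:
        return explicit
    if env.get("MODEL"):
        # Assumed: MODEL applies to every provider, as the comment above says.
        return env["MODEL"]
    try:
        return DEFAULT_MODELS[provider]
    except KeyError:
        raise ValueError(f"Unknown provider: {provider!r}") from None
```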
💻 CLI Usage
You can also use the SRT processor from the command line:
# Resegment only
python tools/srt_processor.py input.srt output.srt --operation resegment --max-chars 125
# Translate (OpenAI)
python tools/srt_processor.py input.srt output.srt --operation translate --target-lang zh --provider openai --model gpt-4.1 --workers 5
# Translate (OpenRouter)
python tools/srt_processor.py input.srt output.srt --operation translate --target-lang zh --provider openrouter --model openai/gpt-4o --workers 5
# Translate (DashScope)
python tools/srt_processor.py input.srt output.srt --operation translate --target-lang zh --provider dashscope --model qwen-max --workers 5
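To process a whole folder, the resegment command above can be generated per file from Python. This is a sketch that only builds the argument lists, using the flags shown in the CLI examples; the helper names and the directory layout are assumptions.

```python
import sys
from pathlib import Path


def build_command(src: Path, dst: Path, max_chars: int = 125) -> list[str]:
    """Build the resegment command line for one file."""
    return [
        sys.executable, "tools/srt_processor.py", str(src), str(dst),
        "--operation", "resegment", "--max-chars", str(max_chars),
    ]


def plan_batch(input_dir: Path, output_dir: Path) -> list[list[str]]:
    """Return one command per .srt file in input_dir, sorted by file name."""
    return [
        build_command(src, output_dir / src.name)
        for src in sorted(input_dir.glob("*.srt"))
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, so that a failing file stops the batch.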
Project Structure
.
├── app.py                    # Main Gradio application
├── tools/
│   ├── __init__.py
│   ├── srt_processor.py      # Core SRT processing logic
│   └── audio_transcriber.py  # Audio transcription (NeMo ASR)
├── requirements.txt          # Python dependencies
└── README.md                 # This file
License
MIT License
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Made with ❤️ for subtitle processing