
metadata

title: SRT Processing Tool
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: false
license: mit

🎬 SRT Processing Tool

A production-ready web application for processing SRT subtitle files, powered by Gradio and ready for Hugging Face Spaces.

Resegment and translate your subtitle files easily in your browser!

✨ Features

🎤 Audio to SRT: Transcribe audio files using NVIDIA Parakeet TDT
🔄 SRT Resegmentation: Optimize subtitle segments by character limits, respecting punctuation boundaries
🌍 SRT Translation: Translate subtitle files using AI (OpenAI, Aliyun DashScope, or OpenRouter)
⚡ One-Stop Workflow: Transcribe, resegment, and translate in a single integrated process!
🚀 Production Ready: Optimized for Hugging Face Spaces deployment
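
To make the resegmentation feature more concrete, here is a minimal sketch of splitting text by a character limit while preferring punctuation boundaries. It is not the code in tools/srt_processor.py: the function name `split_by_char_limit` and the punctuation set are assumptions, and the sketch handles text only, not subtitle timestamps.

```python
PUNCTUATION = ".,;:!?。，；：！？"  # assumed set of preferred break characters


def split_by_char_limit(text: str, max_chars: int) -> list[str]:
    """Split text into chunks of at most max_chars, preferring punctuation breaks."""
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    chunks = []
    remaining = text.strip()
    while len(remaining) > max_chars:
        window = remaining[:max_chars]
        # Prefer the last punctuation mark inside the window (rfind gives -1 if absent).
        cut = max(window.rfind(ch) for ch in PUNCTUATION) + 1
        if cut == 0:
            # No punctuation: fall back to the last space, then to a hard cut.
            space = window.rfind(" ")
            cut = space if space > 0 else max_chars
        chunks.append(remaining[:cut].strip())
        remaining = remaining[cut:].strip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

For example, `split_by_char_limit("Hello world, this is a test. Another sentence here.", 30)` breaks after the first full stop rather than in the middle of a word.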

🚀 Live Demo

Try it live: https://huggingface.co/spaces/BiliSakura/SRT-Processing-Tool

This app is deployed on Hugging Face Spaces! To deploy your own version:

1. Fork this repository
2. Go to Hugging Face Spaces
3. Create a new Space
4. Connect your GitHub repository
5. Select Gradio as the SDK
6. Set the app file to app.py
7. Add your API keys as secrets (see below)
8. Deploy!

🔑 API Keys Configuration

For translation features, add your API keys as secrets in Hugging Face Spaces:

1. Go to your Space settings
2. Navigate to "Variables and secrets"
3. Add the following secrets:

Required Secrets (choose based on provider):

Aliyun DashScope: DASHSCOPE_API_KEY
OpenAI: OPENAI_API_KEY
OpenRouter: OPENROUTER_API_KEY

Optional Secrets (for OpenRouter attribution):

OPENROUTER_SITE_URL (maps to HTTP-Referer)
OPENROUTER_APP_TITLE (maps to X-Title)
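
As an illustration of the mapping above, the sketch below builds OpenRouter request headers from these secrets. The helper name `build_openrouter_headers` is hypothetical and this is not necessarily how app.py does it. The `Authorization: Bearer` scheme is the one OpenRouter documents for API keys.

```python
import os


def build_openrouter_headers(env=None) -> dict[str, str]:
    """Build request headers from the OpenRouter secrets (hypothetical helper)."""
    env = os.environ if env is None else env
    api_key = env.get("OPENROUTER_API_KEY")
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    headers = {"Authorization": f"Bearer {api_key}"}
    # Optional attribution headers; each is omitted when its secret is absent.
    site_url = env.get("OPENROUTER_SITE_URL")
    if site_url:
        headers["HTTP-Referer"] = site_url
    app_title = env.get("OPENROUTER_APP_TITLE")
    if app_title:
        headers["X-Title"] = app_title
    return headers
```

Passing a plain dict as `env` makes the helper easy to exercise without touching the real environment.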

📦 Local Installation

# Clone the repository
git clone https://huggingface.co/spaces/BiliSakura/SRT-Processing-Tool
cd SRT-Processing-Tool

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

🏃 Local Run

python app.py

The app will be available at http://localhost:7860

📖 Usage

1. Open the app in your browser
2. Select Input Type: SRT File or Audio File
3. Upload your file
4. Choose operation:
   - Transcribe only (audio input only): Transcribe audio to SRT
   - Translate only: Translate subtitles to the target language
   - Resegment only: Optimize subtitle segments by character limits
5. Configure settings:
   - Translation Settings: Target language, provider, model, workers
   - Resegmentation Settings: Maximum characters per segment
6. Click "🚀 Process File"
7. Download your processed file!
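
The workers setting suggests that subtitle segments are translated in parallel. Below is a minimal sketch of that pattern, assuming a thread pool and a hypothetical `translate_one` callable that stands in for the provider request. The tool's actual concurrency model may differ.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def translate_segments(
    segments: list[str],
    translate_one: Callable[[str], str],  # hypothetical stand-in for a provider call
    workers: int = 5,
) -> list[str]:
    """Translate segments concurrently, returning results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # so subtitle order is preserved regardless of completion order.
        return list(pool.map(translate_one, segments))
```

Because network requests spend most of their time waiting, threads are usually sufficient here. A higher worker count mainly trades speed against the provider's rate limits.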

🔧 Configuration

ASR Model

NVIDIA Parakeet TDT: nvidia/parakeet-tdt-0.6b-v3 (default)

Default Models

OpenAI: gpt-4.1 (uses Responses API)
Aliyun DashScope: qwen-max
OpenRouter: openai/gpt-4o

Environment Variables

You can also use a .env file for local development:

# Aliyun DashScope
DASHSCOPE_API_KEY=your_key_here

# OpenAI
OPENAI_API_KEY=your_key_here

# OpenRouter
OPENROUTER_API_KEY=your_key_here
OPENROUTER_SITE_URL=https://your-site.com
OPENROUTER_APP_TITLE=Your App Title

# Optional: override model for all providers
MODEL=your_model_name
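
One way to read the override rule above is sketched below: `MODEL`, when set, wins over the per-provider default. The function name `resolve_model` is hypothetical, the dictionary keys follow the `--provider` values used in the CLI examples, and how the real tool ranks `MODEL` against an explicit `--model` flag is not specified here.

```python
import os

# Defaults as listed under "Default Models"; keys assumed to match --provider values.
DEFAULT_MODELS = {
    "openai": "gpt-4.1",
    "dashscope": "qwen-max",
    "openrouter": "openai/gpt-4o",
}


def resolve_model(provider: str, env=None) -> str:
    """Return the MODEL override if set, else the provider's default model."""
    env = os.environ if env is None else env
    override = env.get("MODEL")
    if override:
        return override
    try:
        return DEFAULT_MODELS[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
```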

💻 CLI Usage

You can also use the SRT processor from the command line:

# Resegment only
python tools/srt_processor.py input.srt output.srt --operation resegment --max-chars 125

# Translate (OpenAI)
python tools/srt_processor.py input.srt output.srt --operation translate --target-lang zh --provider openai --model gpt-4.1 --workers 5

# Translate (OpenRouter)
python tools/srt_processor.py input.srt output.srt --operation translate --target-lang zh --provider openrouter --model openai/gpt-4o --workers 5

# Translate (DashScope)
python tools/srt_processor.py input.srt output.srt --operation translate --target-lang zh --provider dashscope --model qwen-max --workers 5
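
For readers who want to wrap or script the CLI, the flags used in the commands above can be summarized as an argparse sketch. This mirrors only the flags shown in the examples. The real parser in tools/srt_processor.py may define more options, different types, or defaults, so no defaults are assumed here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of a parser covering only the flags shown in the examples."""
    parser = argparse.ArgumentParser(description="Process an SRT file")
    parser.add_argument("input")   # path to the source .srt file
    parser.add_argument("output")  # path for the processed .srt file
    parser.add_argument("--operation", required=True)  # e.g. resegment, translate
    parser.add_argument("--max-chars", type=int)       # used by resegment
    parser.add_argument("--target-lang")               # used by translate
    parser.add_argument("--provider")                  # openai, openrouter, dashscope
    parser.add_argument("--model")
    parser.add_argument("--workers", type=int)
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```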

🏗️ Project Structure

.
├── app.py                 # Main Gradio application
├── tools/
│   ├── __init__.py
│   ├── srt_processor.py   # Core SRT processing logic
│   └── audio_transcriber.py # Audio transcription (NeMo ASR)
├── requirements.txt       # Python dependencies
└── README.md             # This file

📝 License

MIT License

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Made with ❤️ for subtitle processing