title: SRT Processing Tool
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: false
license: mit

🎬 SRT Processing Tool

A production-ready web application for processing SRT subtitle files, built with Gradio and deployable on Hugging Face Spaces.

Transcribe audio, then resegment and translate your subtitle files in your browser!

✨ Features

  • 🎤 Audio to SRT: Transcribe audio files using NVIDIA Parakeet TDT
  • 🔄 SRT Resegmentation: Optimize subtitle segments by character limits, respecting punctuation boundaries
  • 🌍 SRT Translation: Translate subtitle files using AI (OpenAI, Aliyun DashScope, or OpenRouter)
  • ⚡ One-Stop Workflow: Transcribe, resegment, and translate in a single integrated process!
  • 🚀 Production Ready: Optimized for Hugging Face Spaces deployment
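
To illustrate what "character limits, respecting punctuation boundaries" can mean in practice, here is a minimal sketch. It is not the code in tools/srt_processor.py: the function name `resegment_text` is hypothetical, it works on plain text only (subtitle timing is ignored), and it only recognizes ASCII punctuation followed by whitespace.

```python
import re


def resegment_text(text: str, max_chars: int = 125) -> list[str]:
    """Split text into chunks of at most max_chars, preferring punctuation boundaries."""
    # Split on whitespace that follows punctuation, keeping the punctuation attached.
    pieces = [p.strip() for p in re.split(r"(?<=[.!?,;:])\s+", text) if p.strip()]
    chunks: list[str] = []
    current = ""
    for piece in pieces:
        candidate = f"{current} {piece}".strip()
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single piece longer than the limit is wrapped on word boundaries.
        while len(piece) > max_chars:
            cut = piece.rfind(" ", 0, max_chars + 1)
            if cut <= 0:
                cut = max_chars  # no usable space: fall back to a hard cut
            chunks.append(piece[:cut].strip())
            piece = piece[cut:].strip()
        current = piece
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in resegment_text("Hello, world. This is a test of the tool.", max_chars=20):
        print(chunk)
```

A real resegmenter would also have to redistribute start and end timestamps across the new segments, which this sketch leaves out.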

🚀 Live Demo

Try it live: https://huggingface.co/spaces/BiliSakura/SRT-Processing-Tool

This app is deployed on Hugging Face Spaces! To deploy your own version:

  1. Fork this repository
  2. Go to Hugging Face Spaces
  3. Create a new Space
  4. Connect your GitHub repository
  5. Select Gradio as the SDK
  6. Set the app file to app.py
  7. Add your API keys as secrets (see below)
  8. Deploy!

🔑 API Keys Configuration

For translation features, add your API keys as secrets in Hugging Face Spaces:

  1. Go to your Space settings
  2. Navigate to "Variables and secrets"
  3. Add the following secrets:

Required Secrets (choose based on provider):

  • Aliyun DashScope: DASHSCOPE_API_KEY
  • OpenAI: OPENAI_API_KEY
  • OpenRouter: OPENROUTER_API_KEY

Optional Secrets (for OpenRouter attribution):

  • OPENROUTER_SITE_URL (maps to HTTP-Referer)
  • OPENROUTER_APP_TITLE (maps to X-Title)
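
As a rough illustration of the mapping above, the following sketch builds OpenRouter request headers from these secrets. The helper name `openrouter_headers` is hypothetical and is not taken from this repository. The bearer-token Authorization header follows OpenRouter's usual convention.

```python
import os
from typing import Mapping, Optional


def openrouter_headers(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Build HTTP headers for OpenRouter from environment-style settings."""
    env = os.environ if env is None else env
    api_key = env.get("OPENROUTER_API_KEY")
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    headers = {"Authorization": f"Bearer {api_key}"}
    # The attribution headers are optional and only sent when configured.
    site_url = env.get("OPENROUTER_SITE_URL")
    if site_url:
        headers["HTTP-Referer"] = site_url
    app_title = env.get("OPENROUTER_APP_TITLE")
    if app_title:
        headers["X-Title"] = app_title
    return headers
```

Passing a plain dict instead of relying on `os.environ` keeps the helper easy to test.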

📦 Local Installation

# Clone the repository
git clone https://huggingface.co/spaces/BiliSakura/SRT-Processing-Tool
cd SRT-Processing-Tool

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

πŸƒ Local Run

python app.py

The app will be available at http://localhost:7860

📖 Usage

  1. Open the app in your browser
  2. Select Input Type: SRT File or Audio File
  3. Upload your file
  4. Choose operation:
    • Transcribe only (audio input only): Transcribe audio to SRT without further processing
    • Translate only: Translate subtitles to target language
    • Resegment only: Optimize subtitle segments by character limits
  5. Configure settings:
    • Translation Settings: Target language, provider, model, workers
    • Resegmentation Settings: Maximum characters per segment
  6. Click "🚀 Process File"
  7. Download your processed file!
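
The "workers" translation setting in step 5 suggests that subtitle segments are translated concurrently. The sketch below shows one plausible way to do that with a thread pool. It is an assumption about the approach, not the repository's implementation. The `translate` callable stands in for a real provider call, and `translate_blocks` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def translate_blocks(
    texts: list[str],
    translate: Callable[[str], str],
    workers: int = 5,
) -> list[str]:
    """Translate each subtitle text concurrently, preserving the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # so translated segments stay aligned with their timestamps.
        return list(pool.map(translate, texts))


if __name__ == "__main__":
    # str.upper is a stand-in for a real translation call.
    print(translate_blocks(["hello", "world"], str.upper, workers=2))
```

Threads suit this workload because each call mostly waits on network I/O. More workers generally means more parallel API requests and a higher chance of hitting provider rate limits.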

🔧 Configuration

ASR Model

  • NVIDIA Parakeet TDT: nvidia/parakeet-tdt-0.6b-v3 (default)

Default Models

  • OpenAI: gpt-4.1 (uses Responses API)
  • Aliyun DashScope: qwen-max
  • OpenRouter: openai/gpt-4o

Environment Variables

You can also use a .env file for local development:

# Aliyun DashScope
DASHSCOPE_API_KEY=your_key_here

# OpenAI
OPENAI_API_KEY=your_key_here

# OpenRouter
OPENROUTER_API_KEY=your_key_here
OPENROUTER_SITE_URL=https://your-site.com
OPENROUTER_APP_TITLE=Your App Title

# Optional: override model for all providers
MODEL=your_model_name
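
The interaction between the per-provider defaults and the MODEL override can be sketched as follows. The `resolve_model` helper and its lookup order are assumptions for illustration. The provider keys match the values passed to --provider on the command line.

```python
import os
from typing import Mapping, Optional

# Defaults as listed under "Default Models".
DEFAULT_MODELS = {
    "openai": "gpt-4.1",
    "dashscope": "qwen-max",
    "openrouter": "openai/gpt-4o",
}


def resolve_model(provider: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the MODEL override if set, otherwise the provider's default model."""
    env = os.environ if env is None else env
    if provider not in DEFAULT_MODELS:
        raise ValueError(f"Unknown provider: {provider!r}")
    # An empty MODEL value is treated the same as an unset one.
    return env.get("MODEL") or DEFAULT_MODELS[provider]
```

Because MODEL applies to all providers, the value needs to be a model name that the selected provider actually accepts.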

💻 CLI Usage

You can also use the SRT processor from the command line:

# Resegment only
python tools/srt_processor.py input.srt output.srt --operation resegment --max-chars 125

# Translate (OpenAI)
python tools/srt_processor.py input.srt output.srt --operation translate --target-lang zh --provider openai --model gpt-4.1 --workers 5

# Translate (OpenRouter)
python tools/srt_processor.py input.srt output.srt --operation translate --target-lang zh --provider openrouter --model openai/gpt-4o --workers 5

# Translate (DashScope)
python tools/srt_processor.py input.srt output.srt --operation translate --target-lang zh --provider dashscope --model qwen-max --workers 5
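
To process many files, the resegment command above can be driven from a short script. This is a sketch that only uses the flags shown in the examples. The helper names are hypothetical, and it assumes the script is run from the repository root so that tools/srt_processor.py resolves.

```python
import subprocess
import sys
from pathlib import Path


def build_command(src: Path, dst: Path, max_chars: int = 125) -> list[str]:
    """Build the argument list for one resegment run of the CLI."""
    return [
        sys.executable, "tools/srt_processor.py", str(src), str(dst),
        "--operation", "resegment", "--max-chars", str(max_chars),
    ]


def resegment_directory(in_dir: Path, out_dir: Path, max_chars: int = 125) -> None:
    """Run the CLI once for every .srt file in in_dir, writing results to out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(in_dir.glob("*.srt")):
        # check=True stops the batch on the first failing file.
        subprocess.run(build_command(src, out_dir / src.name, max_chars), check=True)


if __name__ == "__main__":
    resegment_directory(Path("subs"), Path("subs_resegmented"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.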

πŸ—οΈ Project Structure

.
├── app.py                   # Main Gradio application
├── tools/
│   ├── __init__.py
│   ├── srt_processor.py     # Core SRT processing logic
│   └── audio_transcriber.py # Audio transcription (NeMo ASR)
├── requirements.txt         # Python dependencies
└── README.md                # This file

πŸ“ License

MIT License

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.


Made with ❤️ for subtitle processing