---
title: SRT Processing Tool
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: false
license: mit
---

# 🎬 SRT Processing Tool

A production-ready web application for processing SRT subtitle files, powered by Gradio and ready for Hugging Face Spaces.

**Resegment and translate your subtitle files easily in your browser!**

## ✨ Features

- **🎤 Audio to SRT**: Transcribe audio files using NVIDIA Parakeet TDT
- **🔄 SRT Resegmentation**: Optimize subtitle segments by character limits, respecting punctuation boundaries
- **🌍 SRT Translation**: Translate subtitle files using AI (OpenAI, Aliyun DashScope, or OpenRouter)
- **⚡ One-Stop Workflow**: Transcribe, resegment, and translate in a single integrated process!
- **🚀 Production Ready**: Optimized for Hugging Face Spaces deployment

## 🚀 Live Demo

**Try it live:** [https://huggingface.co/spaces/BiliSakura/SRT-Processing-Tool](https://huggingface.co/spaces/BiliSakura/SRT-Processing-Tool)

This app is deployed on Hugging Face Spaces! To deploy your own version:

1. Fork this repository
2. Go to [Hugging Face Spaces](https://huggingface.co/spaces)
3. Create a new Space
4. Connect your GitHub repository
5. Select Gradio as the SDK
6. Set the app file to `app.py`
7. Add your API keys as secrets (see below)
8. Deploy!

## 🔑 API Keys Configuration

For translation features, add your API keys as secrets in Hugging Face Spaces:

1. Go to your Space settings
2. Navigate to "Variables and secrets"
3.
   Add the following secrets:

### Required Secrets (choose based on provider):

- **Aliyun DashScope**: `DASHSCOPE_API_KEY`
- **OpenAI**: `OPENAI_API_KEY`
- **OpenRouter**: `OPENROUTER_API_KEY`

### Optional Secrets (for OpenRouter attribution):

- `OPENROUTER_SITE_URL` (maps to `HTTP-Referer`)
- `OPENROUTER_APP_TITLE` (maps to `X-Title`)

## 📦 Local Installation

```bash
# Clone the repository
git clone https://huggingface.co/spaces/BiliSakura/SRT-Processing-Tool
cd SRT-Processing-Tool

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

## 🏃 Local Run

```bash
python app.py
```

The app will be available at `http://localhost:7860`

## 📖 Usage

1. Open the app in your browser
2. Select Input Type: **SRT File** or **Audio File**
3. Upload your file
4. Choose operation:
   - **Transcribe only** (Audio only): Just transcribe audio to SRT
   - **Translate only**: Translate subtitles to target language
   - **Resegment only**: Optimize subtitle segments by character limits
5. Configure settings:
   - **Translation Settings**: Target language, provider, model, workers
   - **Resegmentation Settings**: Maximum characters per segment
6. Click "🚀 Process File"
7. Download your processed file!
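
To give a feel for what resegmenting by a character limit involves, here is a minimal Python sketch. It is not taken from `tools/srt_processor.py`; the names `PUNCT`, `split_text`, and `split_cue` are hypothetical, and the proportional timing rule is an assumption made for illustration. The idea is to cut at the last punctuation mark that fits within the limit, fall back to the last space, and only then cut mid-word.

```python
# Illustrative sketch only: these helpers are hypothetical and are not
# the actual implementation in tools/srt_processor.py.
PUNCT = ".,;:!?。，；：！？"


def split_text(text: str, max_chars: int) -> list[str]:
    """Split text into chunks of at most max_chars, preferring punctuation breaks."""
    if max_chars < 1:
        raise ValueError("max_chars must be positive")
    chunks = []
    rest = text.strip()
    while len(rest) > max_chars:
        window = rest[:max_chars]
        # Position just after the last punctuation mark inside the window.
        cut = max(window.rfind(p) for p in PUNCT) + 1
        if cut <= 0:
            # No punctuation found: try the last space, then a hard cut.
            cut = window.rfind(" ")
            if cut <= 0:
                cut = max_chars
        chunks.append(rest[:cut].strip())
        rest = rest[cut:].strip()
    if rest:
        chunks.append(rest)
    return chunks


def split_cue(start_ms: int, end_ms: int, text: str, max_chars: int):
    """Split one cue and share its duration in proportion to chunk length (assumed rule)."""
    chunks = split_text(text, max_chars)
    total = sum(len(c) for c in chunks)
    cues, cursor, seen = [], start_ms, 0
    for chunk in chunks:
        seen += len(chunk)
        chunk_end = start_ms + (end_ms - start_ms) * seen // total
        cues.append((cursor, chunk_end, chunk))
        cursor = chunk_end
    return cues


if __name__ == "__main__":
    line = "Hello there, this is a long subtitle line. It should be split into smaller parts."
    for start, end, chunk in split_cue(0, 7900, line, max_chars=40):
        print(start, end, chunk)
```

A real resegmenter would likely also merge short neighbouring cues and use word-level timestamps when they are available; this sketch only covers the splitting step.
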

## 🔧 Configuration

### ASR Model

- **NVIDIA Parakeet TDT**: `nvidia/parakeet-tdt-0.6b-v3` (default)

### Default Models

- **OpenAI**: `gpt-4.1` (uses Responses API)
- **Aliyun DashScope**: `qwen-max`
- **OpenRouter**: `openai/gpt-4o`

### Environment Variables

You can also use a `.env` file for local development:

```env
# Aliyun DashScope
DASHSCOPE_API_KEY=your_key_here

# OpenAI
OPENAI_API_KEY=your_key_here

# OpenRouter
OPENROUTER_API_KEY=your_key_here
OPENROUTER_SITE_URL=https://your-site.com
OPENROUTER_APP_TITLE=Your App Title

# Optional: override model for all providers
MODEL=your_model_name
```

## 💻 CLI Usage

You can also use the SRT processor from the command line:

```bash
# Resegment only
python tools/srt_processor.py input.srt output.srt --operation resegment --max-chars 125

# Translate (OpenAI)
python tools/srt_processor.py input.srt output.srt --operation translate --target-lang zh --provider openai --model gpt-4.1 --workers 5

# Translate (OpenRouter)
python tools/srt_processor.py input.srt output.srt --operation translate --target-lang zh --provider openrouter --model openai/gpt-4o --workers 5

# Translate (DashScope)
python tools/srt_processor.py input.srt output.srt --operation translate --target-lang zh --provider dashscope --model qwen-max --workers 5
```

## 🏗️ Project Structure

```
.
├── app.py                    # Main Gradio application
├── tools/
│   ├── __init__.py
│   ├── srt_processor.py      # Core SRT processing logic
│   └── audio_transcriber.py  # Audio transcription (NeMo ASR)
├── requirements.txt          # Python dependencies
└── README.md                 # This file
```

## 📝 License

MIT License

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

---

**Made with ❤️ for subtitle processing**