Spaces: prasanacodes/Indic-Translation-Toolkit

prasanacodes committed on Aug 20, 2025

Commit 04a936d (verified) · 1 Parent(s): fa7dba3

Upload README.md

Files changed (1)

README.md +63 -0

README.md ADDED

	@@ -0,0 +1,63 @@

+---
+title: Audio/Video Translation Toolkit
+emoji: 🚀
+colorFrom: indigo
+colorTo: purple
+sdk: gradio
+python_version: 3.10.0
+app_file: app.py
+tags:
+  - translation
+  - audio
+  - video
+  - speech-synthesis
+  - voice-cloning
+  - gradio
+models:
+  - openai/whisper-large-v3
+  - JaesungHuh/voice-gender-classifier
+  - ai4bharat/IndicF5
+preload_from_hub:
+  - openai/whisper-large-v3
+  - JaesungHuh/voice-gender-classifier
+  - ai4bharat/IndicF5
+---
+# 🚀 Audio/Video Translation Toolkit
+This application provides a complete pipeline for translating the speech in a video or audio file from English into Tamil, Telugu, or Hindi. It covers vocal separation, transcription, translation, speech synthesis, and voice cloning.
+---
+## Key Features 🛠️
+* **🎬 Full Video Translation:** Upload a video, and the app will extract the audio, translate it, and merge it back into the original video.
+* **🎵 Full Audio Translation:** Translate standalone audio files.
+* **🗣️ Vocal Separation:** Isolate vocals from background music before processing.
+* **✍️ Transcription & Pace Detection:** Uses Whisper to transcribe the audio and determine the original speaker's pace.
+* **🌐 Multi-Lingual Translation:** Translate text to Tamil, Telugu, or Hindi using either local models or the Sarvam API.
+* **🔊 Speech Synthesis:** Generate new speech in the target language using models from `ai4bharat`.
+* **🧬 Voice Cloning:** Apply the original speaker's voice to the newly synthesized audio for a more natural result.
+---
+## How to Use the Main Pipeline
+1.  Navigate to the **Translate Video** or **Translate Audio** tab.
+2.  Upload your file.
+3.  Select the **Target Language**.
+4.  Choose the **Translation Via** method (`local` or `api`).
+5.  Click the **Translate** button and wait for the process to complete.
+---
+## ‼️ Important Setup Note for Duplication
+This Space relies on several local modules and data files that are not installed via `pip`. If you are duplicating this Space, you **must** manually upload the following directories to the root of your repository for the application to function correctly:
+* `gender/` (Contains the gender prediction model code)
+* `openvoice/` (Contains the voice cloning API and extractor code)
+* `checkpoints_v2/` (Contains the pre-trained model checkpoints for voice cloning)
+* `reference/` (Contains the reference audio and text files for speech synthesis)
+Without these directories, the application will fail with `ImportError` or `FileNotFoundError`.
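
The setup note in the README lists four directories that must exist at the repository root. A duplicated Space could verify this before starting the app. The following is a minimal sketch, not part of the Space's code: the helper name `find_missing_dirs` is hypothetical, and only the directory names are taken from the README.

```python
from pathlib import Path

# Directory names are taken from the README's setup note. The helper below is
# a hypothetical illustration and does not exist in the Space's repository.
REQUIRED_DIRS = ("gender", "openvoice", "checkpoints_v2", "reference")


def find_missing_dirs(root=".", required=REQUIRED_DIRS):
    """Return the required directory names that are absent under `root`.

    The result preserves the order of `required`, so error messages are stable.
    """
    root_path = Path(root)
    return [name for name in required if not (root_path / name).is_dir()]
```

Calling `find_missing_dirs()` from the repository root returns an empty list when all four directories are present. A startup script could raise an error naming the returned entries, which may be easier to act on than a later `ImportError` or `FileNotFoundError`.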