metadata
license: mit
title: PodXplain
sdk: gradio
colorFrom: red
colorTo: blue
pinned: true
short_description: PodXplain is a Hugging Face-hosted application that converts
PodXplain
From script to story: voice it like never before.
PodXplain is a Hugging Face-hosted application that converts long-form text into engaging multi-speaker podcast-style audio. Simply input your script, and get a professional-sounding MP3 podcast with automatic speaker detection and assignment.
Features
- Long-form Support: Handles up to 50,000 characters of text
- Multi-speaker Audio: Automatic speaker detection and assignment
- Smart Segmentation: Intelligent text splitting with progress tracking
- High-quality Output: MP3 format for optimal file size and compatibility
- Real-time Progress: Live updates during generation
- Modern UI: Clean, intuitive Gradio interface
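The segmentation step is not described in detail here. As a rough illustration only, a sentence-boundary splitter that enforces the 50,000-character limit could look like the sketch below. The names `split_script`, `MAX_SCRIPT_CHARS`, and the per-segment budget `max_chars` are hypothetical and are not taken from the PodXplain code.

```python
import re

MAX_SCRIPT_CHARS = 50_000  # limit stated in the feature list


def split_script(text: str, max_chars: int = 500) -> list[str]:
    """Split text into segments of at most max_chars, preferring sentence ends."""
    if len(text) > MAX_SCRIPT_CHARS:
        raise ValueError(f"script exceeds {MAX_SCRIPT_CHARS} characters")

    # Break after '.', '!' or '?' when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    segments: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-wrap a single sentence that is longer than the budget.
        while len(sentence) > max_chars:
            if current:
                segments.append(current)
                current = ""
            segments.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The sentence does not fit; close the current segment first.
            segments.append(current)
            current = sentence
    if current:
        segments.append(current)
    return segments
```

With a list of segments in hand, progress can be reported as "segment i of n" while each one is synthesized.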
Tech Stack
- Frontend: Gradio for interactive web interface
- TTS Engine: Nari DIA 1.6B for natural voice synthesis (currently mocked)
- Audio Processing: pydub for audio manipulation and MP3 conversion
- Hosting: Hugging Face Spaces with GPU support
How to Use
- Input Text: Paste or type your podcast script (up to 50,000 characters)
- Choose Mode: Select a speaker detection mode:
  - Auto: Smart detection based on content structure
  - Paragraph: Speaker changes at paragraph breaks
  - Dialogue: Detection based on dialogue markers
- Generate: Click "Generate Podcast" and watch the progress
- Download: Get your MP3 file and listen to your podcast!
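The three detection modes are only described at a high level. The sketch below shows one way they could differ. It assumes two alternating voices for paragraph mode and a `Name: line` convention for dialogue markers; both are assumptions, `assign_speakers` is a hypothetical helper, and the actual PodXplain logic may work differently.

```python
import re

# Assumed dialogue marker: a short name followed by a colon, e.g. "Alice: Hi".
DIALOGUE_RE = re.compile(r"^\s*([A-Za-z][\w ]{0,30}):\s*(.+)$")


def assign_speakers(text: str, mode: str = "auto") -> list[tuple[str, str]]:
    """Return (speaker, utterance) pairs for the given detection mode."""
    if mode not in {"auto", "paragraph", "dialogue"}:
        raise ValueError(f"unknown mode: {mode!r}")

    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if mode == "auto":
        # Assumed heuristic: use dialogue markers when any are present.
        has_markers = any(DIALOGUE_RE.match(ln) for ln in lines)
        mode = "dialogue" if has_markers else "paragraph"

    if mode == "dialogue":
        result: list[tuple[str, str]] = []
        for ln in lines:
            m = DIALOGUE_RE.match(ln)
            if m:
                result.append((m.group(1).strip(), m.group(2).strip()))
            elif result:
                # Unmarked line: continue the previous speaker's turn.
                speaker, said = result[-1]
                result[-1] = (speaker, f"{said} {ln}")
            else:
                result.append(("Speaker 1", ln))
        return result

    # Paragraph mode: blank lines separate paragraphs; voices alternate.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [
        (f"Speaker {i % 2 + 1}", " ".join(p.split()))
        for i, p in enumerate(paragraphs)
    ]
```

Each resulting pair could then be passed to the TTS engine with the voice chosen for that speaker.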
Quick Start
Local Development
# Clone the repository
git clone https://github.com/yourusername/podxplain.git  # Replace with your actual repo URL
cd podxplain
# Install dependencies
pip install -r requirements.txt
# Run the application
python app.py