title: MyPod Generator
emoji: 😻
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 5.8.0
app_file: app.py
pinned: false


🎙️ MyPod - AI-Based Podcast Generator

Welcome to MyPod, your go-to AI-powered podcast generator! 🚀 Whether you have documents, webpages, YouTube videos, or topics you'd like to explore, MyPod transforms your content into engaging, conversational podcasts with ease.

🌟 Features

  • Multiple Input Sources:

    • Upload PDF: Convert your PDF documents into podcasts.
    • Enter URL: Transform the content of any webpage into a podcast.
    • YouTube Link (Requires User Auth - Work in Progress): Transcribe and convert YouTube videos into podcasts.
    • Research a Topic: Provide a detailed topic statement to generate a podcast based on researched information.
  • Customizable Output:

    • Tone Selection: Choose from Humorous, Formal, Casual, or Youthful tones to match your desired podcast style.
    • Duration Options: Select the length of your podcast ranging from 1-3 minutes to 10-20 minutes.
  • Automated Pronunciation Handling:

    • Abbreviation Splitting: Automatically splits abbreviations and concatenated words for accurate pronunciation.
  • Distinct Speaker Voices:

    • Jane & Emma: Enjoy a conversation between two distinct voices, Jane with a natural tone and Emma with a deeper, richer voice.
  • Transcript Generation:

    • Receive a markdown-formatted transcript alongside your podcast audio.
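
As a rough illustration of the transcript feature, the sketch below renders a list of speaker turns as markdown. The `(speaker, text)` tuple layout and the `format_transcript` name are assumptions made for this example; this README does not specify MyPod's actual transcript format.

```python
from typing import Iterable, Tuple


# Hypothetical helper: MyPod's real transcript format is not documented here.
def format_transcript(turns: Iterable[Tuple[str, str]]) -> str:
    """Render (speaker, text) turns as a markdown transcript."""
    lines = []
    for speaker, text in turns:
        # Bold the speaker name and collapse runs of whitespace in the text.
        lines.append(f"**{speaker}:** {' '.join(text.split())}")
    # A blank line between turns makes each one its own markdown paragraph.
    return "\n\n".join(lines)


if __name__ == "__main__":
    demo = [("Jane", "Welcome to the show!"), ("Emma", "Glad to be  here.")]
    print(format_transcript(demo))
```

Keeping each turn as a separate paragraph makes the transcript easy to skim next to the audio player.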

📦 Installation

Follow these steps to set up and run MyPod on your local machine:

1. Clone the Repository

git clone https://github.com/yourusername/mypod.git
cd mypod

2. Create a Virtual Environment (Optional but Recommended)

python -m venv mypod_env
source mypod_env/bin/activate  # On Windows: mypod_env\Scripts\activate

3. Install Dependencies

Ensure you have Python 3.7 or higher installed; note that Gradio 5.x, the SDK version declared in this Space's metadata, itself requires Python 3.10 or newer. Then install the required packages:

pip install -r requirements.txt

🚀 Usage

Launch the Gradio interface to start generating your podcasts:

python app.py

This starts a local web server and prints a URL (e.g., http://127.0.0.1:7860) where you can interact with MyPod.


📝 How to Use

1. Choose Your Input Source:

    • Upload PDF: Click on "Upload PDF" and select your PDF document.
    • Enter URL: Input the URL of the webpage you want to convert into a podcast.
    • Enter YouTube Link: Provide the YouTube video URL (Note: Requires User Auth - Work in Progress).
    • Research a Topic: Enter a detailed topic statement. Be as specific as possible, but note that results may vary if the topic is very niche.
2. Select Tone and Duration:

    • Tone: Choose from Humorous, Formal, Casual, or Youthful.
    • Length: Select the desired duration range for your podcast.
3. Generate Podcast:

    • Click on the "Submit" button.
    • Wait for processing to complete. YouTube transcriptions may take longer due to processing requirements.
4. Download Your Podcast:

    • Once generated, download the podcast audio and view the transcript.
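
The first step amounts to choosing one of four input sources. A minimal sketch of that selection logic is shown below, assuming the app accepts exactly one source per run; that rule, the `pick_source` name, and the field names are all hypothetical and not taken from `app.py`.

```python
from typing import Optional, Tuple


# Hypothetical field names; the real app's input handling may differ.
def pick_source(pdf_path: Optional[str] = None,
                url: Optional[str] = None,
                youtube_url: Optional[str] = None,
                topic: Optional[str] = None) -> Tuple[str, str]:
    """Return (source_kind, value) for the single input the user provided."""
    candidates = {
        "pdf": pdf_path,
        "url": url,
        "youtube": youtube_url,
        "topic": topic,
    }
    # Keep only fields that contain something other than whitespace.
    provided = {k: v.strip() for k, v in candidates.items() if v and v.strip()}
    if len(provided) != 1:
        raise ValueError(
            f"Provide exactly one input source, got {len(provided)}."
        )
    kind, value = next(iter(provided.items()))
    # The README states that only .pdf files are accepted for uploads.
    if kind == "pdf" and not value.lower().endswith(".pdf"):
        raise ValueError("Only .pdf files are accepted.")
    return kind, value
```

For example, `pick_source(topic="History of radio")` returns `("topic", "History of radio")`, while passing both a URL and a topic raises `ValueError`.
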

📚 Input Sources Explained

1. Upload PDF

    • Purpose: Convert the text content of a PDF document into an audio podcast.
    • Supported Formats: Only .pdf files are accepted.
2. Enter URL

    • Purpose: Extract and convert the textual content of any webpage into a podcast.
    • Supported Content: Most standard webpages with readable text content.
3. Enter YouTube Link (Requires User Auth - Work in Progress)

    • Purpose: Transcribe the audio from a YouTube video and convert it into a podcast.
    • Note: This feature is currently a work in progress and requires user authentication to access certain videos.
4. Research a Topic

    • Purpose: Generate a podcast based on researched information from reputable sources.
    • Recommendation: Provide a detailed and specific topic statement to achieve the best results. Extremely niche or highly specialized topics may yield less comprehensive podcasts.
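
To make the "Enter URL" source more concrete, here is a minimal sketch of pulling readable text out of an HTML page using only Python's standard library. This README does not say which extraction library MyPod uses, so `html_to_text` and the `_TextExtractor` class are illustrative assumptions, and fetching the page over the network is left out.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    _SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the page's visible text as one space-joined string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Real webpages also carry navigation menus, footers, and ads, which is one reason the list above says "most standard webpages" rather than all of them.
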

🎨 Customization Options

  • Tone Selection:

    • Humorous: Funny and exciting, making listeners chuckle.
    • Formal: Business-like, well-structured, and professional.
    • Casual: Like a relaxed conversation between close friends.
    • Youthful: Energetic and lively, similar to how teenagers might chat.
  • Duration Selection:

    • 1-3 Minutes: Approximately 200-450 words.
    • 3-5 Minutes: Approximately 450-750 words.
    • 5-10 Minutes: Approximately 750-1500 words.
    • 10-20 Minutes: Approximately 1500-3000 words.
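
The duration ranges above can be expressed as a small lookup table. In the sketch below the word ranges are copied from the list; the speaking rate of 150 words per minute is inferred from the upper bounds (450/3, 750/5, 1500/10, 3000/20) and is not stated by the project. The function names are hypothetical.

```python
from typing import Tuple

# Word ranges copied from the Duration Selection list.
WORD_RANGES = {
    "1-3 Minutes": (200, 450),
    "3-5 Minutes": (450, 750),
    "5-10 Minutes": (750, 1500),
    "10-20 Minutes": (1500, 3000),
}

# Assumption: inferred from the upper bound of each range, not documented.
WORDS_PER_MINUTE = 150


def word_budget(duration: str) -> Tuple[int, int]:
    """Return the (min_words, max_words) range for a duration label."""
    return WORD_RANGES[duration]


def estimated_minutes(script: str) -> float:
    """Estimate spoken length from the whitespace-separated word count."""
    return len(script.split()) / WORDS_PER_MINUTE


def fits_duration(script: str, duration: str) -> bool:
    """Check whether a script's word count falls inside the chosen range."""
    low, high = word_budget(duration)
    return low <= len(script.split()) <= high
```

A 300-word script, for instance, is estimated at 2 minutes and fits the "1-3 Minutes" range but not "3-5 Minutes".
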

⚙️ Technical Details

  • Backend Technologies:

    • Gradio: For building the interactive web interface.
    • Groq API: For generating podcast scripts.
    • TTS (Text-to-Speech): For converting scripts into audio.
    • Whisper ASR: For transcribing YouTube videos.
    • Pydub: For audio manipulation and processing.
  • Performance Optimizations:

    • Regex Precompilation: Combined and precompiled regex patterns to speed up abbreviation splitting.
    • Efficient Processing: Optimized text preprocessing to minimize podcast generation time.
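
The regex precompilation idea can be sketched as follows: several patterns are combined into one alternation, compiled once at import time, and applied in a single pass. The two patterns here (short all-caps tokens and lowercase-to-uppercase boundaries) are assumptions chosen for illustration; the actual rules in `utils.py` are not shown in this README and may differ.

```python
import re

# Compiled once at import time and reused for every call.
_SPLIT_RE = re.compile(
    r"(?P<abbr>\b[A-Z]{2,5}\b)"          # short all-caps tokens such as "AI" or "PDF"
    r"|(?P<camel>(?<=[a-z])(?=[A-Z]))"   # zero-width boundary inside "MyPod"
)


def _replace(match: "re.Match") -> str:
    abbr = match.group("abbr")
    if abbr:
        # Spell the abbreviation letter by letter: "PDF" -> "P D F".
        return " ".join(abbr)
    # Otherwise this is a camel-case boundary: insert a space there.
    return " "


def split_for_tts(text: str) -> str:
    """Split abbreviations and concatenated words in one regex pass."""
    return _SPLIT_RE.sub(_replace, text)


if __name__ == "__main__":
    print(split_for_tts("MyPod uses AI and PDF files"))
```

Combining the alternatives means the text is scanned once rather than once per rule. A real implementation would likely also need an exception list for acronyms that are spoken as words.
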

🛠️ Development

Repository Structure

mypod/
├── app.py
├── utils.py
├── prompts.py
├── requirements.txt
├── README.md
└── ... (other files)

  • app.py: Contains the Gradio interface and main application logic.
  • utils.py: Handles text processing, audio generation, and other utility functions.
  • prompts.py: Defines the system prompt for the language model.
  • requirements.txt: Lists all Python dependencies.
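
To give a feel for what a system prompt module like `prompts.py` might contain, the sketch below assembles a prompt from a tone and a word range. The real prompt text is not included in this README, so the wording, the `TONE_STYLES` table, and `build_system_prompt` are all hypothetical; only the tone descriptions and host names come from the sections above.

```python
# Tone descriptions paraphrased from the Customization Options section.
TONE_STYLES = {
    "Humorous": "funny and exciting, aiming to make listeners chuckle",
    "Formal": "business-like, well-structured, and professional",
    "Casual": "like a relaxed conversation between close friends",
    "Youthful": "energetic and lively, like teenagers chatting",
}


# Hypothetical builder: not the project's actual prompt.
def build_system_prompt(tone: str, min_words: int, max_words: int) -> str:
    """Assemble a system prompt for a two-host podcast script."""
    if tone not in TONE_STYLES:
        raise ValueError(f"Unknown tone: {tone!r}")
    if min_words > max_words:
        raise ValueError("min_words must not exceed max_words")
    return (
        "You are writing a podcast script as a dialogue between two hosts, "
        "Jane and Emma. "
        f"The tone should be {TONE_STYLES[tone]}. "
        f"Keep the script between {min_words} and {max_words} words."
    )


if __name__ == "__main__":
    print(build_system_prompt("Casual", 200, 450))
```

Keeping the tone table separate from the template makes it straightforward to add or reword a tone without touching the rest of the prompt.
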

Contributing

We welcome contributions to improve MyPod! Whether it's fixing bugs, enhancing features, or optimizing performance, your help is valuable.

1. Fork the Repository
2. Create a Feature Branch:

git checkout -b feature/YourFeature

3. Commit Your Changes:

git commit -m "Add some feature"

4. Push to the Branch:

git push origin feature/YourFeature

5. Open a Pull Request

Reporting Issues

If you encounter any bugs or have suggestions for improvements, please open an issue in the Issues section of the repository.

📄 License

This project is licensed under the MIT License.

🤝 Acknowledgements

  • Gradio: For making it easy to build machine learning demos.
  • OpenAI Whisper: For powerful speech recognition capabilities.
  • TTS Community: For developing robust text-to-speech models.
  • Various RSS Feeds and Wikipedia: For providing reliable information sources.

🎉 Get Started!

🔥 Ready to create your personalized podcast? Give MyPod a try now and let the magic happen! 🔥

Launch MyPod and start transforming your content into engaging podcasts today!

Happy Podcasting! 🎧✨