Spaces: siddhartharyaai / MyPod_Generator_6

siddhartharyaai committed on Dec 9, 2024

Commit e2dc845 (verified) · 1 parent: bd4988a

Update README.md

Files changed (1)

README.md +128 -91

@@ -9,128 +9,165 @@ app_file: app.py
 pinned: false
 ---
-# 🎙 MyPod - AI Based Podcast Generator
-Welcome to **MyPod**, your go-to AI-powered podcast generator! 🎉
-**MyPod** transforms your documents, webpages, YouTube videos, or researched topics into a more human-sounding, conversational podcast. Select a tone and a duration range, and let **MyPod** generate an engaging podcast tailored to your preferences.
-## 🚀 **Features**
-1. **Multiple Input Sources:**
-   - **Upload PDF:** Convert your PDF documents into podcasts.
-   - **Enter URL:** Convert the content of any webpage into a podcast.
-   - **YouTube Link:** Transcribe and convert YouTube videos into podcasts *(Requires User Authentication - Work in Progress)*.
-   - **Research a Topic:** Provide a detailed topic, and **MyPod** will research and generate a podcast based on the latest information.
-2. **Customizable Tone and Length:**
-   - **Tone Options:** Choose from Humorous, Formal, Casual, or Youthful to set the desired tone of your podcast.
-   - **Duration Range:** Select from 1-3 Minutes, 3-5 Minutes, 5-10 Minutes, or 10-20 Minutes to specify the length of your podcast.
-3. **Enhanced Research Capability:**
-   - Utilizes Wikipedia and various News RSS feeds to gather relevant information.
-   - Implements a fallback mechanism to leverage the LLM's (Groq API) knowledge base if primary sources do not provide sufficient information.
-4. **Natural-Sounding Voices:**
-   - **Jane and Emma:** Two distinct voices are used to create a natural and engaging conversational flow.
-   - **Optimized Parameters:** Adjusted speed and pitch settings to enhance the naturalness without compromising clarity.
-5. **Engaging Podcast Scripts:**
-   - Dynamic and varied introductions and dialogues.
-   - Incorporates storytelling techniques, analogies, and thought-provoking questions to captivate listeners.
-## 📦 **Installation**
-To set up and run **MyPod** locally, follow these steps:
-1. **Clone the Repository:**
-   ```bash
-   git clone https://github.com/yourusername/mypod.git
-   cd mypod
-Create a Virtual Environment:
 bash
 Copy code
-python3 -m venv venv
-source venv/bin/activate  # On Windows: venv\Scripts\activate
-Install Dependencies:
 bash
 Copy code
 pip install -r requirements.txt
-Set Up Environment Variables:
-Ensure you have a valid Groq API key. Set it as an environment variable:
-bash
-Copy code
-export GROQ_API_KEY='your_groq_api_key_here'  # On Windows: set GROQ_API_KEY=your_groq_api_key_here
-Run the Application:
 bash
 Copy code
 python app.py
-The Gradio interface will launch, and you can access MyPod via the provided URL.
-🛠 Usage Instructions
-⏳ Please be patient while your podcast is being generated.
-This process involves content analysis, script creation, and high-quality audio synthesis, which may take a few minutes.
-Provide an Input Source:
-Upload PDF: Click to upload a PDF file from your device.
-Enter URL: Input the URL of a webpage you wish to convert into a podcast.
-YouTube Link: (Work in Progress) Enter a YouTube video URL to transcribe and convert into a podcast.
-Research a Topic: Enter a detailed topic to research and convert into a podcast.
 Select Tone and Duration:
 Tone: Choose from Humorous, Formal, Casual, or Youthful.
 Length: Select the desired duration range for your podcast.
 Generate Podcast:
-Click the "Submit" button to generate your podcast.
-The generated podcast will be available for download, along with the transcript.
-📈 Technical Details
 Backend Technologies:
-Groq API: Utilized for generating engaging podcast scripts based on input texts.
-TTS (Text-to-Speech): Uses the tts_models/en/ljspeech/glow-tts model for generating speech.
-Whisper ASR: Implements OpenAI's Whisper model for transcribing YouTube videos.
-Web Scraping: Employs BeautifulSoup to extract text from webpages and RSS feeds.
-Audio Processing: Uses Pydub for audio manipulation and processing.
-Key Enhancements:
-Research Sufficiency Check: Ensures that the information gathered is comprehensive enough to generate a meaningful podcast. If primary sources are insufficient, it leverages the LLM's knowledge base to supplement the data.
-Dynamic System Prompt: Guides the LLM to create more engaging and dynamic podcast scripts, avoiding repetitive introductions and ensuring a natural conversational flow.
-Optimized TTS Parameters: Adjusted speed and pitch settings to make the voices sound more natural and human-like without introducing artificial pauses or breaks.
-🧪 Testing and Feedback
-Generate a Podcast:
-Provide a detailed input (e.g., a comprehensive topic or a well-structured PDF).
-Select your desired tone and duration.
-Submit to generate the podcast.
-Assess Output Quality:
-Transcript: Review the transcript for accuracy and engagement.
-Audio Quality: Listen to the podcast to evaluate the naturalness of the voices and the overall flow of the conversation.
-Iterate on TTS Settings:
-If the voices still sound synthetic, consider experimenting with different speed_factor and semitones values in the utils.py file under the generate_audio_wav function.
-Example adjustments:
-Test Case 1: semitones = -2, speed_factor = 1.2
-Test Case 2: semitones = -3, speed_factor = 1.25 (Current Settings)
-Test Case 3: semitones = -4, speed_factor = 1.3
-Provide Feedback:
-Share your experiences and any suggestions for further improvements to help enhance MyPod.
-📝 Contributing
-Contributions are welcome! Please fork the repository and submit a pull request with your enhancements or bug fixes.
-📜 License
 This project is licensed under the MIT License.
-📧 Contact
-For any questions or support, please contact siddhartharya@gmail.com.
-Launch MyPod and start transforming your content into engaging podcasts today!
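
Review note on the removed "Iterate on TTS Settings" lines: the three `semitones` / `speed_factor` test cases are easier to compare once converted to ratios. The sketch below is only arithmetic, not the project's `generate_audio_wav` code, which is not shown in this diff. It assumes the pitch shift is done by resampling (so lowering the pitch also lengthens the clip) and that `speed_factor` is applied afterwards; both are assumptions, and the function names are hypothetical.

```python
def pitch_ratio(semitones: float) -> float:
    """Frequency ratio for a shift of `semitones` in equal temperament."""
    return 2.0 ** (semitones / 12.0)


def duration_scale(semitones: float, speed_factor: float) -> float:
    """Relative clip length after the shift and the speed-up.

    Assumption: the pitch shift is a plain resample, so the clip is
    stretched by 1 / pitch_ratio before speed_factor shortens it.
    """
    return 1.0 / (pitch_ratio(semitones) * speed_factor)


# The three test cases listed in the removed README text.
for semitones, speed_factor in [(-2, 1.2), (-3, 1.25), (-4, 1.3)]:
    print(
        f"semitones={semitones}, speed_factor={speed_factor}: "
        f"pitch x{pitch_ratio(semitones):.3f}, "
        f"length x{duration_scale(semitones, speed_factor):.3f}"
    )
```

Under these assumptions, each speed factor roughly cancels the slowdown caused by the lower pitch, which may be why the values were paired this way.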

 pinned: false
 ---
+Thought for a few seconds
+markdown
+Copy code
+# 🎙️ MyPod - AI Based Podcast Generator
+Welcome to **MyPod**, your go-to AI-powered podcast generator! 🚀 Whether you have documents, webpages, YouTube videos, or topics you'd like to explore, MyPod transforms your content into engaging, conversational podcasts with ease.
+## 🌟 Features
+- **Multiple Input Sources:**
+  - **Upload PDF:** Convert your PDF documents into podcasts.
+  - **Enter URL:** Transform the content of any webpage into a podcast.
+  - **YouTube Link (Requires User Auth - Work in Progress):** Transcribe and convert YouTube videos into podcasts.
+  - **Research a Topic:** Provide a detailed topic statement to generate a podcast based on researched information.
+- **Customizable Output:**
+  - **Tone Selection:** Choose from Humorous, Formal, Casual, or Youthful tones to match your desired podcast style.
+  - **Duration Options:** Select the length of your podcast ranging from 1-3 minutes to 10-20 minutes.
+- **Automated Pronunciation Handling:**
+  - **Abbreviation Splitting:** Automatically splits abbreviations and concatenated words for accurate pronunciation.
+- **Distinct Speaker Voices:**
+  - **Jane & Emma:** Enjoy a conversation between two distinct voices—Jane with a natural tone and Emma with a deeper, richer voice.
+- **Transcript Generation:**
+  - Receive a markdown-formatted transcript alongside your podcast audio.
+## 📦 Installation
+Follow these steps to set up and run MyPod on your local machine:
+### 1. Clone the Repository
+```bash
+git clone https://github.com/yourusername/mypod.git
+cd mypod
+2. Create a Virtual Environment (Optional but Recommended)
 bash
 Copy code
+python -m venv mypod_env
+source mypod_env/bin/activate  # On Windows: mypod_env\Scripts\activate
+3. Install Dependencies
+Ensure you have Python 3.7 or higher installed. Then, install the required packages:
 bash
 Copy code
 pip install -r requirements.txt
+🚀 Usage
+Launch the Gradio interface to start generating your podcasts:
 bash
 Copy code
 python app.py
+This will start a local web server and provide a URL (e.g., http://127.0.0.1:7860) where you can interact with MyPod.
+📝 How to Use
+Choose Your Input Source:
+Upload PDF: Click on "Upload PDF" and select your PDF document.
+Enter URL: Input the URL of the webpage you want to convert into a podcast.
+Enter YouTube Link: Provide the YouTube video URL (Note: Requires User Auth - Work in Progress).
+Research a Topic: Enter a detailed topic statement. Be as specific as possible. If the topic is too niche or specific, the outcome may vary.
 Select Tone and Duration:
 Tone: Choose from Humorous, Formal, Casual, or Youthful.
 Length: Select the desired duration range for your podcast.
 Generate Podcast:
+Click on the "Submit" button.
+Wait for the processing to complete. YouTube transcriptions may take longer due to processing requirements.
+Download Your Podcast:
+Once generated, download the podcast audio and view the transcript.
+📚 Input Sources Explained
+1. Upload PDF
+Purpose: Convert the text content of a PDF document into an audio podcast.
+Supported Formats: Only .pdf files are accepted.
+2. Enter URL
+Purpose: Extract and convert the textual content of any webpage into a podcast.
+Supported Content: Most standard webpages with readable text content.
+3. Enter YouTube Link (Requires User Auth - Work in Progress)
+Purpose: Transcribe the audio from a YouTube video and convert it into a podcast.
+Note: This feature is currently a work in progress and requires user authentication to access certain videos.
+4. Research a Topic
+Purpose: Generate a podcast based on researched information from reputable sources.
+Recommendation: Provide a detailed and specific topic statement to achieve the best results. Extremely niche or highly specialized topics may yield less comprehensive podcasts.
+🎨 Customization Options
+Tone Selection:
+Humorous: Funny and exciting, making listeners chuckle.
+Formal: Business-like, well-structured, and professional.
+Casual: Like a relaxed conversation between close friends.
+Youthful: Energetic and lively, similar to how teenagers might chat.
+Duration Selection:
+1-3 Minutes: Approximately 200-450 words.
+3-5 Minutes: Approximately 450-750 words.
+5-10 Minutes: Approximately 750-1500 words.
+10-20 Minutes: Approximately 1500-3000 words.
+⚙️ Technical Details
 Backend Technologies:
+Gradio: For building the interactive web interface.
+Groq API: For generating podcast scripts.
+TTS (Text-to-Speech): For converting scripts into audio.
+Whisper ASR: For transcribing YouTube videos.
+Pydub: For audio manipulation and processing.
+Performance Optimizations:
+Regex Precompilation: Combined and precompiled regex patterns to speed up abbreviation splitting.
+Efficient Processing: Optimized text preprocessing to minimize podcast generation time.
+🛠️ Development
+Repository Structure
+scss
+Copy code
+mypod/
+├── app.py
+├── utils.py
+├── prompts.py
+├── requirements.txt
+├── README.md
+└── ... (other files)
+app.py: Contains the Gradio interface and main application logic.
+utils.py: Handles text processing, audio generation, and other utility functions.
+prompts.py: Defines the system prompt for the language model.
+requirements.txt: Lists all Python dependencies.
+Contributing
+We welcome contributions to improve MyPod! Whether it's fixing bugs, enhancing features, or optimizing performance, your help is valuable.
+Fork the Repository
+Create a Feature Branch:
+bash
+Copy code
+git checkout -b feature/YourFeature
+Commit Your Changes:
+bash
+Copy code
+git commit -m "Add some feature"
+Push to the Branch:
+bash
+Copy code
+git push origin feature/YourFeature
+Open a Pull Request
+Reporting Issues
+If you encounter any bugs or have suggestions for improvements, please open an issue in the Issues section of the repository.
+📄 License
 This project is licensed under the MIT License.
+🤝 Acknowledgements
+Gradio: For making it easy to build machine learning demos.
+OpenAI Whisper: For powerful speech recognition capabilities.
+TTS Community: For developing robust text-to-speech models.
+Various RSS Feeds and Wikipedia: For providing reliable information sources.
+🎉 Get Started!
+🔥 Ready to create your personalized podcast? Give MyPod a try now and let the magic happen! 🔥
+Launch MyPod and start transforming your content into engaging podcasts today!
+Happy Podcasting! 🎧✨
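
Review note on the added "Abbreviation Splitting" and "Regex Precompilation" items: the README does not show the patterns used in `utils.py`, so the following is a minimal sketch of the general technique rather than the project's implementation. It assumes two rules (spell out all-caps abbreviations letter by letter, and put a space at a lowercase-to-uppercase boundary), combines them into one alternation, and compiles that once at import time. The names `_SPLIT_PATTERN`, `_expand`, and `split_for_tts` are hypothetical.

```python
import re

# Compiled once at import time and reused on every call.
# Alternative 1: a standalone run of two or more capital letters ("FAQ").
# Alternative 2: the empty position between a lowercase letter and a
#                capital letter ("MyPod" -> between "y" and "P").
_SPLIT_PATTERN = re.compile(r"\b[A-Z]{2,}\b|(?<=[a-z])(?=[A-Z])")


def _expand(match: re.Match) -> str:
    text = match.group(0)
    if text:
        # Abbreviation: separate the letters so each one is read aloud.
        return " ".join(text)
    # Empty match: a concatenated-word boundary, so insert a space.
    return " "


def split_for_tts(text: str) -> str:
    """Return `text` with abbreviations and joined words split apart."""
    return _SPLIT_PATTERN.sub(_expand, text)


print(split_for_tts("The FAQ covers MyPod"))  # -> The F A Q covers My Pod
```

A single combined pattern means each script is scanned once instead of once per rule. The trade-off is that the replacement function has to work out which alternative matched.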