---
title: AI Powered YouTube Transcript Tutor
emoji: 🚀
colorFrom: red
colorTo: red
sdk: streamlit
app_file: src/streamlit_app.py
app_port: 8501
tags:
  - streamlit
pinned: false
short_description: Ask questions about YouTube videos using AI
license: mit
---
# AI-Powered YouTube Transcript Tutor

A Streamlit application that transforms YouTube videos into interactive learning experiences. Paste a video URL, ask questions about its content, and get answers grounded in the video's transcript.
## Live Demo

Try the app now: https://ai-powered-youtube-transcript-tutor.streamlit.app/

Experience the full functionality with no setup required!
## Features

### Core Functionality
- YouTube Transcript Extraction: Automatically extracts transcripts from YouTube videos
- AI-Powered Q&A: Ask questions about video content and get intelligent responses
- Multi-language Support: Supports transcripts in multiple languages
- Video Metadata Display: Shows video information including title, author, duration, and views
### Enhanced UI/UX
- Modern Dark Theme: Clean, professional interface with dark theme
- Responsive Layout: Works seamlessly on desktop and mobile devices
- Loading Indicators: Visual feedback during processing
- Sidebar Navigation: Easy access to processed videos and settings
- Progress Bars: Real-time processing status updates
### Advanced Features
- Multiple Video Processing: Handle multiple videos in a single session
- Chat History: Persistent conversation history with export options
- Export Functionality: Export Q&A sessions as PDF, text, or JSON
- Transcript Download: Download video transcripts for offline use
- Fallback System: Works even when OpenAI API quota is exceeded
- Session Management: Advanced session state management
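Transcript extraction starts by pulling the video ID out of whatever URL the user pastes. A minimal sketch of that step (the helper name and regex are illustrative assumptions, not the app's actual code):

```python
import re
from typing import Optional

# Hypothetical helper: the app's real URL parsing may differ.
# Matches watch, short-link, Shorts, and embed URL forms.
_YOUTUBE_ID_RE = re.compile(r"(?:v=|youtu\.be/|shorts/|embed/)([A-Za-z0-9_-]{11})")

def extract_video_id(url: str) -> Optional[str]:
    """Return the 11-character YouTube video ID, or None if none is found."""
    match = _YOUTUBE_ID_RE.search(url)
    return match.group(1) if match else None
```

Once the ID is known, a library such as `youtube-transcript-api` can fetch the transcript for it.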
## Quick Start

> Tip: Want to try it first? Check out the live demo above; no installation required!

### Prerequisites

- Python 3.8 or higher
- An OpenAI API key
### Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/midlaj-muhammed/AI-Powered-YouTube-Transcript-Tutor.git
   cd AI-Powered-YouTube-Transcript-Tutor
   ```

2. Create a virtual environment:

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Set up environment variables:

   ```bash
   # Create .env file
   echo "OPENAI_API_KEY=your_openai_api_key_here" > .env
   ```

5. Run the application:

   ```bash
   streamlit run app.py
   ```
## Configuration

### Environment Variables

- `OPENAI_API_KEY`: Your OpenAI API key for AI-powered responses

### Streamlit Configuration

The app includes a custom Streamlit configuration in `.streamlit/config.toml` for optimal performance.
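The contents of `.streamlit/config.toml` are not reproduced here; for reference, a typical configuration for a dark-themed Streamlit app might look like the following (illustrative values, not necessarily this app's actual settings):

```toml
[theme]
base = "dark"
primaryColor = "#FF4B4B"

[server]
port = 8501
headless = true
```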
## Usage

1. Enter a YouTube URL: Paste any YouTube video URL in the input field
2. Process the video: Click "Process Video" to extract and analyze the transcript
3. Ask questions: Use the Q&A interface to ask about the video content
4. Export results: Export conversations in multiple formats
5. Manage sessions: Use the sidebar to navigate between processed videos
## Project Structure

```
AI-Powered-YouTube-Transcript-Tutor/
├── app.py                         # Main Streamlit application
├── requirements.txt               # Python dependencies
├── README.md                      # Project documentation
├── .env.example                   # Environment variables template
├── .streamlit/
│   └── config.toml                # Streamlit configuration
├── static/
│   └── style.css                  # Custom CSS styling
├── src/
│   ├── __init__.py
│   └── utils/
│       ├── __init__.py
│       ├── youtube_handler.py     # YouTube processing
│       ├── text_processor.py      # AI text processing
│       ├── session_manager.py     # Session management
│       ├── export_utils.py        # Export functionality
│       └── logger.py              # Logging utilities
├── config/
│   ├── __init__.py
│   └── settings.py                # Application settings
└── logs/                          # Application logs
```
## Deployment

### Hugging Face Spaces

This application is optimized for deployment on Hugging Face Spaces:

1. Create a new Space on Hugging Face
2. Choose the Streamlit SDK
3. Upload all project files
4. Set `OPENAI_API_KEY` in the Repository secrets
5. Your app will be live in minutes!

### Local Development

```bash
streamlit run app.py --server.port 8501
```
## Privacy & Security

- No Data Storage: Conversations are only stored in your browser session
- Secure Processing: All API calls are made securely
- Privacy First: No personal data is collected or stored
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

- Streamlit for the amazing web app framework
- LangChain for the AI/ML capabilities
- OpenAI for the powerful language models
- YouTube Transcript API for transcript extraction
## Support

If you encounter any issues or have questions, please open an issue.

Made with ❤️ using Streamlit and OpenAI
## Additional Features

- Caching System: Intelligent caching for improved performance
- Database Integration: SQLite database for storing processed videos and conversations

### Technical Improvements

- Error Handling: Comprehensive error handling and user feedback
- Input Validation: Robust YouTube URL validation
- Session Management: Advanced session state management
- Logging System: Detailed logging for debugging and monitoring
- Configuration Management: Flexible configuration via YAML and environment variables
## Advanced Configuration

### Environment Variables

Create a `.env` file based on `.env.template`:

```bash
# Required
OPENAI_API_KEY=your_openai_api_key_here

# Optional
LOG_LEVEL=INFO
CACHE_DIRECTORY=cache
DATABASE_PATH=data/chatbot.db
MAX_CACHE_SIZE_MB=500
```
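A settings loader for these variables can be sketched with just the standard library (the defaults below mirror the optional values above; the app's actual loader may differ and may use `python-dotenv`):

```python
import os

def load_settings() -> dict:
    """Read configuration from environment variables, with illustrative defaults."""
    return {
        "openai_api_key": os.environ.get("OPENAI_API_KEY"),  # required, no default
        "log_level": os.environ.get("LOG_LEVEL", "INFO"),
        "cache_directory": os.environ.get("CACHE_DIRECTORY", "cache"),
        "database_path": os.environ.get("DATABASE_PATH", "data/chatbot.db"),
        "max_cache_size_mb": int(os.environ.get("MAX_CACHE_SIZE_MB", "500")),
    }
```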
### Configuration File

Modify `config/config.yaml` to customize application behavior:

```yaml
app:
  title: "AI-Powered YouTube Transcript Tutor"
  description: "Ask questions from YouTube lecture transcripts using AI"

processing:
  default_chunk_size: 1000
  chunk_overlap: 200
  supported_languages: ["en", "es", "fr", "de", "it", "pt", "ru", "ja", "ko", "zh"]

ai:
  model_temperature: 0.7
  max_tokens: 2000
  retrieval_k: 4
```
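`default_chunk_size` and `chunk_overlap` control how the transcript is split before indexing. A simplified character-based splitter illustrating the idea (the app presumably uses a library text splitter such as LangChain's, not this exact code):

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list:
    """Split text into chunks of at most chunk_size characters,
    with each chunk repeating the last `overlap` characters of the previous one."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping `overlap` chars shared
    return chunks
```

The overlap keeps sentences that straddle a chunk boundary retrievable from both neighboring chunks.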
## Usage Guide

### Processing a Video

1. Enter YouTube URL: Paste a YouTube video URL in the input field
2. Click Process Video: The application will:
   - Extract the video transcript
   - Display video metadata
   - Create an AI knowledge base
   - Enable Q&A functionality
### Asking Questions
- Enter your question in the text input field
- Click Ask to get an AI-generated answer
- View source references to see which parts of the transcript were used
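Behind the scenes, the `retrieval_k` transcript chunks most relevant to the question are selected and passed to the model. A toy word-overlap ranker that illustrates the retrieval step (the app itself presumably uses vector embeddings; this heuristic is only for intuition):

```python
def top_k_chunks(question: str, chunks: list, k: int = 4) -> list:
    """Rank transcript chunks by how many words they share with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        chunks,
        key=lambda chunk: len(q_words & set(chunk.lower().split())),
        reverse=True,
    )
    return ranked[:k]
```

The chunks returned here are what the app shows as "source references" alongside each answer.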
### Managing Sessions
- View processed videos in the sidebar
- Switch between videos by clicking on video titles
- Export chat history in PDF, text, or JSON format
- Clear chat history using the sidebar button
### Advanced Features
- Language Selection: Choose transcript language in settings
- Export Options: Download transcripts and chat histories
- Cache Management: Automatic caching for improved performance
- Database Storage: Persistent storage of processed videos and conversations
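Exporting chat history as JSON can be as simple as serializing the list of question/answer turns; a sketch (the field names are assumptions, not the app's actual schema):

```python
import json

def export_history_json(history: list) -> str:
    """Serialize a list of {"question": ..., "answer": ...} dicts to pretty JSON."""
    return json.dumps({"conversation": history}, indent=2, ensure_ascii=False)
```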
## Docker Deployment

### Using Docker Compose (Recommended)

1. Create the environment file:

   ```bash
   cp .env.template .env
   # Add your OpenAI API key to .env
   ```

2. Build and run:

   ```bash
   docker-compose up -d
   ```

3. Access the application at http://localhost:8501

### Using Docker

1. Build the image:

   ```bash
   docker build -t youtube-chatbot .
   ```

2. Run the container:

   ```bash
   docker run -p 8501:8501 -e OPENAI_API_KEY=your_key_here youtube-chatbot
   ```
## Testing

Run the test suite:

```bash
# Install development dependencies
pip install -e .[dev]

# Run tests
pytest

# Run tests with coverage
pytest --cov=src

# Run a specific test file
pytest tests/test_youtube_handler.py
```
## Project Structure (Extended)

```
youtube-transcript-chatbot/
├── app.py                         # Main Streamlit application
├── src/                           # Source code
│   └── utils/                     # Utility modules
│       ├── youtube_handler.py     # YouTube operations
│       ├── text_processor.py      # Text processing and AI
│       ├── session_manager.py     # Session management
│       ├── export_utils.py        # Export functionality
│       ├── database.py            # Database operations
│       ├── cache_manager.py       # Caching system
│       └── logger.py              # Logging configuration
├── config/                        # Configuration files
│   ├── config.yaml                # Application configuration
│   └── settings.py                # Settings management
├── static/                        # Static assets
│   └── style.css                  # Custom CSS styles
├── tests/                         # Test files
├── requirements.txt               # Python dependencies
├── .env.template                  # Environment template
├── Dockerfile                     # Docker configuration
├── docker-compose.yml             # Docker Compose configuration
└── README.md                      # This file
```
## Troubleshooting

### Common Issues

**OpenAI API Key Error**
- Ensure your API key is correctly set in the `.env` file
- Check that you have sufficient API credits

**YouTube Video Not Found**
- Verify the URL is correct and the video is public
- Some videos may have transcripts disabled

**Transcript Not Available**
- Try selecting a different language in settings
- Some videos may not have auto-generated transcripts

**Performance Issues**
- Clear the cache using the sidebar option
- Reduce the chunk size in configuration
- Check available disk space

### Getting Help

- Check the logs in the `logs/` directory
- Enable debug mode by setting `LOG_LEVEL=DEBUG` in `.env`
- Review the application configuration in `config/config.yaml`
## Deployment Options

### Local Development

- Use `streamlit run app.py` for development
- Enable debug mode for detailed logging

### Production Deployment

**Streamlit Cloud**
1. Push the code to a GitHub repository
2. Connect the repository to Streamlit Cloud
3. Add your environment variables as secrets

**Heroku**
1. Create a `Procfile` containing: `web: streamlit run app.py --server.port=$PORT`
2. Set environment variables in the Heroku dashboard
3. Deploy using Git or GitHub integration

**AWS/GCP/Azure**
1. Use Docker container deployment
2. Set up a load balancer for high availability
3. Configure environment variables in the cloud console
## Performance Tips

### Optimization Recommendations
- Use caching: Enable vectorstore caching for frequently accessed videos
- Adjust chunk size: Smaller chunks (500-800) for better precision, larger (1200-1500) for broader context
- Monitor memory: Clear cache periodically for long-running sessions
- Database maintenance: Regularly clean up old conversations and videos
### Scaling Considerations
- Horizontal scaling: Use multiple instances behind a load balancer
- Database optimization: Consider PostgreSQL for high-volume deployments
- Caching layer: Implement Redis for distributed caching
- API rate limiting: Monitor OpenAI API usage and implement rate limiting
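The last point can be illustrated with a minimal sliding-window rate limiter (the limit and window below are illustrative; production code would track usage per API key and likely live in middleware):

```python
import time
from collections import deque

class RateLimiter:
    """Allow at most `max_calls` calls in any `window`-second interval."""

    def __init__(self, max_calls: int, window: float):
        self.max_calls = max_calls
        self.window = window
        self.calls = deque()  # timestamps of recent allowed calls

    def allow(self, now: float = None) -> bool:
        """Return True and record the call if it fits in the current window."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False
```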