Spaces:

midlajvalappil
/

AI-Powered-YouTube-Transcript-Tutor

Sleeping

App Files Files Community

AI-Powered-YouTube-Transcript-Tutor / README.md

midlajvalappil

Update README.md

3a1632a verified 8 months ago

preview code

raw

history blame contribute delete

15.3 kB

A newer version of the Streamlit SDK is available: 1.55.0

Upgrade

metadata

title: AI Powered YouTube Transcript Tutor
emoji: 🚀
colorFrom: red
colorTo: red
sdk: streamlit
app_file: src/streamlit_app.py
app_port: 8501
tags:
  - streamlit
pinned: false
short_description: Streamlit template space
license: mit

🎓 AI-Powered YouTube Transcript Tutor

A sophisticated Streamlit application that transforms YouTube videos into interactive learning experiences using AI. Ask questions about video content and get intelligent answers based on the transcript.

🚀 Live Demo

Try the app now: https://ai-powered-youtube-transcript-tutor.streamlit.app/

Experience the full functionality without any setup required!

🌟 Features

Core Functionality

YouTube Transcript Extraction: Automatically extracts transcripts from YouTube videos
AI-Powered Q&A: Ask questions about video content and get intelligent responses
Multi-language Support: Supports transcripts in multiple languages
Video Metadata Display: Shows video information including title, author, duration, and views

Enhanced UI/UX

Modern Dark Theme: Clean, professional interface with dark theme
Responsive Layout: Works seamlessly on desktop and mobile devices
Loading Indicators: Visual feedback during processing
Sidebar Navigation: Easy access to processed videos and settings
Progress Bars: Real-time processing status updates

Advanced Features

Multiple Video Processing: Handle multiple videos in a single session
Chat History: Persistent conversation history with export options
Export Functionality: Export Q&A sessions as PDF, text, or JSON
Transcript Download: Download video transcripts for offline use
Fallback System: Works even when OpenAI API quota is exceeded
Session Management: Advanced session state management

🚀 Quick Start

💡 Want to try it first? Check out the live demo - no installation required!

Prerequisites

Python 3.8 or higher
OpenAI API key

Installation

Clone the repository

git clone https://github.com/midlaj-muhammed/AI-Powered-YouTube-Transcript-Tutor.git
cd AI-Powered-YouTube-Transcript-Tutor

Create virtual environment

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies
```
pip install -r requirements.txt
```

Set up environment variables

# Create .env file
echo "OPENAI_API_KEY=your_openai_api_key_here" > .env

Run the application
```
streamlit run app.py
```

🔧 Configuration

Environment Variables

OPENAI_API_KEY: Your OpenAI API key for AI-powered responses

Streamlit Configuration

The app includes custom Streamlit configuration in .streamlit/config.toml for optimal performance.

📱 Usage

Enter YouTube URL: Paste any YouTube video URL in the input field
Process Video: Click "🚀 Process Video" to extract and analyze the transcript
Ask Questions: Use the Q&A interface to ask about the video content
Export Results: Export conversations in multiple formats
Manage Sessions: Use sidebar to navigate between processed videos

🏗️ Project Structure

AI-Powered-YouTube-Transcript-Tutor/
├── app.py                      # Main Streamlit application
├── requirements.txt            # Python dependencies
├── README.md                   # Project documentation
├── .env.example               # Environment variables template
├── .streamlit/
│   └── config.toml            # Streamlit configuration
├── static/
│   └── style.css              # Custom CSS styling
├── src/
│   ├── __init__.py
│   └── utils/
│       ├── __init__.py
│       ├── youtube_handler.py  # YouTube processing
│       ├── text_processor.py   # AI text processing
│       ├── session_manager.py  # Session management
│       ├── export_utils.py     # Export functionality
│       └── logger.py          # Logging utilities
├── config/
│   ├── __init__.py
│   └── settings.py            # Application settings
└── logs/                      # Application logs

🌐 Deployment

Hugging Face Spaces

This application is optimized for deployment on Hugging Face Spaces:

Create a new Space on Hugging Face
Choose Streamlit SDK
Upload all project files
Set OPENAI_API_KEY in Repository secrets
Your app will be live in minutes!

Local Development

streamlit run app.py --server.port 8501

🔒 Privacy & Security

No Data Storage: Conversations are only stored in your browser session
Secure Processing: All API calls are made securely
Privacy First: No personal data is collected or stored

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Streamlit for the amazing web app framework
OpenAI for the powerful AI capabilities
YouTube Transcript API for transcript extraction

📞 Support

If you encounter any issues or have questions, please open an issue.

Made with ❤️ using Streamlit and OpenAI

✨ Features

Core Functionality

YouTube Transcript Extraction: Automatically extracts transcripts from YouTube videos
AI-Powered Q&A: Ask questions about video content and get intelligent responses
Multi-language Support: Supports transcripts in multiple languages
Video Metadata Display: Shows video information including title, author, duration, and views

Enhanced UI/UX

Modern Design: Clean, professional interface with custom CSS styling
Responsive Layout: Works seamlessly on desktop and mobile devices
Loading Indicators: Visual feedback during processing
Sidebar Navigation: Easy access to processed videos and settings
Progress Bars: Real-time processing status updates

Advanced Features

Multiple Video Processing: Handle multiple videos in a single session
Chat History: Persistent conversation history with export options
Export Functionality: Export Q&A sessions as PDF, text, or JSON
Transcript Download: Download video transcripts for offline use
Caching System: Intelligent caching for improved performance
Database Integration: SQLite database for storing processed videos and conversations

Technical Improvements

Error Handling: Comprehensive error handling and user feedback
Input Validation: Robust YouTube URL validation
Session Management: Advanced session state management
Logging System: Detailed logging for debugging and monitoring
Configuration Management: Flexible configuration via YAML and environment variables

🚀 Quick Start

Prerequisites

Python 3.8 or higher
OpenAI API key
Git (for cloning the repository)

Installation

Clone the repository

git clone https://github.com/yourusername/youtube-transcript-chatbot.git
cd youtube-transcript-chatbot

Create a virtual environment

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies
```
pip install -r requirements.txt
```

Set up environment variables

cp .env.template .env
# Edit .env file and add your OpenAI API key

Run the application
```
streamlit run app.py
```
Open your browser Navigate to http://localhost:8501

🔧 Configuration

Environment Variables

Create a .env file based on .env.template:

# Required
OPENAI_API_KEY=your_openai_api_key_here

# Optional
LOG_LEVEL=INFO
CACHE_DIRECTORY=cache
DATABASE_PATH=data/chatbot.db
MAX_CACHE_SIZE_MB=500

Configuration File

Modify config/config.yaml to customize application behavior:

app:
  title: "AI-Powered YouTube Transcript Tutor"
  description: "Ask questions from YouTube lecture transcripts using AI"

processing:
  default_chunk_size: 1000
  chunk_overlap: 200
  supported_languages: ["en", "es", "fr", "de", "it", "pt", "ru", "ja", "ko", "zh"]

ai:
  model_temperature: 0.7
  max_tokens: 2000
  retrieval_k: 4

📖 Usage Guide

Processing a Video

Enter YouTube URL: Paste a YouTube video URL in the input field
Click Process Video: The application will:
- Extract the video transcript
- Display video metadata
- Create an AI knowledge base
- Enable Q&A functionality

Asking Questions

Enter your question in the text input field
Click Ask to get an AI-generated answer
View source references to see which parts of the transcript were used

Managing Sessions

View processed videos in the sidebar
Switch between videos by clicking on video titles
Export chat history in PDF, text, or JSON format
Clear chat history using the sidebar button

Advanced Features

Language Selection: Choose transcript language in settings
Export Options: Download transcripts and chat histories
Cache Management: Automatic caching for improved performance
Database Storage: Persistent storage of processed videos and conversations

🐳 Docker Deployment

Using Docker Compose (Recommended)

Create environment file

cp .env.template .env
# Add your OpenAI API key to .env

Build and run
```
docker-compose up -d
```
Access the application Open http://localhost:8501

Using Docker

Build the image
```
docker build -t youtube-chatbot .
```

Run the container

docker run -p 8501:8501 -e OPENAI_API_KEY=your_key_here youtube-chatbot

🧪 Testing

Run the test suite:

# Install development dependencies
pip install -e .[dev]

# Run tests
pytest

# Run tests with coverage
pytest --cov=src

# Run specific test file
pytest tests/test_youtube_handler.py

📁 Project Structure

youtube-transcript-chatbot/
├── app.py                      # Main Streamlit application
├── src/                        # Source code
│   ├── utils/                  # Utility modules
│   │   ├── youtube_handler.py  # YouTube operations
│   │   ├── text_processor.py   # Text processing and AI
│   │   ├── session_manager.py  # Session management
│   │   ├── export_utils.py     # Export functionality
│   │   ├── database.py         # Database operations
│   │   ├── cache_manager.py    # Caching system
│   │   └── logger.py           # Logging configuration
├── config/                     # Configuration files
│   ├── config.yaml            # Application configuration
│   └── settings.py            # Settings management
├── static/                     # Static assets
│   └── style.css              # Custom CSS styles
├── tests/                      # Test files
├── requirements.txt           # Python dependencies
├── .env.template             # Environment template
├── Dockerfile                # Docker configuration
├── docker-compose.yml        # Docker Compose configuration
└── README.md                 # This file

🔍 Troubleshooting

Common Issues

OpenAI API Key Error
- Ensure your API key is correctly set in the .env file
- Check that you have sufficient API credits
YouTube Video Not Found
- Verify the URL is correct and the video is public
- Some videos may have transcripts disabled
Transcript Not Available
- Try selecting a different language in settings
- Some videos may not have auto-generated transcripts
Performance Issues
- Clear cache using the sidebar option
- Reduce chunk size in configuration
- Check available disk space

Getting Help

Check the logs in the logs/ directory
Enable debug mode by setting LOG_LEVEL=DEBUG in .env
Review the application configuration in config/config.yaml

🚀 Deployment Options

Local Development

Use streamlit run app.py for development
Enable debug mode for detailed logging

Production Deployment

Streamlit Cloud

Push code to GitHub repository
Connect to Streamlit Cloud
Add secrets for environment variables

Heroku

Create Procfile: web: streamlit run app.py --server.port=$PORT
Set environment variables in Heroku dashboard
Deploy using Git or GitHub integration

AWS/GCP/Azure

Use Docker container deployment
Set up load balancer for high availability
Configure environment variables in cloud console

🤝 Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Streamlit for the amazing web framework
LangChain for AI/ML capabilities
OpenAI for the language models
YouTube Transcript API for transcript extraction

📊 Performance Tips

Optimization Recommendations

Use caching: Enable vectorstore caching for frequently accessed videos
Adjust chunk size: Smaller chunks (500-800) for better precision, larger (1200-1500) for broader context
Monitor memory: Clear cache periodically for long-running sessions
Database maintenance: Regularly clean up old conversations and videos

Scaling Considerations

Horizontal scaling: Use multiple instances behind a load balancer
Database optimization: Consider PostgreSQL for high-volume deployments
Caching layer: Implement Redis for distributed caching
API rate limiting: Monitor OpenAI API usage and implement rate limiting

📞 Support

For support, please open an issue on GitHub or contact the development team.

Made with ❤️ by the YouTube Transcript Chatbot Team