---
title: YouTube Creator MetaData Extractor
emoji: 🎬
colorFrom: red
colorTo: purple
sdk: gradio
sdk_version: 5.32.0
app_file: app.py
pinned: false
license: mit
---

# 🎬 YouTube Creator MetaData Extractor

AI-powered tool for content creators to analyze YouTube videos and generate professional metadata using advanced language models.

## 🚀 Features

- **🔍 Video Search**: Search YouTube videos by keywords with advanced filters
- **📊 Video Analysis**: Extract comprehensive video metadata (views, likes, duration, etc.)
- **📝 Transcript Extraction**: Get video transcripts in multiple languages
- **⏱️ Smart Timecodes**: AI-generated timecodes for better video navigation
- **🤖 Gemini AI Integration**: Advanced timecode generation using Google's Gemini 2.0
- **🌐 Multi-language Support**: Works with videos in Ukrainian, Russian, English, and more
- **📱 URL Flexibility**: Supports all YouTube URL formats (regular, shorts, embed links)

## ⚠️ Cloud Platform Limitations

**YouTube blocks transcript access from cloud IPs** (Hugging Face Spaces, AWS, etc.)

**What works on HF Spaces:**

- ✅ Video Search
- ✅ Video Metadata
- ❌ Transcripts (limited)
- ❌ AI Timecodes (limited)

**For full functionality**, download and run locally:

```bash
git clone https://huggingface.co/spaces/dzianisBY/YouTube_Creator_MetaData
cd YouTube_Creator_MetaData
pip install -r requirements.txt
# Add your API keys to .env file
python main.py
```

## 🛠️ Setup

### Required API Keys

To use this tool, you need two API keys:

1. **YouTube Data API v3 Key**
   - Go to [Google Cloud Console](https://console.developers.google.com/)
   - Create a new project or select existing
   - Enable "YouTube Data API v3"
   - Create credentials (API Key)

2. **Gemini API Key** (for AI features)
   - Visit [Google AI Studio](https://ai.google.dev/)
   - Get your free API key for Gemini

### Environment Variables

Set these in your Hugging Face Space settings:

```
YOUTUBE_API_KEY=your_youtube_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
```

## 📖 How to Use

### 1. Video Search

- Enter keywords to find YouTube videos
- Filter by upload date, view count, duration
- Get detailed metadata for any video

### 2. Transcript Analysis

- Extract transcripts from videos with subtitles
- Support for auto-generated and manual captions
- Multiple language detection and support

### 3. Timecode Generation

**Basic Timecodes**: Algorithmic segmentation based on transcript timing

**AI Timecodes**: Intelligent topic-based segmentation using Gemini AI

**Supported Formats**:

- **YouTube**: Ready for video descriptions (e.g., `05:30 Topic description`)
- **Markdown**: Clickable links with timestamps (e.g., `- [05:30](link) Topic`)

**Language Codes**:

- `uk` - Ukrainian
- `ru` - Russian
- `en` - English
- And many others (ISO 639-1 standard)

## 🔧 API Reference

This application provides both a web interface and REST API endpoints:

### Search Videos

```http
POST /api/search
{
  "query": "your search query",
  "max_results": 10,
  "order": "relevance"
}
```

### Get Video Info

```http
POST /api/video_info
{
  "video_id": "video_id_or_full_url"
}
```

### Extract Transcript

```http
POST /api/transcript
{
  "video_id": "video_id_or_full_url",
  "language_code": "uk"
}
```

### Generate AI Timecodes

```http
POST /api/gemini_timecodes
{
  "video_id": "video_id_or_full_url",
  "language_code": "uk",
  "format": "youtube",
  "model": "gemini-2.0-flash-001"
}
```

## 🏗️ Architecture

- **Frontend**: Gradio web interface with responsive design
- **Backend**: FastAPI server with async processing
- **AI Integration**: Google Gemini 2.0 for intelligent content analysis
- **APIs**: YouTube Data API v3 for video metadata
- **Transcript**: YouTube Transcript API for subtitle extraction

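
### Example: Calling the API from Python

The REST endpoints listed in the API Reference accept JSON bodies over POST, so a small client can be sketched with the Python standard library alone. This is a minimal sketch and not part of the project: the base URL `http://localhost:8000` is an assumption about where the FastAPI backend is listening, the helper names `build_request` and `call_api` are hypothetical, and the shape of the JSON response is not documented here, so it is returned as-is.

```python
import json
import urllib.request

# Assumption: the FastAPI backend is reachable at this address.
# Adjust it to match how you started the server.
BASE_URL = "http://localhost:8000"


def build_request(endpoint: str, payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON POST request for one of the documented endpoints (hypothetical helper)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}{endpoint}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_api(endpoint: str, payload: dict, base_url: str = BASE_URL, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (hypothetical helper).

    The response schema is not described in this README, so the decoded
    JSON is returned without further interpretation.
    """
    request = build_request(endpoint, payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `call_api("/api/search", {"query": "gradio tutorial", "max_results": 10, "order": "relevance"})` would send the search request shown in the API Reference. The other endpoints can be called the same way by swapping the path and payload fields.
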
## 📁 Project Structure

```
├── main.py                              # Unified launcher (API/UI/both modes)
├── run_telegram_bot.py                  # Telegram bot launcher
├── api_server.py                        # FastAPI backend server
├── telegram_bot.py                      # Telegram bot implementation
├── mcp_handlers.py                      # Model Context Protocol handlers
├── gemini_helper.py                     # Gemini AI integration
├── utils.py                             # Utility functions
├── models.py                            # Data models
├── app.py                               # Gradio app (HF Spaces entry point)
├── gradio_app.py                        # Extended Gradio interface
├── requirements.txt                     # Python dependencies
├── telegram_requirements.txt            # Telegram bot dependencies
├── cloudflare-config.yml                # Cloudflare tunnel configuration
├── TUNNEL_SOLUTIONS.md                  # Tunnel troubleshooting guide
├── youtube-content-metagen-agent.ipynb  # Kaggle reference notebook
└── README.md                            # This file
```

## 🔬 Technology Stack

- **Python 3.13+**
- **Gradio** - Web interface framework
- **FastAPI** - High-performance API framework
- **Google Gemini 2.0** - Advanced language model for content analysis
- **YouTube APIs** - Official Google APIs for video data
- **AsyncIO** - Asynchronous processing for better performance

## 🌟 Use Cases

- **Content Creators**: Generate professional timecodes for YouTube videos
- **Educators**: Extract and analyze educational content structure
- **Researchers**: Analyze video metadata and transcripts at scale
- **Marketers**: Research competitor content and trends
- **Accessibility**: Create better navigation for long-form content

## 📄 License

MIT License - feel free to use in your projects!

## 🤝 Contributing

Contributions welcome! This project is designed to help content creators worldwide.

---

**Made with ❤️ for the YouTube creator community**