File size: 5,815 Bytes
ab2045a e014546 ab2045a d619c43 7cc4e00 d619c43 e775565 d619c43 e014546 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 |
---
title: YouTube Creator MetaData Extractor
emoji: π¬
colorFrom: red
colorTo: purple
sdk: gradio
sdk_version: 5.32.0
app_file: app.py
pinned: false
license: mit
---
# π¬ YouTube Creator MetaData Extractor
AI-powered tool for content creators to analyze YouTube videos and generate professional metadata using advanced language models.
## π Features
- **π Video Search**: Search YouTube videos by keywords with advanced filters
- **π Video Analysis**: Extract comprehensive video metadata (views, likes, duration, etc.)
- **π Transcript Extraction**: Get video transcripts in multiple languages
- **β±οΈ Smart Timecodes**: AI-generated timecodes for better video navigation
- **π€ Gemini AI Integration**: Advanced timecode generation using Google's Gemini 2.0
- **π Multi-language Support**: Works with videos in Ukrainian, Russian, English, and more
- **π± URL Flexibility**: Supports all YouTube URL formats (regular, shorts, embed links)
## β οΈ Cloud Platform Limitations
**YouTube blocks transcript access from cloud IPs** (Hugging Face Spaces, AWS, etc.)
**What works on HF Spaces:**
- β
Video Search
- β
Video Metadata
- β Transcripts (limited)
- β AI Timecodes (limited)
**For full functionality**, download and run locally:
```bash
git clone https://huggingface.co/spaces/dzianisBY/YouTube_Creator_MetaData
cd YouTube_Creator_MetaData
pip install -r requirements.txt
# Add your API keys to .env file
python main.py
```
## π οΈ Setup
### Required API Keys
To use this tool, you need two API keys:
1. **YouTube Data API v3 Key**
- Go to [Google Cloud Console](https://console.developers.google.com/)
- Create a new project or select existing
- Enable "YouTube Data API v3"
- Create credentials (API Key)
2. **Gemini API Key** (for AI features)
- Visit [Google AI Studio](https://ai.google.dev/)
- Get your free API key for Gemini
### Environment Variables
Set these in your Hugging Face Space settings:
```
YOUTUBE_API_KEY=your_youtube_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
```
## π How to Use
### 1. Video Search
- Enter keywords to find YouTube videos
- Filter by upload date, view count, duration
- Get detailed metadata for any video
### 2. Transcript Analysis
- Extract transcripts from videos with subtitles
- Support for auto-generated and manual captions
- Multiple language detection and support
### 3. Timecode Generation
**Basic Timecodes**: Algorithmic segmentation based on transcript timing
**AI Timecodes**: Intelligent topic-based segmentation using Gemini AI
**Supported Formats**:
- **YouTube**: Ready for video descriptions (e.g., `05:30 Topic description`)
- **Markdown**: Clickable links with timestamps (e.g., `- [05:30](link) Topic`)
**Language Codes**:
- `uk` - Ukrainian
- `ru` - Russian
- `en` - English
- And many others (ISO 639-1 standard)
## π§ API Reference
This application provides both a web interface and REST API endpoints:
### Search Videos
```http
POST /api/search
{
"query": "your search query",
"max_results": 10,
"order": "relevance"
}
```
### Get Video Info
```http
POST /api/video_info
{
"video_id": "video_id_or_full_url"
}
```
### Extract Transcript
```http
POST /api/transcript
{
"video_id": "video_id_or_full_url",
"language_code": "uk"
}
```
### Generate AI Timecodes
```http
POST /api/gemini_timecodes
{
"video_id": "video_id_or_full_url",
"language_code": "uk",
"format": "youtube",
"model": "gemini-2.0-flash-001"
}
```
## ποΈ Architecture
- **Frontend**: Gradio web interface with responsive design
- **Backend**: FastAPI server with async processing
- **AI Integration**: Google Gemini 2.0 for intelligent content analysis
- **APIs**: YouTube Data API v3 for video metadata
- **Transcript**: YouTube Transcript API for subtitle extraction
## π Project Structure
```
βββ main.py # Unified launcher (API/UI/both modes)
βββ run_telegram_bot.py # Telegram bot launcher
βββ api_server.py # FastAPI backend server
βββ telegram_bot.py # Telegram bot implementation
βββ mcp_handlers.py # Model Context Protocol handlers
βββ gemini_helper.py # Gemini AI integration
βββ utils.py # Utility functions
βββ models.py # Data models
βββ app.py # Gradio app (HF Spaces entry point)
βββ gradio_app.py # Extended Gradio interface
βββ requirements.txt # Python dependencies
βββ telegram_requirements.txt # Telegram bot dependencies
βββ cloudflare-config.yml # Cloudflare tunnel configuration
βββ TUNNEL_SOLUTIONS.md # Tunnel troubleshooting guide
βββ youtube-content-metagen-agent.ipynb # Kaggle reference notebook
βββ README.md # This file
```
## π¬ Technology Stack
- **Python 3.13+**
- **Gradio** - Web interface framework
- **FastAPI** - High-performance API framework
- **Google Gemini 2.0** - Advanced language model for content analysis
- **YouTube APIs** - Official Google APIs for video data
- **AsyncIO** - Asynchronous processing for better performance
## π Use Cases
- **Content Creators**: Generate professional timecodes for YouTube videos
- **Educators**: Extract and analyze educational content structure
- **Researchers**: Analyze video metadata and transcripts at scale
- **Marketers**: Research competitor content and trends
- **Accessibility**: Create better navigation for long-form content
## π License
MIT License - feel free to use in your projects!
## π€ Contributing
Contributions welcome! This project is designed to help content creators worldwide.
---
**Made with β€οΈ for the YouTube creator community** |