title: YouTube Creator MetaData Extractor
emoji: 🎬
colorFrom: red
colorTo: purple
sdk: gradio
sdk_version: 5.32.0
app_file: app.py
pinned: false
license: mit
YouTube Creator MetaData Extractor
AI-powered tool for content creators to analyze YouTube videos and generate professional metadata using advanced language models.
Features
- Video Search: Search YouTube videos by keyword with advanced filters
- Video Analysis: Extract comprehensive video metadata (views, likes, duration, etc.)
- Transcript Extraction: Get video transcripts in multiple languages
- Smart Timecodes: AI-generated timecodes for better video navigation
- Gemini AI Integration: Advanced timecode generation using Google's Gemini 2.0
- Multi-language Support: Works with videos in Ukrainian, Russian, English, and more
- URL Flexibility: Supports all YouTube URL formats (regular, shorts, embed links)
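To illustrate what handling regular, shorts, and embed links involves, here is a minimal sketch of video ID extraction. The function name `extract_video_id` is hypothetical and is not taken from this project's code. The sketch also assumes video IDs are 11 characters from the URL-safe alphabet, which is a widely observed convention rather than a documented guarantee.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters from the URL-safe base64 alphabet.
ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{11}")

# Hosts treated as YouTube in this sketch (after stripping "www." / "m.").
YOUTUBE_HOSTS = ("youtube.com", "youtube-nocookie.com", "music.youtube.com")


def extract_video_id(url_or_id: str) -> Optional[str]:
    """Return the video ID from a bare ID or a common YouTube URL, else None."""
    text = url_or_id.strip()
    if ID_PATTERN.fullmatch(text):
        return text  # already a bare ID

    # Accept URLs written without a scheme, e.g. "youtube.com/watch?v=...".
    parsed = urlparse(text if "://" in text else "https://" + text)
    host = (parsed.hostname or "").lower()
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]

    parts = [segment for segment in parsed.path.split("/") if segment]
    candidate = None
    if host == "youtu.be" and parts:
        candidate = parts[0]  # short link: youtu.be/<id>
    elif host in YOUTUBE_HOSTS:
        if parts and parts[0] == "watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif len(parts) >= 2 and parts[0] in ("shorts", "embed", "live", "v"):
            candidate = parts[1]  # e.g. /shorts/<id> or /embed/<id>

    if candidate and ID_PATTERN.fullmatch(candidate):
        return candidate
    return None
```

Returning None for unrecognized input lets the caller decide whether to show an error or treat the text as a search query.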
Cloud Platform Limitations
YouTube blocks transcript requests from cloud provider IP addresses (Hugging Face Spaces, AWS, etc.), so transcript-dependent features are limited when the app runs in the cloud.
What works on HF Spaces:
- Video Search
- Video Metadata
- Transcripts (limited)
- AI Timecodes (limited)
For full functionality, download and run locally:
git clone https://huggingface.co/spaces/dzianisBY/YouTube_Creator_MetaData
cd YouTube_Creator_MetaData
pip install -r requirements.txt
# Add your API keys to .env file
python main.py
Setup
Required API Keys
To use this tool, you need two API keys:
YouTube Data API v3 Key
- Go to Google Cloud Console
- Create a new project or select an existing one
- Enable "YouTube Data API v3"
- Create credentials (API Key)
Gemini API Key (for AI features)
- Visit Google AI Studio
- Get your free API key for Gemini
Environment Variables
Set these in your Hugging Face Space settings:
YOUTUBE_API_KEY=your_youtube_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
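As a rough illustration of how an application might read these two variables at startup, the sketch below checks that both are present and fails early with a clear message. The helper name `load_api_keys` is hypothetical; the project's own loading code may differ, for example by reading a .env file first.

```python
import os
from typing import Dict, Mapping

# The two variable names listed above.
REQUIRED_KEYS = ("YOUTUBE_API_KEY", "GEMINI_API_KEY")


def load_api_keys(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required API keys, or raise if any is missing or blank."""
    missing = [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    # Strip stray whitespace that often sneaks in when pasting keys.
    return {name: env[name].strip() for name in REQUIRED_KEYS}
```

Passing the environment in as a mapping keeps the function easy to test without touching the real process environment.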
How to Use
1. Video Search
- Enter keywords to find YouTube videos
- Filter by upload date, view count, duration
- Get detailed metadata for any video
2. Transcript Analysis
- Extract transcripts from videos with subtitles
- Support for auto-generated and manual captions
- Multiple language detection and support
3. Timecode Generation
- Basic Timecodes: Algorithmic segmentation based on transcript timing
- AI Timecodes: Intelligent topic-based segmentation using Gemini AI
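The timing-based mode could be sketched roughly as follows. This is not the project's actual algorithm: the function name `basic_timecodes`, the fixed 120-second interval, and the transcript shape (a list of dicts with `start` in seconds and `text`) are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def basic_timecodes(
    entries: List[Dict],
    interval: float = 120.0,
    preview_words: int = 6,
) -> List[Tuple[float, str]]:
    """Pick the first transcript entry at or after each interval boundary.

    Each result is (start_seconds, label), where the label is the first
    few words of the entry that opens the segment.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")

    marks = []
    next_boundary = 0.0
    for entry in sorted(entries, key=lambda item: item["start"]):
        if entry["start"] >= next_boundary:
            label = " ".join(entry["text"].split()[:preview_words])
            marks.append((entry["start"], label))
            # Skip ahead to the boundary that follows this entry.
            next_boundary = (entry["start"] // interval + 1) * interval
    return marks


transcript = [
    {"start": 0.0, "text": "Welcome to the channel"},
    {"start": 45.0, "text": "Today we cover setup"},
    {"start": 130.5, "text": "First install the dependencies"},
    {"start": 250.0, "text": "Now configure the API keys"},
]
marks = basic_timecodes(transcript)
```

With these four entries and the default interval, the entry at 45 seconds is skipped because it falls inside the first two-minute segment.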
Supported Formats:
- YouTube: Ready for video descriptions (e.g., 05:30 Topic description)
- Markdown: Clickable links with timestamps (e.g., - [05:30](link) Topic)
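The two output formats can be sketched as below. The function names are hypothetical, and the Markdown link target is an assumption: it uses the common watch URL with a `t` parameter in seconds, which may not match the links this tool actually generates.

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as MM:SS, or H:MM:SS once the video passes one hour."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def format_timecode(
    seconds: float, topic: str, fmt: str = "youtube", video_id: str = ""
) -> str:
    """Format one timecode line in the 'youtube' or 'markdown' style."""
    stamp = format_timestamp(seconds)
    if fmt == "youtube":
        # Plain "MM:SS Topic" line, suitable for a video description.
        return f"{stamp} {topic}"
    if fmt == "markdown":
        # Assumed link shape: watch URL that jumps to the given second.
        link = f"https://www.youtube.com/watch?v={video_id}&t={int(seconds)}s"
        return f"- [{stamp}]({link}) {topic}"
    raise ValueError(f"unsupported format: {fmt!r}")
```

For example, `format_timecode(330, "Topic description")` yields the YouTube-style line shown in the list above.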
Language Codes:
- uk - Ukrainian
- ru - Russian
- en - English
- And many others (ISO 639-1 standard)
API Reference
This application provides both a web interface and REST API endpoints:
Search Videos
POST /api/search
{
"query": "your search query",
"max_results": 10,
"order": "relevance"
}
Get Video Info
POST /api/video_info
{
"video_id": "video_id_or_full_url"
}
Extract Transcript
POST /api/transcript
{
"video_id": "video_id_or_full_url",
"language_code": "uk"
}
Generate AI Timecodes
POST /api/gemini_timecodes
{
"video_id": "video_id_or_full_url",
"language_code": "uk",
"format": "youtube",
"model": "gemini-2.0-flash-001"
}
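A client for the endpoints above might build its requests as in the sketch below, which uses only the Python standard library. The base URL `http://localhost:8000` is an assumption for a local run, and `build_request` is a hypothetical helper. The response format is not documented here, so the sketch stops short of parsing a response.

```python
import json
from urllib.request import Request

# Assumption: the API server is running locally on port 8000.
BASE_URL = "http://localhost:8000"


def build_request(endpoint: str, payload: dict, base_url: str = BASE_URL) -> Request:
    """Build a JSON POST request for one of the API endpoints."""
    return Request(
        url=base_url.rstrip("/") + endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


search_request = build_request(
    "/api/search",
    {"query": "python tutorial", "max_results": 10, "order": "relevance"},
)

timecode_request = build_request(
    "/api/gemini_timecodes",
    {
        "video_id": "video_id_or_full_url",
        "language_code": "uk",
        "format": "youtube",
        "model": "gemini-2.0-flash-001",
    },
)

# To send a request once the server is running:
#   from urllib.request import urlopen
#   with urlopen(search_request) as response:
#       result = json.load(response)
```

Building the request separately from sending it makes the payloads easy to inspect and test without a running server.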
Architecture
- Frontend: Gradio web interface with responsive design
- Backend: FastAPI server with async processing
- AI Integration: Google Gemini 2.0 for intelligent content analysis
- APIs: YouTube Data API v3 for video metadata
- Transcript: YouTube Transcript API for subtitle extraction
Project Structure
├── main.py                              # Unified launcher (API/UI/both modes)
├── run_telegram_bot.py                  # Telegram bot launcher
├── api_server.py                        # FastAPI backend server
├── telegram_bot.py                      # Telegram bot implementation
├── mcp_handlers.py                      # Model Context Protocol handlers
├── gemini_helper.py                     # Gemini AI integration
├── utils.py                             # Utility functions
├── models.py                            # Data models
├── app.py                               # Gradio app (HF Spaces entry point)
├── gradio_app.py                        # Extended Gradio interface
├── requirements.txt                     # Python dependencies
├── telegram_requirements.txt            # Telegram bot dependencies
├── cloudflare-config.yml                # Cloudflare tunnel configuration
├── TUNNEL_SOLUTIONS.md                  # Tunnel troubleshooting guide
├── youtube-content-metagen-agent.ipynb  # Kaggle reference notebook
└── README.md                            # This file
Technology Stack
- Python 3.13+
- Gradio - Web interface framework
- FastAPI - High-performance API framework
- Google Gemini 2.0 - Advanced language model for content analysis
- YouTube APIs - Official Google APIs for video data
- AsyncIO - Asynchronous processing for better performance
Use Cases
- Content Creators: Generate professional timecodes for YouTube videos
- Educators: Extract and analyze educational content structure
- Researchers: Analyze video metadata and transcripts at scale
- Marketers: Research competitor content and trends
- Accessibility: Create better navigation for long-form content
License
MIT License - feel free to use in your projects!
Contributing
Contributions welcome! This project is designed to help content creators worldwide.
Made with ❤️ for the YouTube creator community