metadata
title: Video Analyzer
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.2.0
python_version: '3.11'
app_file: app.py
pinned: false
license: mit
suggested_hardware: zero-a10g
suggested_storage: small
hf_oauth: true
hf_oauth_scopes:
  - inference-api
hf_oauth_expiration_minutes: 480
tags:
  - video
  - youtube
  - transcription
  - whisper
  - rag
  - chatbot
models:
  - openai/whisper-base
  - Salesforce/blip-image-captioning-base
  - Qwen/Qwen2.5-72B-Instruct
short_description: Download, transcribe, and chat with YouTube videos using AI

Video Analyzer

A conversational AI assistant that analyzes YouTube videos and answers questions about their content.

Features

Core Capabilities

  • YouTube Video Download: Supports videos, playlists, and shorts via yt-dlp
  • Speech-to-Text: Automatic transcription using OpenAI Whisper (whisper-base)
  • Visual Analysis: Key frame extraction and captioning with BLIP
  • Knowledge Base: Per-session vector storage with ChromaDB for semantic search
  • RAG Chatbot: Ask questions about your videos using Qwen2.5-72B-Instruct
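
The retrieval step behind the knowledge base and the chatbot can be sketched in a few lines. This is a minimal, standard-library-only illustration, not the app's actual code: the word-count vectors stand in for Sentence Transformers embeddings, the sorted list stands in for a ChromaDB query, and the names chunk_transcript, retrieve, and build_prompt are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_transcript(text: str, chunk_words: int = 40, overlap: int = 10) -> list[str]:
    """Split a transcript into overlapping windows of words."""
    if overlap >= chunk_words:
        raise ValueError("overlap must be smaller than chunk_words")
    words = text.split()
    step = chunk_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        if start + chunk_words >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def _vector(text: str) -> Counter:
    # Assumption: lowercase word counts stand in for a real embedding model.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the question."""
    q = _vector(question)
    ranked = sorted(chunks, key=lambda c: _cosine(q, _vector(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble the text that would be sent to the language model."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this video context:\n{joined}\n\nQuestion: {question}"
```

In the real pipeline the chunks would come from the Whisper transcript and the BLIP captions, and the prompt would go to the language model through the HuggingFace Inference API.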

Voice Interaction

  • Voice Input: Speak your questions using the microphone (transcribed with Whisper)
  • Voice Output: Hear responses read aloud with natural TTS
  • Dual TTS Engines:
    • Edge-TTS (default): Fast, free Microsoft voices, no GPU needed
    • Parler-TTS (optional): Higher-quality speech from a HuggingFace model, requires a GPU

User Experience

  • Unified Chat Interface: Single chatbot handles both video analysis and Q&A
  • Auto URL Detection: Just paste a YouTube URL and the assistant analyzes it
  • Conversational Flow: The assistant guides you through the process
  • Per-Session Storage: Your analyzed videos are private to your session
  • Persistent Sessions: Your knowledge base persists across page reloads (tied to your HuggingFace profile), but not across Space restarts
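
As an illustration of auto URL detection, the sketch below pulls YouTube links out of a chat message and guesses whether each one is a video, playlist, or short. It is one possible approach, not the app's confirmed logic: the host list and the function names find_youtube_urls and classify_url are assumptions, and in practice yt-dlp makes the final call on what a URL points to.

```python
import re
from urllib.parse import parse_qs, urlparse

URL_PATTERN = re.compile(r"https?://\S+")
# Assumption: these hosts cover the common YouTube link forms.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def find_youtube_urls(message: str) -> list[str]:
    """Return every YouTube URL found in a chat message, in order."""
    urls = []
    for candidate in URL_PATTERN.findall(message):
        candidate = candidate.rstrip(".,;:!?)")  # drop trailing punctuation
        host = urlparse(candidate).hostname or ""
        if host in YOUTUBE_HOSTS:
            urls.append(candidate)
    return urls


def classify_url(url: str) -> str:
    """Roughly label a YouTube URL as 'short', 'playlist', or 'video'."""
    parsed = urlparse(url)
    if parsed.path.startswith("/shorts/"):
        return "short"
    if parsed.path == "/playlist" and "list" in parse_qs(parsed.query):
        return "playlist"
    return "video"
```

A message with no matching URL returns an empty list, which the chat handler could treat as a regular question about videos that were already analyzed.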

Technical Features

  • ZeroGPU Support: Uses HuggingFace ZeroGPU for GPU-accelerated processing
  • Model Fallback: Automatic fallback chain (Qwen2.5-72B → Llama-3.1-70B) for reliability
  • HuggingFace OAuth: Secure authentication via HuggingFace login
  • Gradio 6: Modern UI with the Soft theme
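
The model fallback chain can be pictured as a loop that tries each model in order and moves on when a call fails. The sketch below is a hedged illustration only: generate_with_fallback and the call_model parameter are hypothetical, the real app would call the HuggingFace Inference API inside call_model, and this README does not state the exact repository ID of the Llama fallback.

```python
from typing import Callable, Sequence

# The first entry matches the model listed in the Space metadata. The second
# is a label only: the exact repo ID is not given in this README.
MODEL_CHAIN = ("Qwen/Qwen2.5-72B-Instruct", "Llama-3.1-70B")


def generate_with_fallback(
    prompt: str,
    call_model: Callable[[str, str], str],
    models: Sequence[str] = MODEL_CHAIN,
) -> tuple[str, str]:
    """Try each model in order; return (model_used, reply) from the first success."""
    errors = []
    for model in models:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # any failure triggers the next model
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

Passing call_model in as a parameter keeps the retry logic testable without network access. A production version would probably catch only timeout and rate-limit errors rather than every exception.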

How to Use

  1. Sign in with your HuggingFace account using the button in the top right
  2. Paste a YouTube URL directly in the chat (e.g., https://youtube.com/watch?v=...)
  3. Wait for processing while the assistant transcribes the audio and analyzes key frames
  4. Ask questions about the video content in natural language

Example Interactions

You: https://youtube.com/watch?v=dQw4w9WgXcQ
Bot: I'll analyze that video for you. This may take a few minutes...
Bot: Done! I've analyzed "Never Gonna Give You Up" and added it to my knowledge base.

You: What is this video about?
Bot: Based on the transcript, this video is a music video for Rick Astley's 1987 hit song...

You: What visual elements were shown?
Bot: The video shows a man dancing in various locations...

Tech Stack

| Component | Technology |
| --- | --- |
| Web Framework | Gradio 6 with OAuth |
| Speech Recognition | OpenAI Whisper (whisper-base) |
| Image Captioning | Salesforce BLIP |
| Vector Database | ChromaDB (in-memory, per-session) |
| Text Embeddings | Sentence Transformers (all-MiniLM-L6-v2) |
| Language Model | HuggingFace Inference API (Qwen2.5-72B-Instruct) |
| Video Download | yt-dlp |
| GPU Acceleration | HuggingFace ZeroGPU (A10G) |

Limitations

  • Works best with videos under 10 minutes
  • Requires HuggingFace login for authentication
  • Knowledge base is session-based (stored in memory, not persistent across Space restarts)
  • Audio extraction requires FFmpeg (pre-installed on HuggingFace Spaces)
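
The session-based limitation follows from how an in-memory store behaves. The sketch below is a simplified stand-in for the per-session ChromaDB collections, with a hypothetical SessionStore class keyed by HuggingFace username (an assumption about how sessions are identified). The data lives in process memory, so it survives a page reload by the same user but is gone once the Space process restarts.

```python
from dataclasses import dataclass, field


@dataclass
class SessionStore:
    """In-memory knowledge base, isolated per user (illustrative only)."""

    _sessions: dict[str, list[str]] = field(default_factory=dict)

    def add(self, username: str, chunk: str) -> None:
        # Each user gets a separate list, so videos stay private to them.
        self._sessions.setdefault(username, []).append(chunk)

    def get(self, username: str) -> list[str]:
        # Return a copy so callers cannot mutate the stored data.
        return list(self._sessions.get(username, []))

    def clear(self, username: str) -> None:
        self._sessions.pop(username, None)


store = SessionStore()
store.add("alice", "transcript chunk from video 1")
store.add("bob", "transcript chunk from video 2")
```

Here store.get("alice") returns only Alice's chunk, and a username with no analyzed videos gets an empty list.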

Development

Prerequisites

  • Python 3.11+
  • uv package manager
  • FFmpeg

Setup

# Install dependencies
uv sync

# Install dev dependencies
uv sync --extra dev

# Run the app locally
uv run python app.py

Testing

# Run unit tests
uv run --extra dev pytest tests/test_app.py -v

# Run E2E tests (requires playwright browsers)
uv run --extra dev playwright install
uv run --extra dev pytest tests/test_e2e.py -v

License

MIT