---
title: Video Analyzer
emoji: "🎬"
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "6.2.0"
python_version: "3.11"
app_file: app.py
pinned: false
license: mit
suggested_hardware: zero-a10g
suggested_storage: small
hf_oauth: true
hf_oauth_scopes:
  - inference-api
hf_oauth_expiration_minutes: 480
tags:
  - video
  - youtube
  - transcription
  - whisper
  - rag
  - chatbot
models:
  - openai/whisper-base
  - Salesforce/blip-image-captioning-base
  - Qwen/Qwen2.5-72B-Instruct
short_description: Download, transcribe, and chat with YouTube videos using AI
---

# Video Analyzer

A conversational AI assistant that analyzes YouTube videos and answers questions about their content.

## Features

### Core Capabilities
- **YouTube Video Download**: Supports videos, playlists, and shorts via yt-dlp
- **Speech-to-Text**: Automatic transcription using OpenAI Whisper (whisper-base)
- **Visual Analysis**: Key frame extraction and captioning with BLIP
- **Knowledge Base**: Per-session vector storage with ChromaDB for semantic search
- **RAG Chatbot**: Ask questions about your videos using Qwen2.5-72B-Instruct
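
Before a transcript can go into a vector store for semantic search, it is usually split into overlapping chunks so that each embedding covers a focused passage. The sketch below shows one common way to do that. The function name `chunk_transcript` and the default sizes are illustrative assumptions, not taken from `app.py`.

```python
def chunk_transcript(text: str, max_words: int = 200, overlap: int = 40) -> list[str]:
    """Split a transcript into overlapping word windows.

    Hypothetical helper: the window sizes are assumptions, not the app's values.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the transcript
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from either side, at the cost of storing some words twice.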

### Voice Interaction
- **Voice Input**: Speak your questions using the microphone (transcribed with Whisper)
- **Voice Output**: Hear responses read aloud with natural TTS
- **Dual TTS Engines**:
  - **Edge-TTS** (default): Fast, free Microsoft voices, no GPU needed
  - **Parler-TTS** (optional): Higher-quality speech from a HuggingFace-hosted model; requires a GPU

### User Experience
- **Unified Chat Interface**: Single chatbot handles both video analysis and Q&A
- **Auto URL Detection**: Just paste a YouTube URL and the assistant analyzes it
- **Conversational Flow**: The assistant guides you through the process
- **Per-Session Storage**: Your analyzed videos are private to your session
- **Persistent Sessions**: Your knowledge base persists across page reloads (tied to your HuggingFace profile), but not across Space restarts
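
Auto URL detection can be approximated with a regular expression that looks for a YouTube link anywhere in a chat message. This is a minimal sketch, not the app's actual logic: the function name `find_youtube_url` and the exact pattern are assumptions, and the pattern only covers the common `watch`, `shorts`, `playlist`, and `youtu.be` forms.

```python
import re
from typing import Optional

# Assumed pattern: common YouTube URL shapes only.
_YOUTUBE_URL = re.compile(
    r"https?://(?:www\.|m\.)?"
    r"(?:youtube\.com/(?:watch\?|shorts/|playlist\?)|youtu\.be/)"
    r"[^\s<>]+"
)


def find_youtube_url(message: str) -> Optional[str]:
    """Return the first YouTube URL in a chat message, or None if there is none."""
    match = _YOUTUBE_URL.search(message)
    if match is None:
        return None
    # Drop punctuation that belongs to the sentence, not to the URL.
    return match.group(0).rstrip(".,;:!?)")
```

A chat handler could call this on every incoming message: a non-`None` result routes the message to video analysis, anything else to question answering.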

### Technical Features
- **ZeroGPU Support**: Runs GPU-accelerated processing on HuggingFace ZeroGPU
- **Model Fallback**: If a request to Qwen2.5-72B fails, the assistant falls back automatically to Llama-3.1-70B
- **HuggingFace OAuth**: Secure authentication via HuggingFace login
- **Gradio 6**: Modern UI with the Soft theme
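
A fallback chain like the one listed above can be sketched as a loop that tries each model in order and returns the first successful answer. Everything here is illustrative: `generate_with_fallback` and the `call_model` callback are hypothetical names, and the full repository ID for the Llama model is an assumption based on the short name given in this README.

```python
from typing import Callable, Sequence, Tuple

# Primary model first. The Llama repo ID is an assumption, not confirmed by the app.
FALLBACK_MODELS = (
    "Qwen/Qwen2.5-72B-Instruct",
    "meta-llama/Llama-3.1-70B-Instruct",
)


def generate_with_fallback(
    prompt: str,
    call_model: Callable[[str, str], str],
    models: Sequence[str] = FALLBACK_MODELS,
) -> Tuple[str, str]:
    """Try each model in order; return (model_id, reply) from the first that works.

    `call_model(model_id, prompt)` is a caller-supplied function that performs
    the actual inference request and raises an exception on failure.
    """
    errors = []
    for model in models:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # any failure moves on to the next model
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

Passing the inference call in as a function keeps the retry logic independent of any particular client library and makes it easy to test with a stub.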

## How to Use

1. **Sign in** with your HuggingFace account using the button in the top right
2. **Paste** a YouTube URL directly in the chat (e.g., `https://youtube.com/watch?v=...`)
3. **Wait** for processing while the assistant transcribes the audio and analyzes key frames
4. **Ask questions** about the video content in natural language

### Example Interactions

```
You: https://youtube.com/watch?v=dQw4w9WgXcQ
Bot: I'll analyze that video for you. This may take a few minutes...
Bot: Done! I've analyzed "Never Gonna Give You Up" and added it to my knowledge base.

You: What is this video about?
Bot: Based on the transcript, this video is a music video for Rick Astley's 1987 hit song...

You: What visual elements were shown?
Bot: The video shows a man dancing in various locations...
```

## Tech Stack

| Component | Technology |
|-----------|------------|
| Web Framework | Gradio 6 with OAuth |
| Speech Recognition | OpenAI Whisper (whisper-base) |
| Image Captioning | Salesforce BLIP |
| Vector Database | ChromaDB (in-memory, per-session) |
| Text Embeddings | Sentence Transformers (all-MiniLM-L6-v2) |
| Language Model | HuggingFace Inference API (Qwen2.5-72B-Instruct) |
| Video Download | yt-dlp |
| GPU Acceleration | HuggingFace ZeroGPU (A10G) |

## Limitations

- Works best with videos under 10 minutes
- Requires HuggingFace login for authentication
- Knowledge base is session-based (stored in memory, not persistent across Space restarts)
- Audio extraction requires FFmpeg (pre-installed on HuggingFace Spaces)

## Development

### Prerequisites
- Python 3.11+
- uv package manager
- FFmpeg
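
To confirm these prerequisites before running the app locally, a short script along the following lines may help. It is a sketch using only the standard library. The helper names `missing_tools` and `python_ok` are hypothetical, and the script assumes the executables are called `uv` and `ffmpeg` on your `PATH`.

```python
import shutil
import sys


def missing_tools(tools):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_ok(minimum=(3, 11)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print("Python 3.11+:", "ok" if python_ok() else "too old")
    print("Missing tools:", missing_tools(["uv", "ffmpeg"]) or "none")
```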

### Setup
```bash
# Install dependencies
uv sync

# Install dev dependencies
uv sync --extra dev

# Run the app locally
uv run python app.py
```

### Testing
```bash
# Run unit tests
uv run --extra dev pytest tests/test_app.py -v

# Run E2E tests (requires Playwright browsers)
uv run --extra dev playwright install
uv run --extra dev pytest tests/test_e2e.py -v
```

## License

MIT

<!-- Build: 2025-12-28 -->