---
title: Ernie 4.5 Video2Code
emoji: 🎬
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Convert video tutorials into HTML/CSS using ERNIE 4.5-VL
---
# ⚡ ERNIE 4.5-VL Video-to-Code Agent
**Watch the video, write the code.**
This AI Agent uses **Baidu ERNIE 4.5-VL (Vision-Language Model)** to analyze frontend coding tutorials frame-by-frame and reconstruct the final webpage structure, styling, and logic automatically.
## ✨ Key Features
* **👁️ Visual Perception**: The AI "watches" the video, identifying HTML structures, CSS layouts, and interactive elements shown on screen.
* **🛡️ Sandbox Rendering**: Generated code is rendered inside a secure **Iframe**, allowing you to see the live result immediately without style conflicts.
* **🧹 Clean Output**: Automatically filters out conversational text to provide pure, ready-to-run HTML/CSS/JS code.
* **📦 Single-File Download**: Get a standalone `.html` file containing all dependencies.
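The "Clean Output" and "Sandbox Rendering" features could be approximated as follows. This is a minimal sketch, not the code in `app.py`: the function names `extract_html` and `to_sandboxed_iframe` are hypothetical, and it assumes the model wraps its answer in a Markdown code fence and that the preview is embedded through the iframe `srcdoc` attribute.

```python
import html
import re

# Three backticks, i.e. a Markdown code fence (built at runtime for readability).
TICKS = "`" * 3
# Matches the first fenced block, optionally tagged "html".
_FENCE = re.compile(TICKS + r"(?:html)?\s*\n(.*?)" + TICKS, re.DOTALL | re.IGNORECASE)


def extract_html(reply: str) -> str:
    """Return only the HTML from a model reply, dropping surrounding chatter."""
    match = _FENCE.search(reply)
    if match:
        return match.group(1).strip()
    # No fence found: fall back to the span from the first "<" to the last ">".
    start, end = reply.find("<"), reply.rfind(">")
    return reply[start:end + 1].strip() if 0 <= start < end else reply.strip()


def to_sandboxed_iframe(page: str, height: int = 600) -> str:
    """Embed a full HTML page in an iframe so its CSS cannot affect the host page."""
    # Escaping keeps quotes and angle brackets in the page from ending the attribute.
    escaped = html.escape(page, quote=True)
    return (
        f'<iframe sandbox="allow-scripts" srcdoc="{escaped}" '
        f'style="width:100%;height:{height}px;border:0"></iframe>'
    )
```

The resulting string could be passed to an HTML output component for the live preview, while the unescaped page is what gets saved as the downloadable `.html` file.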
## 🚀 How to Use
1. **Upload**: Drop an MP4 video file (frontend tutorials, CSS effects, UI demos).
   * *Constraint: Max video duration is **30 minutes**.*
2. **Generate**: Click **"🚀 Generate & Render"**.
3. **Preview**:
   * **Live Preview**: See the code running instantly in the browser.
   * **Source Code**: Inspect the generated HTML syntax.
4. **Download**: Save the result to your local machine.
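To check the 30-minute limit before uploading, you can compute the duration from the frame count and frame rate. The sketch below is illustrative only and the helper names are hypothetical; with OpenCV, the two inputs could be read via `cv2.VideoCapture(path).get(cv2.CAP_PROP_FRAME_COUNT)` and `cv2.CAP_PROP_FPS`.

```python
MAX_DURATION_S = 30 * 60  # the 30-minute limit stated above


def video_duration_s(frame_count: int, fps: float) -> float:
    """Duration in seconds, derived from frame count and frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_count / fps


def check_duration(frame_count: int, fps: float, limit_s: float = MAX_DURATION_S) -> None:
    """Raise ValueError when the video is longer than the limit."""
    duration = video_duration_s(frame_count, fps)
    if duration > limit_s:
        raise ValueError(
            f"video is {duration / 60:.1f} min; the limit is {limit_s / 60:.0f} min"
        )
```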
## βš™οΈ How It Works
1. **Frame Extraction**: The video is processed using OpenCV to capture high-quality keyframes.
2. **Parallel Analysis**: ERNIE 4.5-VL processes video segments in parallel to understand the coding progression and visual outcome.
3. **Logic Synthesis**: The agent acts as a Senior Frontend Engineer, aggregating the visual insights to write functional code.
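The three steps above can be sketched as a small pipeline. This is an assumption-laden outline rather than the actual implementation: the sampling interval, segment size, worker count, and all function names are hypothetical, and `analyze` stands in for whatever call sends a segment's frames to ERNIE 4.5-VL.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def sample_frame_indices(total_frames: int, fps: float, every_s: float = 2.0) -> List[int]:
    """Step 1: pick one frame index every `every_s` seconds of video."""
    step = max(1, round(fps * every_s))
    return list(range(0, total_frames, step))


def split_segments(items: Sequence[int], size: int) -> List[List[int]]:
    """Group the sampled frames into consecutive segments of at most `size` frames."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def analyze_in_parallel(
    segments: List[List[int]],
    analyze: Callable[[List[int]], str],  # placeholder for the vision-model call
    workers: int = 4,
) -> List[str]:
    """Step 2: analyze segments concurrently; map() returns results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(analyze, segments))


def build_synthesis_prompt(notes: List[str]) -> str:
    """Step 3: merge per-segment notes into a single code-writing prompt."""
    numbered = "\n".join(f"{i + 1}. {note}" for i, note in enumerate(notes))
    return (
        "You are a Senior Frontend Engineer. Using these observations, "
        "write one self-contained HTML file:\n" + numbered
    )
```

Threads suit this step because each segment's analysis is expected to be a network-bound request, and keeping the results in segment order preserves the tutorial's coding progression for the final synthesis prompt.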