title: Ernie 4.5 Video2Code
emoji: 🎬
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Convert video tutorials into HTML/CSS using ERNIE 4.5-VL
⚡ ERNIE 4.5-VL Video-to-Code Agent
Watch the video, write the code.
This AI Agent uses Baidu ERNIE 4.5-VL (Vision-Language Model) to analyze frontend coding tutorials frame-by-frame and reconstruct the final webpage structure, styling, and logic automatically.
✨ Key Features
- 👁️ Visual Perception: The AI "watches" the video, identifying HTML structures, CSS layouts, and interactive elements shown on screen.
- 🛡️ Sandbox Rendering: Generated code is rendered inside a sandboxed iframe, so you see the live result immediately and its styles cannot conflict with the host page.
- 🧹 Clean Output: Automatically filters out conversational text to provide pure, ready-to-run HTML/CSS/JS code.
- 📦 Single-File Download: Get a standalone .html file containing all dependencies.
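The "Clean Output" and "Sandbox Rendering" features can be sketched roughly as follows. This is a minimal illustration, not the code in app.py: the function names `extract_html` and `sandboxed_iframe` are hypothetical, and it assumes the model wraps its answer in a fenced code block and that the preview is an iframe fed through its `srcdoc` attribute.

```python
import html
import re

# Three backticks, built at runtime so this example can itself sit in a fence.
FENCE = "`" * 3

# Matches a fenced block with an optional language tag (for example "html").
_FENCED_BLOCK = re.compile(FENCE + r"(?:[a-zA-Z]+)?\s*\n(.*?)" + FENCE, re.DOTALL)


def extract_html(model_reply: str) -> str:
    """Return only the code from a model reply, dropping conversational text."""
    match = _FENCED_BLOCK.search(model_reply)
    if match:
        return match.group(1).strip()
    # No fence found: fall back to the span from <!DOCTYPE or <html to </html>.
    lower = model_reply.lower()
    start = lower.find("<!doctype")
    if start == -1:
        start = lower.find("<html")
    end = lower.rfind("</html>")
    if start != -1 and end != -1:
        return model_reply[start:end + len("</html>")].strip()
    # Nothing recognisable: return the reply unchanged apart from whitespace.
    return model_reply.strip()


def sandboxed_iframe(page_html: str, height_px: int = 600) -> str:
    """Wrap a full HTML page in an iframe so its CSS and JS stay isolated."""
    # srcdoc holds the whole page, so it must be escaped as an attribute value.
    escaped = html.escape(page_html, quote=True)
    return (
        f'<iframe sandbox="allow-scripts" srcdoc="{escaped}" '
        f'style="width:100%;height:{height_px}px;border:0;"></iframe>'
    )
```

The string returned by `sandboxed_iframe` could be shown in a Gradio HTML component. The `sandbox="allow-scripts"` value lets the generated JavaScript run while keeping it away from the host page; the exact attribute set used by this Space is an assumption.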
🚀 How to Use
- Upload: Drop an MP4 video file (frontend tutorials, CSS effects, UI demos).
  - Constraint: Max video duration is 30 minutes.
- Generate: Click "🚀 Generate & Render".
- Preview:
  - Live Preview: See the code running instantly in the browser.
  - Source Code: Inspect the generated HTML syntax.
- Download: Save the result to your local machine.
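The 30-minute limit from the upload step could be enforced with a check like the one below. It is a sketch under assumptions: `check_duration` and `MAX_DURATION_SECONDS` are hypothetical names, and the frame count and frame rate are assumed to come from the video container (for example via OpenCV's `CAP_PROP_FRAME_COUNT` and `CAP_PROP_FPS` properties).

```python
from typing import Tuple

# Limit stated in the usage notes: 30 minutes.
MAX_DURATION_SECONDS = 30 * 60


def video_duration_seconds(frame_count: int, fps: float) -> float:
    """Duration implied by the frame count and frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_count / fps


def check_duration(frame_count: int, fps: float,
                   limit: float = MAX_DURATION_SECONDS) -> Tuple[bool, str]:
    """Return (accepted, message) for an uploaded video."""
    duration = video_duration_seconds(frame_count, fps)
    if duration > limit:
        minutes = duration / 60
        return False, f"Video is {minutes:.1f} min long; the limit is {limit / 60:.0f} min."
    return True, "OK"
```

Rejecting an over-long upload before any frames are sent to the model avoids spending analysis time on input the app will not accept.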
⚙️ How It Works
- Frame Extraction: The video is processed using OpenCV to capture high-quality keyframes.
- Parallel Analysis: ERNIE 4.5-VL processes video segments in parallel to understand the coding progression and visual outcome.
- Logic Synthesis: The agent acts as a Senior Frontend Engineer, aggregating the visual insights to write functional code.
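The three stages above can be sketched as follows. This is an illustration under stated assumptions rather than the Space's actual pipeline: frames are sampled at a fixed interval (the real keyframe selection is not described), the helper names are hypothetical, and `analyze_segment` stands in for whatever function sends one segment to ERNIE 4.5-VL and returns its notes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def sample_frame_indices(total_frames: int, fps: float,
                         every_seconds: float = 2.0) -> List[int]:
    """Pick one frame index per `every_seconds` of video (assumed strategy)."""
    if total_frames <= 0 or fps <= 0:
        return []
    step = max(1, int(round(fps * every_seconds)))
    return list(range(0, total_frames, step))


def extract_frames(video_path: str, indices: Sequence[int]) -> list:
    """Read the selected frames with OpenCV (requires opencv-python)."""
    import cv2  # imported lazily so the pure helpers work without OpenCV

    capture = cv2.VideoCapture(video_path)
    frames = []
    try:
        for index in indices:
            capture.set(cv2.CAP_PROP_POS_FRAMES, index)  # seek to the frame
            ok, frame = capture.read()
            if ok:
                frames.append(frame)
    finally:
        capture.release()
    return frames


def split_into_segments(items: Sequence, segment_size: int) -> List[list]:
    """Group consecutive items into segments of at most `segment_size`."""
    return [list(items[i:i + segment_size])
            for i in range(0, len(items), segment_size)]


def analyze_in_parallel(segments: Sequence[list],
                        analyze_segment: Callable[[list], str],
                        max_workers: int = 4) -> List[str]:
    """Run the per-segment analysis concurrently, keeping video order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so the notes stay chronological.
        return list(pool.map(analyze_segment, segments))


def build_synthesis_prompt(notes: Sequence[str]) -> str:
    """Combine per-segment notes into one final prompt (hypothetical wording)."""
    numbered = "\n".join(f"Segment {i + 1}: {note}" for i, note in enumerate(notes))
    return ("You are a senior frontend engineer. Using the observations below, "
            "write one complete HTML file.\n" + numbered)
```

Threads are a reasonable fit here because each segment analysis is mostly time spent waiting on a remote model call, and `pool.map` preserves the chronological order that the synthesis step relies on.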