victor HF Staff

feat: Add GLM-4.7-Flash model, fix Gradio version, add CLAUDE.md

fe1f070 4 months ago

	# CLAUDE.md

	This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

	## Project Overview

	AI Video Composer is a Gradio app that generates FFmpeg commands from natural language. Users upload media files (images, videos, audio) and describe the result they want, and an LLM generates the FFmpeg command that creates the output video.

	Live at: https://huggingface.co/spaces/huggingface-projects/video-composer-gpt4

	## Development Commands

	```bash
	# Setup with uv (recommended)
	uv venv .venv --python 3.12
	uv pip install -r requirements.txt
	uv pip install -e ./mediagallery

	# Run locally (requires HF_TOKEN env var)
	.venv/bin/python app.py

	# MediaGallery frontend development
	cd mediagallery/frontend
	npm install
	npm run build

	# Build MediaGallery as a package
	cd mediagallery
	python -m build
	```

	## Architecture

	### Core Flow
	```
	User uploads files → MediaGallery component
	↓
	get_files_infos() extracts metadata (dimensions, duration, audio channels)
	↓
	get_completion() sends prompt + metadata to selected model via OpenAI-compatible API
	↓
	FFmpeg command extracted from response
	↓
	Command validated with dry-run (ffmpeg -f null -)
	↓
	If invalid: retry with error feedback (max 2 attempts)
	↓
	execute_ffmpeg_command() runs with @spaces.GPU acceleration
	↓
	Output video returned
	```
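
The validate-and-retry steps in the flow above can be sketched as a small loop. This is a minimal illustration, not the code in `app.py`: `generate_valid_command`, `generate`, and `validate` are hypothetical names, and the sketch assumes "max 2 attempts" means two attempts in total.

```python
from typing import Callable, Optional

MAX_ATTEMPTS = 2  # assumption: two attempts in total, per the flow above


def generate_valid_command(
    generate: Callable[[Optional[str]], str],
    validate: Callable[[str], Optional[str]],
    max_attempts: int = MAX_ATTEMPTS,
) -> str:
    """Request a command, validate it, and retry with the error on failure.

    generate(previous_error) returns a candidate FFmpeg command string;
    previous_error is None on the first attempt.
    validate(command) returns None when the command is valid, otherwise
    an error message (for example, stderr from the dry run).
    """
    error: Optional[str] = None
    for _ in range(max_attempts):
        command = generate(error)  # feed the last error back to the model
        error = validate(command)
        if error is None:
            return command
    raise RuntimeError(
        f"no valid command after {max_attempts} attempts: {error}"
    )
```

Passing the generator and validator in as callables keeps the loop testable without a model endpoint or an FFmpeg binary.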

	### Key Files

	| File | Purpose |
	|------|---------|
	| `app.py` | Main Gradio app, LLM integration, FFmpeg execution |
	| `mediagallery/` | Custom Gradio component for mixed media (images + videos + audio) |
	| `mediagallery/backend/gradio_mediagallery/mediagallery.py` | Component backend, data models, preprocess/postprocess |
	| `mediagallery/frontend/Index.svelte` | Component entry point, upload handling |
	| `mediagallery/frontend/shared/Gallery.svelte` | Grid layout, preview modal |
	### MediaGallery Component

	Custom Gradio component extending Gallery to support audio files. Structure:
	- Backend: Python component class with `GalleryImage`, `GalleryVideo`, `GalleryAudio` data models
	- Frontend: Svelte 4 + TypeScript, uses `@gradio/*` packages

	The component is installed at runtime on Hugging Face Spaces via:
	```python
	subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", "./mediagallery"])
	```

	### Model Configuration

	Models are configured in the `MODELS` dict in `app.py`. Users choose a model via a Radio selector in the UI. The default is GLM-4.7-Flash, the first entry in the dict:
	```python
	MODELS = {
	    "zai-org/GLM-4.7-Flash": {
	        "base_url": "https://router.huggingface.co/v1",
	        "env_key": "HF_TOKEN",
	        "model_name": "zai-org/GLM-4.7-Flash:novita",
	    },
	    "moonshotai/Kimi-K2-Instruct": {
	        "base_url": "https://router.huggingface.co/v1",
	        "env_key": "HF_TOKEN",
	        "model_name": "moonshotai/Kimi-K2-Instruct-0905:groq",
	    },
	}
	```
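
As a rough illustration of how an entry in this dict might be turned into client settings, the sketch below picks the default model and resolves the API key from the environment. `default_model` and `resolve_model` are hypothetical helpers, not functions confirmed to exist in `app.py`.

```python
import os
from typing import Mapping

MODELS = {
    "zai-org/GLM-4.7-Flash": {
        "base_url": "https://router.huggingface.co/v1",
        "env_key": "HF_TOKEN",
        "model_name": "zai-org/GLM-4.7-Flash:novita",
    },
    "moonshotai/Kimi-K2-Instruct": {
        "base_url": "https://router.huggingface.co/v1",
        "env_key": "HF_TOKEN",
        "model_name": "moonshotai/Kimi-K2-Instruct-0905:groq",
    },
}


def default_model() -> str:
    """Return the first key; dicts preserve insertion order."""
    return next(iter(MODELS))


def resolve_model(name: str, env: Mapping[str, str] = os.environ) -> dict:
    """Hypothetical helper: map a UI choice to endpoint settings."""
    cfg = MODELS[name]
    api_key = env.get(cfg["env_key"])
    if not api_key:
        raise RuntimeError(f"{cfg['env_key']} is not set")
    return {
        "base_url": cfg["base_url"],
        "api_key": api_key,
        "model": cfg["model_name"],
    }
```

The `base_url` and `api_key` values from the returned dict can then be passed to an OpenAI-compatible client, and `model` used as the model name in the completion request.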

	## Constraints

	- File size limit: 100MB per file
	- Video duration limit: 2 minutes
	- Output format: Always MP4
	- Gradio version: 6.2.0
	- Cached examples: Disabled (they do not work with the Markdown output component)
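
The size and duration limits above could be enforced with a check along these lines. This is a sketch under assumptions: `check_limits` and the constant names are hypothetical, and "100MB" is read as 100 × 1024 × 1024 bytes.

```python
from typing import List, Optional

MAX_FILE_SIZE_BYTES = 100 * 1024 * 1024  # assumption: MB read as MiB
MAX_VIDEO_DURATION_S = 2 * 60            # 2 minutes


def check_limits(size_bytes: int, duration_s: Optional[float] = None) -> List[str]:
    """Return a list of human-readable limit violations (empty if none).

    duration_s is None for media without a duration, such as images.
    """
    problems: List[str] = []
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file exceeds the 100MB size limit")
    if duration_s is not None and duration_s > MAX_VIDEO_DURATION_S:
        problems.append("video exceeds the 2 minute duration limit")
    return problems
```

Returning a list rather than raising on the first violation lets the UI report every problem with an upload at once.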

	## FFmpeg Command Validation

	Commands are validated before execution with a dry run that targets FFmpeg's null muxer:
	```bash
	ffmpeg [inputs and options] -f null -
	```

	If validation fails, the error is fed back to the LLM for a retry (max 2 attempts). Common issues handled:
	- Image dimension mismatches → use `scale+pad`
	- Portrait/landscape detection for standard resolutions
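
The two fixes listed above can be illustrated with a short sketch that picks a target resolution from the input orientation and builds a `scale` + `pad` filter string. The function names are hypothetical, and 1920x1080 / 1080x1920 are assumed as the "standard resolutions"; the source does not state which ones the app uses.

```python
from typing import Tuple


def target_resolution(width: int, height: int) -> Tuple[int, int]:
    """Assumed mapping: portrait inputs get 1080x1920, all others 1920x1080."""
    return (1080, 1920) if height > width else (1920, 1080)


def scale_pad_filter(target_w: int, target_h: int) -> str:
    """Build an FFmpeg filter that fits the input inside the target frame.

    scale shrinks the input to fit while preserving aspect ratio;
    pad centers it on a target-sized canvas.
    """
    return (
        f"scale={target_w}:{target_h}:force_original_aspect_ratio=decrease,"
        f"pad={target_w}:{target_h}:(ow-iw)/2:(oh-ih)/2"
    )
```

For example, `scale_pad_filter(*target_resolution(720, 1280))` yields a filter string suitable for the `-vf` option, so images with mismatched dimensions end up on identically sized frames.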