CineGen

VirtualOasis

readme

37caa62 6 months ago

	---
	title: CineGen
	emoji: 👀
	colorFrom: pink
	colorTo: purple
	sdk: gradio
	sdk_version: 6.0.1
	app_file: app.py
	pinned: false
	short_description: automate the process of short movie creation
	tags:
	- mcp-in-action-track-creative
	---
	CineGen AI Director is an AI agent that automates short movie creation. It turns a simple text or image idea into a finished video by handling scriptwriting, storyboard generation, character design, and video synthesis with a multi-model approach.

	- Sponsor Platforms: Uses Google Gemini for story and character prompts, and the Hugging Face Inference Client with fal.ai hosting for Wan 2.2 TI2V video renders.
	- Autonomous Agent Flow: The `StoryGenerator` → `CharacterDesigner` → `VideoDirector` pipeline runs sequentially inside a single Gradio Blocks app. Each stage is an MCP-friendly abstraction designed for tool-call orchestration.
	- Evaluation Notes: Covers reasoning (Gemini JSON storyboard spec), planning (scene/character tables that feed downstream steps), and execution (queued video renders with serialized HF jobs).
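The sequential flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the `Production` dataclass, the stage function bodies, and `run_pipeline` are hypothetical stand-ins for the `StoryGenerator`, `CharacterDesigner`, and `VideoDirector` abstractions, and no real model is called.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Production:
    """Shared state handed from one stage to the next (hypothetical)."""
    idea: str
    scenes: List[str] = field(default_factory=list)
    characters: List[str] = field(default_factory=list)
    clips: List[str] = field(default_factory=list)


# Each stage is a plain callable, so it could also be exposed as a tool.
Stage = Callable[[Production], Production]


def story_generator(p: Production) -> Production:
    # Placeholder: a real stage would request a JSON storyboard from Gemini.
    p.scenes = [f"Scene {i}: {p.idea}" for i in (1, 2)]
    return p


def character_designer(p: Production) -> Production:
    # Placeholder: a real stage would generate character reference images.
    p.characters = ["Hero"]
    return p


def video_director(p: Production) -> Production:
    # Placeholder: one clip per scene, produced one at a time (serialized).
    p.clips = [f"clip_{i}.mp4" for i, _ in enumerate(p.scenes, start=1)]
    return p


PIPELINE: List[Stage] = [story_generator, character_designer, video_director]


def run_pipeline(idea: str, stages: List[Stage]) -> Production:
    """Run every stage in order, passing the accumulated state along."""
    production = Production(idea=idea)
    for stage in stages:
        production = stage(production)
    return production
```

Calling `run_pipeline("a cat learns to surf", PIPELINE)` returns a `Production` whose `scenes`, `characters`, and `clips` were filled in stage by stage. Keeping each stage as a function over shared state is one way to make the steps individually callable.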

	## Artifacts for Reviewers

	- Social Media Proof: Replace `<SOCIAL_LINK_HERE>` with your live tweet/thread/LinkedIn post so judges can verify community sharing.
	- Video Recording: Upload a walkthrough of the Gradio agent (screen + narration) and swap `<DEMO_VIDEO_LINK>` with the shareable link.


	## 🚀 Key Features

	* End-to-End Automation: Converts a single sentence idea into a complete short film (approx. 30s-60s runtime).
	* Intelligent Storyboarding: Breaks down concepts into scene-by-scene visual prompts and narrative descriptions.
	* Character Consistency System:
	    * Automatically identifies main characters.
	    * Generates visual reference sheets (Character Anchors).
	    * Allows users to "tag" specific characters in specific scenes to ensure visual consistency in the video generation prompt.
	* Multi-Model Video Generation: Supports multiple state-of-the-art open-source video models via Hugging Face.
	* Robust Fallback System: If the selected video model fails (e.g., server overload), the system automatically tries alternative models until generation succeeds.
	* Interactive Editing:
	    * Edit visual prompts manually.
	    * Add, Insert, or Delete scenes during production.
	    * Regenerate specific clips or character looks.
	* Client-Side Video Merging: Combines individual generated clips into a single continuous movie file directly in the browser without requiring a backend video processing server.
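As an illustration of the character tagging idea in the feature list above, the sketch below appends the descriptions of tagged characters to a scene's visual prompt. The function name `build_scene_prompt`, the `anchors` mapping, and the output format are assumptions made for this example; the README does not specify how the app actually composes its prompts.

```python
from typing import Dict, Iterable


def build_scene_prompt(
    visual_prompt: str,
    tagged: Iterable[str],
    anchors: Dict[str, str],
) -> str:
    """Append descriptions of tagged characters to a scene's visual prompt.

    `anchors` maps a character name to a short text description
    (hypothetical; standing in for the character anchor data).
    """
    # Skip tags that have no anchor instead of failing the whole scene.
    details = [f"{name}: {anchors[name]}" for name in tagged if name in anchors]
    if not details:
        return visual_prompt
    return f"{visual_prompt} Characters: " + "; ".join(details)
```

A scene with no tagged characters keeps its prompt unchanged, so tagging stays optional per scene.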


	## 🤖 AI Models & API Usage

	The application orchestrates two primary AI services:

	### 1. Google Gemini API (`@google/genai`)
	Used for the "Brain" and "Art Department" of the application.

	* Logic & Scripting: `gemini-2.5-flash`
	    * Role: Analyzes the user's idea, generates the title, creates character profiles, and writes the JSON-structured storyboard with visual prompts.
	    * Technique: Uses Structured Output (JSON Schema) to ensure the app can parse the story data reliably.
	* Character Design: `gemini-2.5-flash-image`
	    * Role: Generates static reference images for characters based on the script's descriptions. These images act as the visual anchor that lets the user verify character appearance before video generation.
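To show why a fixed schema makes the storyboard easy to consume, here is a small sketch that parses a JSON storyboard into typed objects. The field names (`title`, `characters`, `scenes`, `visual_prompt`, `narration`) and the `STORYBOARD_SCHEMA` dictionary are assumptions for illustration; the README does not list the actual schema, and the Gemini call itself is left out.

```python
import json
from dataclasses import dataclass
from typing import List

# Hypothetical JSON Schema for the storyboard; field names are assumptions.
STORYBOARD_SCHEMA = {
    "type": "object",
    "required": ["title", "characters", "scenes"],
    "properties": {
        "title": {"type": "string"},
        "characters": {"type": "array", "items": {"type": "string"}},
        "scenes": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["visual_prompt", "narration"],
            },
        },
    },
}


@dataclass
class Scene:
    visual_prompt: str
    narration: str


@dataclass
class Storyboard:
    title: str
    characters: List[str]
    scenes: List[Scene]


def parse_storyboard(raw: str) -> Storyboard:
    """Parse model output and check the keys the schema marks as required."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("storyboard must be a JSON object")
    for key in STORYBOARD_SCHEMA["required"]:
        if key not in data:
            raise ValueError(f"storyboard is missing '{key}'")
    scene_keys = STORYBOARD_SCHEMA["properties"]["scenes"]["items"]["required"]
    scenes = []
    for index, item in enumerate(data["scenes"]):
        missing = [k for k in scene_keys if k not in item]
        if missing:
            raise ValueError(f"scene {index} is missing {missing}")
        scenes.append(Scene(item["visual_prompt"], item["narration"]))
    return Storyboard(str(data["title"]), [str(c) for c in data["characters"]], scenes)
```

With structured output enabled, the response text should already match the schema, so a parser like this mainly guards against truncated or malformed replies before later steps read the data.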

	### 2. Hugging Face Inference API (`@huggingface/inference`)
	Used for the "Production/Camera" department.

	* Video Generation Models:
	    * Wan 2.1 (Wan-AI): `Wan-AI/Wan2.1-T2V-14B` (Primary/Default)
	    * LTX Video (Lightricks): `Lightricks/LTX-Video-0.9.7-distilled`
	    * Hunyuan Video 1.5: `tencent/HunyuanVideo-1.5`
	    * CogVideoX: `THUDM/CogVideoX-5b`
	* Provider: Defaults to `fal-ai` via Hugging Face Inference for high-performance GPU access.
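The fallback behavior listed under Key Features could look roughly like the sketch below, which tries the selected model first and then the remaining models from the list above. The function `generate_with_fallback` and the injected `render` callable are hypothetical. In a real app `render` might wrap a text-to-video request through the Hugging Face Inference client, but that wiring is an assumption and is not shown here.

```python
from typing import Callable, List, Sequence, Tuple

# Model IDs taken from the list above; the order after the first is assumed.
VIDEO_MODELS: List[str] = [
    "Wan-AI/Wan2.1-T2V-14B",
    "Lightricks/LTX-Video-0.9.7-distilled",
    "tencent/HunyuanVideo-1.5",
    "THUDM/CogVideoX-5b",
]


def generate_with_fallback(
    prompt: str,
    selected: str,
    render: Callable[[str, str], bytes],
    models: Sequence[str] = VIDEO_MODELS,
) -> Tuple[str, bytes]:
    """Try `selected` first, then every other model, until one succeeds.

    `render(model_id, prompt)` is a caller-supplied function that returns
    video bytes or raises on failure (hypothetical interface).
    """
    order = [selected] + [m for m in models if m != selected]
    errors = []
    for model in order:
        try:
            return model, render(model, prompt)
        except Exception as exc:  # e.g. server overload or timeout
            errors.append(f"{model}: {exc}")
    raise RuntimeError("All video models failed: " + "; ".join(errors))
```

Returning the model ID alongside the bytes lets the UI report which model actually produced the clip when a fallback was used.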