	---
	title: CineeeeAi
	emoji: 🚀
	colorFrom: red
	colorTo: red
	sdk: docker
	app_port: 8501
	tags:
	- streamlit
	pinned: false
	short_description: Streamlit template space
	---

	# CineGen AI Ultra+ 🎬✨

	Visionary Cinematic Pre-Production Powered by AI

	CineGen AI Ultra+ is a Streamlit web application that accelerates creative pre-production for films, animations, and other visual storytelling projects. It uses a Large Language Model (Google's Gemini) together with other AI services to turn a simple story idea into a cinematic treatment, complete with scene breakdowns, visual style suggestions, AI-generated concept art or video clips, and a narrated animatic.

	## Features

	* AI Creative Director: Input a core story idea, genre, and mood.
	* Cinematic Treatment Generation:
	  * Gemini generates a detailed multi-scene treatment.
	  * Each scene includes:
	    * Title, Emotional Beat, Setting Description
	    * Characters Involved, Character Focus Moment
	    * Key Plot Beat, Suggested Dialogue Hook
	    * Proactive Director's Suggestions (감독 - Gamdok/Director): Visual Style, Camera Work, Sound Design.
	    * Asset Generation Aids: Suggested Asset Type (Image/Video Clip), Video Motion Description & Duration, Image/Video Keywords, Pexels Search Queries.
	* Visual Asset Generation:
	  * Image Generation: Utilizes DALL-E 3 (via OpenAI API) to generate concept art for scenes based on derived prompts.
	  * Stock Footage Fallback: Uses Pexels API for relevant stock images/videos if AI generation is disabled or fails.
	  * Video Clip Generation (Placeholder): Integrated structure for text-to-video/image-to-video generation using RunwayML API (requires user to implement actual API calls in `core/visual_engine.py`). Placeholder generates dummy video clips.
	* Character Definition: Define key characters with visual descriptions for more consistent AI-generated visuals.
	* Global Style Overrides: Apply global stylistic keywords (e.g., "Hyper-Realistic Gritty Noir," "Vintage Analog Sci-Fi") to influence visual generation.
	* AI-Powered Narration:
	  * Gemini crafts a narration script based on the generated treatment.
	  * ElevenLabs API synthesizes the narration into natural-sounding audio.
	  * Customizable voice ID and narration style.
	* Iterative Refinement:
	  * Edit scene treatments and regenerate them with AI assistance.
	  * Refine DALL-E prompts based on feedback and regenerate visuals.
	* Cinematic Animatic Assembly:
	  * Combines generated visual assets (images/video clips) and the synthesized narration into a downloadable `.mp4` animatic.
	  * Customizable per-scene duration for pacing control.
	  * Ken Burns effect on still images and text overlays for scene context.
	* Secrets Management: Securely loads API keys from Streamlit secrets or environment variables.
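
The Ken Burns effect listed under animatic assembly is essentially a slow zoom over a still image. The sketch below is one way to compute it, not the code in `core/visual_engine.py`: the function name `ken_burns_boxes` and the default `zoom_end` value are illustrative assumptions. It returns one centered crop box per frame, shrinking from the full image to a zoomed-in region.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def ken_burns_boxes(width: int, height: int, n_frames: int,
                    zoom_end: float = 1.2) -> List[Box]:
    """Return one centered crop box per frame, zooming from 1.0 to zoom_end.

    Hypothetical helper for illustration only.
    """
    if n_frames < 1 or zoom_end < 1.0:
        raise ValueError("need n_frames >= 1 and zoom_end >= 1.0")
    boxes = []
    for i in range(n_frames):
        # Linear progress through the clip, 0.0 on the first frame, 1.0 on the last.
        t = i / (n_frames - 1) if n_frames > 1 else 0.0
        zoom = 1.0 + (zoom_end - 1.0) * t
        crop_w, crop_h = width / zoom, height / zoom
        left = (width - crop_w) / 2
        top = (height - crop_h) / 2
        boxes.append((round(left), round(top),
                      round(left + crop_w), round(top + crop_h)))
    return boxes
```

Each box could then be applied with Pillow's `Image.crop(box)` followed by `resize((width, height))` to produce the frame; adding a pan would mean shifting `left` and `top` over time instead of keeping the box centered.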

	## Project Structure

	CineGenAI/
	├── .streamlit/
	│   └── secrets.toml          # API keys and configuration (DO NOT COMMIT if public)
	├── assets/
	│   └── fonts/
	│       └── arial.ttf         # Example font file (ensure it's available or update path)
	├── core/
	│   ├── __init__.py
	│   ├── gemini_handler.py     # Manages interactions with the Gemini API
	│   ├── visual_engine.py      # Handles image/video generation (DALL-E, Pexels, RunwayML placeholder) and video assembly
	│   └── prompt_engineering.py # Contains functions to craft detailed prompts for Gemini
	├── temp_cinegen_media/       # Temporary directory for generated media (add to .gitignore)
	├── app.py                    # Main Streamlit application script
	├── Dockerfile                # For containerizing the application
	├── Dockerfile.test           # (Optional) For testing
	├── requirements.txt          # Python dependencies
	├── README.md                 # This file
	└── .gitattributes            # For Git LFS if handling large font files

	## Setup and Installation

	1. Clone the Repository:
	```bash
	git clone <repository_url>
	cd CineGenAI
	```

	2. Create a Virtual Environment (Recommended):
	```bash
	python -m venv venv
	source venv/bin/activate # On Windows: venv\Scripts\activate
	```

	3. Install Dependencies:
	```bash
	pip install -r requirements.txt
	```
	Note: `MoviePy` might require `ffmpeg` to be installed on your system. On Debian/Ubuntu: `sudo apt-get install ffmpeg`. On macOS with Homebrew: `brew install ffmpeg`.

	4. Set Up API Keys:
	You need API keys for the following services:
	* Google Gemini API
	* OpenAI API (for DALL-E)
	* ElevenLabs API (and optionally a specific Voice ID)
	* Pexels API
	* RunwayML API (if implementing full video generation)

	Store these keys securely. You have two primary options:

	* Streamlit Secrets (Recommended for Hugging Face Spaces / Streamlit Cloud):
	Create a file `.streamlit/secrets.toml` (make sure this file is in your `.gitignore` if your repository is public!) with the following format:
	```toml
	GEMINI_API_KEY = "your_gemini_api_key"
	OPENAI_API_KEY = "your_openai_api_key"
	ELEVENLABS_API_KEY = "your_elevenlabs_api_key"
	PEXELS_API_KEY = "your_pexels_api_key"
	ELEVENLABS_VOICE_ID = "your_elevenlabs_voice_id" # e.g., "Rachel" or a custom ID
	RUNWAY_API_KEY = "your_runwayml_api_key"
	```

	* Environment Variables (for local development):
	Set the environment variables directly in your terminal, or put them in a `.env` file and load it with a library such as `python-dotenv` (not included by default). The application looks for these variables when Streamlit secrets are not found.
	```bash
	export GEMINI_API_KEY="your_gemini_api_key"
	export OPENAI_API_KEY="your_openai_api_key"
	# ... and so on for other keys
	```
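
The lookup order described above (Streamlit secrets first, environment variables second) can be sketched as a small helper. This is an illustration under assumptions, not the application's actual loader: `load_api_key` is a hypothetical name, and the `secrets` argument stands in for any mapping-like object such as `st.secrets`.

```python
import os
from typing import Mapping, Optional


def load_api_key(name: str,
                 secrets: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the key from `secrets` if present, else from the environment.

    Hypothetical helper; returns None when the key is set nowhere.
    """
    if secrets is not None:
        try:
            value = secrets.get(name)
        except Exception:
            # A secrets backend may raise if no secrets file exists at all.
            value = None
        if value:
            return value
    # Treat an empty environment variable the same as a missing one.
    return os.environ.get(name) or None
```

Returning `None` rather than raising lets the caller decide what a missing key means, for example disabling DALL-E generation and relying on the Pexels fallback.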

	5. Font:
	Ensure the font file specified in `core/visual_engine.py` (e.g., `arial.ttf`) is accessible. The script tries common system paths, but you can place it in `assets/fonts/` and adjust the path in `visual_engine.py` if needed. If using Docker, ensure the font is copied into the image (see `Dockerfile`).
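
Trying several font locations in order can be sketched as follows. The helper name `find_font` and the system paths in `FONT_CANDIDATES` are assumptions chosen for illustration; the paths actually checked in `core/visual_engine.py` may differ.

```python
import os
from typing import Iterable, Optional


def find_font(candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate path that is an existing file, else None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


# Assumed search order: the bundled font first, then typical system locations.
FONT_CANDIDATES = [
    "assets/fonts/arial.ttf",
    "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf",  # assumed Linux path
    "/Library/Fonts/Arial.ttf",                         # assumed macOS path
    "C:/Windows/Fonts/arial.ttf",                       # assumed Windows path
]

font_path = find_font(FONT_CANDIDATES)
```

A resolved path can be passed to Pillow's `ImageFont.truetype(font_path, size)`; when `font_path` is `None`, `ImageFont.load_default()` is a reasonable last resort.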

	6. RunwayML Implementation (Important):
	The current RunwayML integration in `core/visual_engine.py` (method `_generate_video_clip_with_runwayml`) is a placeholder that creates a dummy video clip using MoviePy. To enable real video generation, you will need to:
	* Install the official RunwayML SDK, if available.
	* Implement the actual API calls to RunwayML for text-to-video or image-to-video generation.

	## Running the Application

	1. Local Development:
	```bash
	streamlit run app.py
	```
	The application should open in your web browser.

	2. Using Docker (Optional):
	* Build the Docker image:
	```bash
	docker build -t cinegen-ai .
	```
	* Run the Docker container, passing API keys as environment variables (or via mounted secrets) rather than baking them into the image:
	```bash
	# Add one -e flag per remaining key (ELEVENLABS_API_KEY, PEXELS_API_KEY, ...).
	# Do not put comments between continued lines, or the image name is dropped.
	docker run -p 8501:8501 \
	  -e GEMINI_API_KEY="your_key" \
	  -e OPENAI_API_KEY="your_key" \
	  cinegen-ai
	```
	Access the app at `http://localhost:8501`.

	## How to Use

	1. Input Creative Seed: Provide your core story idea, select a genre, mood, number of scenes, and AI director style in the sidebar.
	2. Generate Treatment: Click "🌌 Generate Cinematic Treatment". The AI will produce a multi-scene breakdown.
	3. Review & Refine:
	   * Examine each scene's details, including AI-generated visuals (or placeholders).
	   * Use the "✏️ Edit Scene X Treatment" and "🎨 Edit Scene X Visual Prompt" popovers to provide feedback and regenerate specific parts of the treatment or visuals.
	   * Adjust per-scene "Dominant Shot Type" and "Scene Duration" for the animatic.
	4. Fine-Tuning (Sidebar):
	   * Define characters with visual descriptions.
	   * Apply global style overrides.
	   * Set narrator voice ID and narration style.
	5. Assemble Animatic: Once you're satisfied with the treatment, visuals, and narration script (generated automatically), click "🎬 Assemble Narrated Cinematic Animatic".
	6. View & Download: The generated animatic video will appear, and you can download it.
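
The per-scene "Scene Duration" values set during review determine where each scene starts in the assembled animatic and how long the whole video runs. A minimal sketch of that arithmetic, assuming durations are given in seconds (the function name `scene_timeline` is hypothetical):

```python
from typing import List, Tuple


def scene_timeline(durations: List[float]) -> Tuple[List[float], float]:
    """Return (start offset of each scene, total length), all in seconds."""
    starts: List[float] = []
    elapsed = 0.0
    for duration in durations:
        if duration <= 0:
            raise ValueError("scene durations must be positive")
        starts.append(elapsed)   # this scene begins where the previous one ended
        elapsed += duration
    return starts, elapsed


starts, total = scene_timeline([5, 3, 4])
print(starts, total)  # [0.0, 5.0, 8.0] 12.0
```

The total is also a useful check against the narration audio: if the synthesized narration runs longer than the summed scene durations, either the scenes need lengthening or the script needs trimming.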

	## Key Technologies

	* Python
	* Streamlit: Web application framework.
	* Google Gemini API: For core text generation (treatment, narration script, prompt refinement).
	* OpenAI API (DALL-E 3): For AI image generation.
	* ElevenLabs API: For text-to-speech narration.
	* Pexels API: For stock image/video fallbacks.
	* RunwayML API (Placeholder): For AI video clip generation.
	* MoviePy: For video processing and animatic assembly.
	* Pillow (PIL): For image manipulation.
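
The way these services combine for visuals (AI generation first, Pexels stock as the fallback when generation is disabled or fails) follows a simple try-in-order pattern. The sketch below shows that pattern with plain callables; `first_available_asset` and the stub functions in the usage note are hypothetical and do not call any real API.

```python
from typing import Callable, Optional, Sequence, Tuple

# A source takes a prompt and returns a path to the generated asset, or None.
AssetSource = Callable[[str], Optional[str]]


def first_available_asset(
    prompt: str,
    sources: Sequence[Tuple[str, AssetSource]],
) -> Tuple[Optional[str], Optional[str]]:
    """Try each (name, source) in order; return the first (name, path) that works."""
    for name, generate in sources:
        try:
            path = generate(prompt)
        except Exception as exc:
            # A failing source should not abort the scene; move on to the next one.
            print(f"{name} failed: {exc}")
            continue
        if path:
            return name, path
    return None, None
```

With stub sources, `first_available_asset("rainy alley", [("dalle", failing_stub), ("pexels", stock_stub)])` returns the Pexels result whenever the first stub raises or returns `None`. Disabling a source then amounts to leaving it out of the list.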

	## Future Enhancements / To-Do

	* Implement full, robust RunwayML API integration.
	* Option to upload custom seed images for image-to-video generation.
	* More sophisticated control over Ken Burns effect (pan direction, zoom intensity).
	* Allow users to upload their own audio for narration or background music.
	* Advanced shot list generation and export.
	* Integration with other AI video/image models.
	* User accounts and project saving.
	* More granular error handling and user feedback in the UI.
	* Refine JSON cleaning from Gemini to be even more robust.
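
For the JSON-cleaning item, a common approach is to strip any Markdown code fence the model wraps around its answer and, failing that, to parse only the outermost bracketed span. The sketch below illustrates that idea; it is not the project's current cleaner, and `parse_model_json` is a hypothetical name.

```python
import json
import re
from typing import Any


def parse_model_json(text: str) -> Any:
    """Parse JSON from model output that may include a code fence or chatter."""
    cleaned = text.strip()
    # Remove an opening fence (three backticks, optional "json" tag) and a closing fence.
    cleaned = re.sub(r"^`{3}(?:json)?\s*", "", cleaned, flags=re.IGNORECASE)
    cleaned = re.sub(r"\s*`{3}$", "", cleaned)
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        # Fall back to the span between the first opening and last closing bracket.
        opens = [i for i in (cleaned.find("["), cleaned.find("{")) if i != -1]
        close = max(cleaned.rfind("]"), cleaned.rfind("}"))
        if not opens or close == -1:
            raise
        return json.loads(cleaned[min(opens):close + 1])
```

This still raises `json.JSONDecodeError` when no valid JSON can be recovered, so the caller can report the failure in the UI or retry the request instead of continuing with a partial treatment.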

	## Contributing

	Contributions, bug reports, and feature requests are welcome! Please open an issue or submit a pull request.

	## License

	This project is currently under [Specify License Here - e.g., MIT, Apache 2.0, or state as private/proprietary].

	---

	This README provides a comprehensive overview. Ensure all paths, commands, and API key instructions match your specific project setup.