---
license: apache-2.0
library_name: diffusers
tags:
- text-to-video
- image-to-video
- video-generation
- diffusers
pipeline_tag: text-to-video
inference: true
base_model: deathlegionteam/LEGION-Video-Gen
widget:
- text: "A serene mountain lake at sunset with colorful clouds reflecting on the water"
---

# ⚔️ LEGION VIDEO GENERATION: The Ultimate AI Video Engine

<p align="center">
  <strong>State-of-the-art video generation with 8.3B parameters</strong><br>
  Text-to-Video · Image-to-Video · QWatermark System
</p>

<p align="center">
  <img src="https://img.shields.io/badge/Params-8.3B-blue" alt="Parameters">
  <img src="https://img.shields.io/badge/License-Apache%202.0-green" alt="License">
  <img src="https://img.shields.io/badge/GPU-Recommended-red" alt="GPU">
  <a href="https://huggingface.co/deathlegionteam/LEGION-Video-Gen"><img src="https://img.shields.io/badge/🤗%20HuggingFace-LEGION--Video--Gen-blue" alt="HuggingFace"></a>
</p>

## 📋 Table of Contents

- [✨ Features](#-features)
- [🚀 Quick Start](#-quick-start)
- [📖 API Documentation](#-api-documentation)
- [💧 QWatermark System](#-qwatermark-system)
- [🤗 HuggingFace](#-huggingface)
- [🖥️ Project Structure](#️-project-structure)
- [🎬 Example Prompts](#-example-prompts)
- [📊 Performance](#-performance)
- [📝 Notes](#-notes)
- [📄 License](#-license)

## ✨ Features

- **🎬 Text-to-Video Generation**: Create videos from any text prompt with cinematic quality
- **🖼️ Image-to-Video Generation**: Animate still images with controlled motion
- **💧 QWatermark System**: Configurable semi-transparent quality assurance watermark with position, size, opacity, and text controls
- **🌐 Web Application**: Full Gradio UI with dark theme and FastAPI backend
- **📡 REST API**: Programmatic video generation via HTTP endpoints
- **🛡️ Graceful Fallback**: Mock generation mode when no GPU is available

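The graceful fallback can be pictured as a small device check at startup. The sketch below is only an illustration of that idea: `select_backend` is a hypothetical helper, not a function from this repository, and the actual detection logic in `inference.py` may differ.

```python
def select_backend():
    """Return "cuda" when a CUDA GPU is usable, otherwise "mock".

    Hypothetical helper: the name and the return values are assumptions
    made for this sketch, not part of the LEGION codebase.
    """
    try:
        import torch  # Imported lazily so the check also works without PyTorch
    except ImportError:
        return "mock"
    # torch.cuda.is_available() is False on CPU-only builds and machines
    return "cuda" if torch.cuda.is_available() else "mock"


if __name__ == "__main__":
    print(f"Selected backend: {select_backend()}")
```

Importing `torch` inside the function keeps the check usable on machines where PyTorch is not installed, which is the situation where mock mode matters most.
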
## 🚀 Quick Start

### Prerequisites

- **GPU (Recommended):** NVIDIA GPU with 16GB+ VRAM (RTX 4090, A100, H100)
- **CPU (Fallback):** Runs in mock generation mode only, which produces test-pattern videos
- **Python 3.10+**
- **~30GB free disk space** (model weights)

### Installation

```bash
# Clone the repository
git clone https://huggingface.co/deathlegionteam/LEGION-Video-Gen
cd LEGION-Video-Gen

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install --upgrade pip
pip install -r requirements.txt

# Verify installation
python3 -c "import torch, diffusers, gradio, fastapi; print('OK')"
```

### Quick Start: Generate Your First Video

```python
from inference import LegionVideoGenerator

generator = LegionVideoGenerator()

video_path = generator.generate_from_text(
    prompt="A serene mountain lake at sunset with colorful clouds reflecting on the water, gentle ripples, cinematic quality",
    num_frames=49,
    width=480,
    height=480,
    num_inference_steps=50,
    guidance_scale=6.0,
    watermark_strength=0.3,
)
print(f"Video saved to: {video_path}")
```

### Starting the Web UI

```bash
# Start the API backend in the background
python3 backend/main.py &

# Start the Gradio frontend
python3 frontend/app.py

# Open http://localhost:8080 in your browser
```

## 📖 API Documentation

### REST API Endpoints

The backend runs on port **8081** by default.

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/api/status` | Health check with model and device info |
| `POST` | `/api/generate/text` | Generate video from text prompt |
| `POST` | `/api/generate/image` | Generate video from image + text prompt |
| `GET` | `/` | API root with endpoint listing |

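Before submitting a generation request, it can help to confirm that the backend is reachable. The following is a minimal sketch using only the standard library. `status_url` and `check_status` are hypothetical helper names, and the assumption that `/api/status` returns JSON is not confirmed by this README, so the code falls back to returning the raw body.

```python
import json
import urllib.error
import urllib.request


def status_url(base_url="http://localhost:8081"):
    """Build the health-check URL from the backend base URL."""
    return base_url.rstrip("/") + "/api/status"


def check_status(base_url="http://localhost:8081", timeout=5.0):
    """Return the status payload, or None if the backend cannot be reached."""
    try:
        with urllib.request.urlopen(status_url(base_url), timeout=timeout) as resp:
            body = resp.read().decode("utf-8")
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status
        return None
    try:
        return json.loads(body)  # Assumed JSON; the real format may differ
    except json.JSONDecodeError:
        return {"raw": body}


if __name__ == "__main__":
    status = check_status()
    print("Backend unreachable" if status is None else status)
```
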
### Text-to-Video Generation

```python
import requests

# Submit the prompt and generation settings as a JSON body
response = requests.post(
    "http://localhost:8081/api/generate/text",
    json={
        "prompt": "A cyberpunk city street at night with neon lights reflecting on wet pavement",
        "negative_prompt": "warped, distorted, flickering, jittery, low quality, blurry, artifacts",
        "num_frames": 49,
        "width": 480,
        "height": 480,
        "num_inference_steps": 50,
        "guidance_scale": 6.0,
        "watermark_strength": 0.3,
    },
)
response.raise_for_status()  # Stop here rather than saving an error body as a video

# Save the returned video bytes
with open("output.mp4", "wb") as f:
    f.write(response.content)
```

### Image-to-Video Generation

```python
import requests

# Upload the source image as multipart form data alongside the settings
with open("input_image.jpg", "rb") as img:
    response = requests.post(
        "http://localhost:8081/api/generate/image",
        files={"file": img},
        data={
            "prompt": "Gentle motion, cinematic camera movement, atmospheric",
            "num_frames": 49,
            "width": 480,
            "height": 480,
            "num_inference_steps": 50,
            "guidance_scale": 6.0,
            "watermark_strength": 0.3,
        },
    )
response.raise_for_status()  # Stop here rather than saving an error body as a video

# Save the returned video bytes
with open("animated.mp4", "wb") as f:
    f.write(response.content)
```

## 💧 QWatermark System

The QWatermark (Quality Watermark) system overlays a configurable, semi-transparent assurance marker on generated videos. Setting the strength to 0.0 disables it.

| Parameter | Description | Default / Range |
|-----------|-------------|-----------------|
| Text | Watermark text | "LEGION" |
| Position | Placement on frame | bottom-right |
| Font Size | Text size | 36 |
| Opacity | Opacity of the watermark text | 0.3 |
| Strength | Overall intensity | Range: 0.0 (disabled) to 1.0 (full) |

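To make the opacity and position parameters concrete, here is a minimal sketch of standard alpha blending and bottom-right placement. It is not the QWatermark implementation: `blend_pixel`, `bottom_right_origin`, and the `margin` value are hypothetical, and this README does not document how the real system combines opacity with strength.

```python
def blend_pixel(frame_rgb, mark_rgb, alpha):
    """Alpha-blend one watermark pixel over one frame pixel.

    alpha=0.0 leaves the frame untouched; alpha=1.0 shows only the mark.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    return tuple(
        round((1.0 - alpha) * f + alpha * m)
        for f, m in zip(frame_rgb, mark_rgb)
    )


def bottom_right_origin(frame_w, frame_h, text_w, text_h, margin=16):
    """Top-left corner of a text box anchored to the bottom-right of the frame.

    The margin is an assumed value chosen for illustration.
    """
    return (frame_w - text_w - margin, frame_h - text_h - margin)


if __name__ == "__main__":
    # White watermark pixel over a mid-grey frame pixel at the default opacity
    print(blend_pixel((100, 100, 100), (255, 255, 255), 0.3))
    # Placement of a 120x36 text box on a 480x480 frame
    print(bottom_right_origin(480, 480, 120, 36))
```
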
## 🤗 HuggingFace

- **Model Repository**: [deathlegionteam/LEGION-Video-Gen](https://huggingface.co/deathlegionteam/LEGION-Video-Gen)
- **Space (Live Demo)**: [deathlegionteam/LEGION-Video-Gen-Space](https://huggingface.co/spaces/deathlegionteam/LEGION-Video-Gen-Space)

### Model Weights

The model is published on the HuggingFace Hub as a complete Diffusers pipeline, so it can be loaded directly with the Diffusers library:

```python
from diffusers import DiffusionPipeline
import torch

# Load the pipeline in half precision
pipe = DiffusionPipeline.from_pretrained(
    "deathlegionteam/LEGION-Video-Gen",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# Trade some speed for lower peak VRAM usage
pipe.vae.enable_tiling()
pipe.enable_attention_slicing()

# Generate video
video_frames = pipe(
    prompt="A serene mountain lake at sunset",
    num_frames=49,
    width=480,
    height=480,
    num_inference_steps=50,
    guidance_scale=6.0,
).frames[0]
```

## 🖥️ Project Structure

```
/app/video_generation_pipeline_1006/
├── inference.py              # Core generation class (LegionVideoGenerator)
├── backend/
│   └── main.py               # FastAPI backend (port 8081)
├── frontend/
│   ├── app.py                # Gradio frontend (port 8080)
│   └── streamlit_app.py      # Streamlit frontend
├── models/
│   ├── t2v/                  # T2V model weights (safetensors format)
│   └── i2v/                  # I2V model directory
├── outputs/                  # Generated videos
├── requirements.txt          # Python dependencies
├── README.md                 # This file
└── .space/                   # HuggingFace Space configuration
```

## 🎬 Example Prompts

### Text-to-Video

| Prompt | Style |
|--------|-------|
| "A serene mountain lake at sunset with colorful clouds reflecting on the water, gentle ripples, cinematic quality" | Nature |
| "A cyberpunk city street at night with neon lights reflecting on wet pavement, flying cars, cinematic, dramatic lighting" | Sci-Fi |
| "A majestic eagle soaring through misty mountain peaks, golden hour lighting, slow motion, National Geographic quality" | Wildlife |
| "An astronaut floating in space with Earth in the background, stars twinkling, cinematic, hyperrealistic" | Space |
| "A cozy medieval tavern interior with fireplace, warm lighting, people chatting, fantasy RPG aesthetic" | Fantasy |

### Image-to-Video

| Prompt | Motion Effect |
|--------|---------------|
| "Gentle motion, cinematic camera pan, atmospheric" | Camera movement |
| "Flowing water, leaves rustling in the wind, peaceful" | Nature animation |
| "Slow zoom in, dramatic reveal, cinematic lighting" | Zoom effect |
| "Character breathing gently, subtle movement, portrait" | Portrait animation |

## 📊 Performance

| Hardware | Resolution | Frames | Steps | Generation Time |
|----------|------------|--------|-------|-----------------|
| RTX 4090 (24GB) | 480p | 49 | 50 | ~2-3 min |
| A100 (80GB) | 480p | 49 | 50 | ~1-2 min |
| CPU (16+ cores) | N/A | Mock | N/A | ~20-30 sec |

## 📝 Notes

- **GPU Required for Real Inference:** The 8.3B-parameter model requires ~16GB VRAM for FP16 inference. Without a GPU, the system runs in mock mode.
- **Disk Space:** The full T2V model weights are approximately 13GB. An additional I2V variant would add roughly another 13GB.

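The VRAM figure above can be sanity-checked with simple arithmetic: parameter count times bytes per parameter. The sketch below assumes all 8.3B parameters are resident at the stated precision and counts weights only. Activations, intermediate latents, and framework overhead come on top, so treat the result as a lower bound rather than a measured requirement.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Memory needed to hold the weights alone, in GiB (2**30 bytes)."""
    return num_params * bytes_per_param / 2**30


NUM_PARAMS = 8.3e9  # Parameter count stated in this README

fp16_gib = weight_memory_gib(NUM_PARAMS, 2)  # FP16: 2 bytes per parameter
fp32_gib = weight_memory_gib(NUM_PARAMS, 4)  # FP32: 4 bytes per parameter

print(f"FP16 weights: {fp16_gib:.1f} GiB")
print(f"FP32 weights: {fp32_gib:.1f} GiB")
```

At FP16 the weights alone come to roughly 15.5 GiB, which lines up with the ~16GB figure and suggests that a 16GB card leaves very little headroom.
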
## 📄 License

This project is licensed under **Apache 2.0**.

<p align="center">
  <strong>⚔️ LEGION VIDEO GENERATION</strong><br>
  Built with ❤️ for the open-source AI community
</p>