---
license: apache-2.0
library_name: diffusers
tags:
- text-to-video
- image-to-video
- video-generation
- diffusers
pipeline_tag: text-to-video
inference: true
base_model: deathlegionteam/LEGION-Video-Gen
widget:
- text: "A serene mountain lake at sunset with colorful clouds reflecting on the water"
---
# ⚔️ LEGION VIDEO GENERATION: The Ultimate AI Video Engine
<p align="center">
<strong>State-of-the-art video generation with 8.3B parameters</strong><br>
Text-to-Video · Image-to-Video · QWatermark System
</p>
<p align="center">
<img src="https://img.shields.io/badge/Params-8.3B-blue" alt="Parameters">
<img src="https://img.shields.io/badge/License-Apache%202.0-green" alt="License">
<img src="https://img.shields.io/badge/GPU-Recommended-red" alt="GPU">
<a href="https://huggingface.co/deathlegionteam/LEGION-Video-Gen"><img src="https://img.shields.io/badge/🤗%20HuggingFace-LEGION--Video--Gen-blue" alt="HuggingFace"></a>
</p>
## 📋 Table of Contents
- [✨ Features](#-features)
- [🚀 Quick Start](#-quick-start)
- [📖 API Documentation](#-api-documentation)
- [💧 QWatermark System](#-qwatermark-system)
- [🤗 HuggingFace](#-huggingface)
- [🖥️ Project Structure](#️-project-structure)
- [🎬 Example Prompts](#-example-prompts)
- [📄 License](#-license)
## ✨ Features
- **🎬 Text-to-Video Generation**: Create videos from a text prompt
- **🖼️ Image-to-Video Generation**: Animate still images, with the motion guided by a text prompt
- **💧 QWatermark System**: Configurable semi-transparent quality-assurance watermark with position, size, opacity, and text controls
- **🌐 Web Application**: Full Gradio UI with dark theme and FastAPI backend
- **📡 REST API**: Programmatic video generation via HTTP endpoints
- **🛡️ Graceful Fallback**: Mock generation mode when no GPU is available
## 🚀 Quick Start
### Prerequisites
- **GPU (Recommended):** NVIDIA GPU with 16GB+ VRAM (RTX 4090, A100, H100)
- **CPU (Fallback):** Works with mock generation mode (test pattern videos)
- **Python 3.10+**
- **~30GB free disk space** (model weights)
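
The prerequisites above can be checked before installing anything. The following is a minimal sketch using only the standard library. `preflight`, `detect_device`, and the two threshold constants are illustrative names, not part of this repository, and how `LegionVideoGenerator` itself decides to fall back to mock mode is not shown here. `torch` is imported lazily so the script also runs before the dependencies are installed.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)   # "Python 3.10+" from the prerequisites
MIN_FREE_GB = 30       # "~30GB free disk space" from the prerequisites


def detect_device() -> str:
    """Return "cuda" if PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch  # optional: may not be installed yet
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def preflight(path: str = ".") -> dict:
    """Collect a small report on Python version, free disk space, and device."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return {
        "python_ok": sys.version_info >= MIN_PYTHON,
        "free_disk_gb": round(free_gb, 1),
        "disk_ok": free_gb >= MIN_FREE_GB,
        "device": detect_device(),
    }


print(preflight())
```

A `device` value of `"cpu"` suggests the system would run in mock generation mode rather than real inference.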
### Installation
```bash
# Clone the repository
git clone https://huggingface.co/deathlegionteam/LEGION-Video-Gen
cd LEGION-Video-Gen
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
# Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
# Verify installation
python3 -c "import torch, diffusers, gradio, fastapi; print('OK')"
```
### Quick Start: Generate Your First Video
```python
from inference import LegionVideoGenerator

generator = LegionVideoGenerator()

video_path = generator.generate_from_text(
    prompt="A serene mountain lake at sunset with colorful clouds reflecting on the water, gentle ripples, cinematic quality",
    num_frames=49,
    width=480,
    height=480,
    num_inference_steps=50,
    guidance_scale=6.0,
    watermark_strength=0.3,
)
print(f"Video saved to: {video_path}")
```
### Starting the Web UI
```bash
# Start the API backend
python3 backend/main.py &
# Start the Gradio frontend
python3 frontend/app.py
# Open http://localhost:8080 in your browser
```
## 📖 API Documentation
### REST API Endpoints
The backend runs on port **8081** by default.
| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/api/status` | Health check with model and device info |
| `POST` | `/api/generate/text` | Generate video from text prompt |
| `POST` | `/api/generate/image` | Generate video from image + text prompt |
| `GET` | `/` | API root with endpoint listing |
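
Because the backend is started in the background, it can be useful to wait for `/api/status` to respond before sending generation requests. The sketch below uses only the standard library and assumes that the status endpoint returns JSON. The response fields are not documented here, so the payload is returned as-is. `fetch_status` and `wait_for_backend` are hypothetical helper names.

```python
import json
import time
import urllib.error
import urllib.request

STATUS_URL = "http://localhost:8081/api/status"


def fetch_status(url: str = STATUS_URL, timeout: float = 5.0) -> dict:
    """Fetch the health-check payload (assumed to be JSON)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_backend(fetch=fetch_status, attempts: int = 10, delay: float = 2.0):
    """Poll until the backend answers; return its payload, or None if it never does."""
    for attempt in range(attempts):
        try:
            return fetch()
        except (urllib.error.URLError, ConnectionError, TimeoutError):
            # Backend not reachable yet: pause before the next attempt
            if attempt < attempts - 1:
                time.sleep(delay)
    return None
```

Calling `wait_for_backend()` with no arguments polls the default port for up to roughly 20 seconds. The `fetch` parameter exists so the retry logic can be exercised without a running server.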
### Text-to-Video Generation
```python
import requests

response = requests.post(
    "http://localhost:8081/api/generate/text",
    json={
        "prompt": "A cyberpunk city street at night with neon lights reflecting on wet pavement",
        "negative_prompt": "warped, distorted, flickering, jittery, low quality, blurry, artifacts",
        "num_frames": 49,
        "width": 480,
        "height": 480,
        "num_inference_steps": 50,
        "guidance_scale": 6.0,
        "watermark_strength": 0.3,
    },
)
# Fail on an HTTP error instead of writing the error body to disk
response.raise_for_status()

# The response body is the generated video
with open("output.mp4", "wb") as f:
    f.write(response.content)
```
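
The request body above can be assembled and sanity-checked before it is sent. This is a sketch only. `TextToVideoRequest` is a hypothetical helper, its default values are copied from the example request rather than from documented server defaults, and the validation rules (positive sizes, `watermark_strength` within 0.0 to 1.0) are assumptions based on the parameter descriptions in this README.

```python
from dataclasses import asdict, dataclass


@dataclass
class TextToVideoRequest:
    prompt: str
    negative_prompt: str = ""
    num_frames: int = 49
    width: int = 480
    height: int = 480
    num_inference_steps: int = 50
    guidance_scale: float = 6.0
    watermark_strength: float = 0.3

    def validate(self) -> None:
        """Raise ValueError for values the backend is unlikely to accept."""
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if not 0.0 <= self.watermark_strength <= 1.0:
            raise ValueError("watermark_strength must be between 0.0 and 1.0")
        for name in ("num_frames", "width", "height", "num_inference_steps"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")

    def to_payload(self) -> dict:
        """Return a dict suitable for the `json=` argument of the request."""
        self.validate()
        return asdict(self)


payload = TextToVideoRequest(prompt="A cyberpunk city street at night").to_payload()
print(payload)
```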
### Image-to-Video Generation
```python
import requests

# Upload the source image as multipart form data alongside the generation settings
with open("input_image.jpg", "rb") as img:
    response = requests.post(
        "http://localhost:8081/api/generate/image",
        files={"file": img},
        data={
            "prompt": "Gentle motion, cinematic camera movement, atmospheric",
            "num_frames": 49,
            "width": 480,
            "height": 480,
            "num_inference_steps": 50,
            "guidance_scale": 6.0,
            "watermark_strength": 0.3,
        },
    )
# Fail on an HTTP error instead of writing the error body to disk
response.raise_for_status()

with open("animated.mp4", "wb") as f:
    f.write(response.content)
```
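
Both request examples write the response body straight to an `.mp4` file. A small guard can catch the case where the body is an error message rather than video data. The sketch below assumes the endpoints return a raw MP4 file, and it relies on the fact that MP4 files normally begin with an `ftyp` box. `looks_like_mp4` and `save_video` are illustrative names, not part of the API.

```python
def looks_like_mp4(data: bytes) -> bool:
    """Check for the "ftyp" box type that normally follows the 4-byte size field."""
    return len(data) >= 8 and data[4:8] == b"ftyp"


def save_video(data: bytes, path: str) -> str:
    """Write the bytes to disk only if they look like an MP4 file."""
    if not looks_like_mp4(data):
        preview = data[:200].decode("utf-8", errors="replace")
        raise ValueError(f"response does not look like an MP4 file: {preview!r}")
    with open(path, "wb") as f:
        f.write(data)
    return path
```

With the examples above, this would be called as `save_video(response.content, "output.mp4")`.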
## 💧 QWatermark System
The QWatermark (Quality Watermark) system overlays a configurable, semi-transparent quality-assurance marker on each generated video. Setting the strength to 0.0 disables it.
| Parameter | Description | Default or range |
|-----------|-------------|------------------|
| Text | Watermark text | "LEGION" |
| Position | Placement on the frame | bottom-right |
| Font Size | Text size | 36 |
| Opacity | Opacity of the watermark | 0.3 |
| Strength | Overall intensity | 0.0 (disabled) to 1.0 (full) |
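
To illustrate how a semi-transparent overlay of this kind is typically applied, the sketch below alpha-blends a watermark pixel onto a frame pixel. It is not the repository's implementation. `WatermarkConfig`, `effective_alpha`, and `blend_channel` are hypothetical names, and combining opacity and strength by multiplication is an assumption made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class WatermarkConfig:
    text: str = "LEGION"
    position: str = "bottom-right"
    font_size: int = 36
    opacity: float = 0.3
    strength: float = 0.3  # value used in this README's examples, not a documented default


def effective_alpha(cfg: WatermarkConfig) -> float:
    """ASSUMPTION: strength scales opacity; the result is clamped to [0, 1]."""
    return min(max(cfg.opacity * cfg.strength, 0.0), 1.0)


def blend_channel(frame_value: int, mark_value: int, alpha: float) -> int:
    """Standard alpha blend of one 8-bit colour channel."""
    return round((1.0 - alpha) * frame_value + alpha * mark_value)


cfg = WatermarkConfig()
alpha = effective_alpha(cfg)
# Blend a white watermark pixel (255) onto a mid-grey frame pixel (128)
print(alpha, blend_channel(128, 255, alpha))
```

Under this model, a strength of 0.0 yields an alpha of 0.0 and leaves the frame untouched, which matches the "disabled" end of the range in the table.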
## 🤗 HuggingFace
- **Model Repository**: [deathlegionteam/LEGION-Video-Gen](https://huggingface.co/deathlegionteam/LEGION-Video-Gen)
- **Space (Live Demo)**: [deathlegionteam/LEGION-Video-Gen-Space](https://huggingface.co/spaces/deathlegionteam/LEGION-Video-Gen-Space)
### Model Weights
The model is available as a complete Diffusers pipeline on HuggingFace Hub. You can load it directly using the Diffusers library:
```python
from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained(
    "deathlegionteam/LEGION-Video-Gen",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# Reduce peak VRAM usage: decode in tiles and compute attention in slices
pipe.vae.enable_tiling()
pipe.enable_attention_slicing()

# Generate video
video_frames = pipe(
    prompt="A serene mountain lake at sunset",
    num_frames=49,
    width=480,
    height=480,
    num_inference_steps=50,
    guidance_scale=6.0,
).frames[0]
```
## 🖥️ Project Structure
```
/app/video_generation_pipeline_1006/
├── inference.py              # Core generation class (LegionVideoGenerator)
├── backend/
│   └── main.py               # FastAPI backend (port 8081)
├── frontend/
│   ├── app.py                # Gradio frontend (port 8080)
│   └── streamlit_app.py      # Streamlit frontend
├── models/
│   ├── t2v/                  # T2V model weights (safetensors format)
│   └── i2v/                  # I2V model directory
├── outputs/                  # Generated videos
├── requirements.txt          # Python dependencies
├── README.md                 # This file
└── .space/                   # HuggingFace Space configuration
```
## 🎬 Example Prompts
### Text-to-Video
| Prompt | Style |
|--------|-------|
| "A serene mountain lake at sunset with colorful clouds reflecting on the water, gentle ripples, cinematic quality" | Nature |
| "A cyberpunk city street at night with neon lights reflecting on wet pavement, flying cars, cinematic, dramatic lighting" | Sci-Fi |
| "A majestic eagle soaring through misty mountain peaks, golden hour lighting, slow motion, National Geographic quality" | Wildlife |
| "An astronaut floating in space with Earth in the background, stars twinkling, cinematic, hyperrealistic" | Space |
| "A cozy medieval tavern interior with fireplace, warm lighting, people chatting, fantasy RPG aesthetic" | Fantasy |
### Image-to-Video
| Prompt | Motion Effect |
|--------|---------------|
| "Gentle motion, cinematic camera pan, atmospheric" | Camera movement |
| "Flowing water, leaves rustling in the wind, peaceful" | Nature animation |
| "Slow zoom in, dramatic reveal, cinematic lighting" | Zoom effect |
| "Character breathing gently, subtle movement, portrait" | Portrait animation |
## 📊 Performance
| Hardware | Resolution | Frames | Steps | Time |
|----------|------------|--------|-------|------|
| RTX 4090 (24GB) | 480p | 49 | 50 | ~2-3 min |
| A100 (80GB) | 480p | 49 | 50 | ~1-2 min |
| CPU (16+ cores) | N/A | Mock | N/A | ~20-30 sec |
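
The timings in the table can be converted into a rough per-step cost, which helps when estimating runs with a different `num_inference_steps`. This is back-of-the-envelope arithmetic that assumes the total time is dominated by the denoising steps and ignores model loading and video decoding. `seconds_per_step` is an illustrative helper.

```python
def seconds_per_step(minutes: float, steps: int) -> float:
    """Average wall-clock seconds per denoising step."""
    return minutes * 60 / steps


# (low, high) total minutes for 50 steps, taken from the table above
BENCHMARKS = {
    "RTX 4090 (24GB)": (2, 3),
    "A100 (80GB)": (1, 2),
}

for hardware, (low, high) in BENCHMARKS.items():
    lo_s = seconds_per_step(low, 50)
    hi_s = seconds_per_step(high, 50)
    print(f"{hardware}: {lo_s:.1f}-{hi_s:.1f} s per step")
```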
## 📝 Notes
- **GPU Required for Real Inference:** The 8.3B-parameter model requires ~16GB of VRAM for FP16 inference. Without a GPU, the system runs in mock mode and produces test-pattern videos.
- **Disk Space:** The full T2V model weights are approximately 13GB. An additional I2V variant would add roughly another 13GB.
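
As a rough cross-check of the VRAM figure, the memory needed just to hold the weights is the parameter count multiplied by the bytes per parameter. The sketch below covers weights only. Activations, the VAE decode, and framework overhead come on top, so it should be read as a lower bound rather than a measured requirement. `weight_footprint_gb` is an illustrative helper.

```python
def weight_footprint_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


PARAMS = 8.3e9  # parameter count stated in this README

print(f"FP16: {weight_footprint_gb(PARAMS, 2):.1f} GB")  # 2 bytes per parameter
print(f"FP32: {weight_footprint_gb(PARAMS, 4):.1f} GB")  # 4 bytes per parameter
```

At 2 bytes per parameter this comes to about 16.6 GB, in line with the ~16GB figure quoted for FP16 inference.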
## 📄 License
This project is licensed under **Apache 2.0**.
<p align="center">
<strong>⚔️ LEGION VIDEO GENERATION</strong><br>
Built with ❤️ for the open-source AI community
</p>