---
license: apache-2.0
library_name: diffusers
tags:
- text-to-video
- image-to-video
- video-generation
- diffusers
pipeline_tag: text-to-video
inference: true
base_model: deathlegionteam/LEGION-Video-Gen
widget:
- text: "A serene mountain lake at sunset with colorful clouds reflecting on the water"
---

# ⚔️ LEGION VIDEO GENERATION: The Ultimate AI Video Engine

<p align="center">
<strong>State-of-the-art video generation with 8.3B parameters</strong><br>
Text-to-Video · Image-to-Video · QWatermark System
</p>

<p align="center">
<img src="https://img.shields.io/badge/Params-8.3B-blue" alt="Parameters">
<img src="https://img.shields.io/badge/License-Apache%202.0-green" alt="License">
<img src="https://img.shields.io/badge/GPU-Recommended-red" alt="GPU">
<a href="https://huggingface.co/deathlegionteam/LEGION-Video-Gen"><img src="https://img.shields.io/badge/🤗%20HuggingFace-LEGION--Video--Gen-blue" alt="HuggingFace"></a>
</p>

## 📋 Table of Contents

- [✨ Features](#-features)
- [🚀 Quick Start](#-quick-start)
- [🌐 API Documentation](#-api-documentation)
- [💧 QWatermark System](#-qwatermark-system)
- [🤗 HuggingFace](#-huggingface)
- [🖥️ Project Structure](#️-project-structure)
- [🎬 Example Prompts](#-example-prompts)
- [📜 License](#-license)

## ✨ Features

- **🎬 Text-to-Video Generation:** Create videos from any text prompt with cinematic quality
- **🖼️ Image-to-Video Generation:** Animate still images with controlled motion
- **💧 QWatermark System:** Configurable semi-transparent quality assurance watermark with position, size, opacity, and text controls
- **🌐 Web Application:** Full Gradio UI with dark theme and FastAPI backend
- **📡 REST API:** Programmatic video generation via HTTP endpoints
- **🛡️ Graceful Fallback:** Mock generation mode when no GPU is available

## 🚀 Quick Start

### Prerequisites

- **GPU (Recommended):** NVIDIA GPU with 16GB+ VRAM (RTX 4090, A100, H100)
- **CPU (Fallback):** Works with mock generation mode (test pattern videos)
- **Python 3.10+**
- **~30GB free disk space** (model weights)

### Installation

```bash
# Clone the repository
git clone https://huggingface.co/deathlegionteam/LEGION-Video-Gen
cd LEGION-Video-Gen

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install --upgrade pip
pip install -r requirements.txt

# Verify installation
python3 -c "import torch, diffusers, gradio, fastapi; print('OK')"
```

### Quick Start: Generate Your First Video

```python
from inference import LegionVideoGenerator

generator = LegionVideoGenerator()
video_path = generator.generate_from_text(
    prompt="A serene mountain lake at sunset with colorful clouds reflecting on the water, gentle ripples, cinematic quality",
    num_frames=49,
    width=480,
    height=480,
    num_inference_steps=50,
    guidance_scale=6.0,
    watermark_strength=0.3,
)
print(f"Video saved to: {video_path}")
```

### Starting the Web UI

```bash
# Start the API backend
python3 backend/main.py &

# Start the Gradio frontend
python3 frontend/app.py

# Open http://localhost:8080 in your browser
```
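
Because the backend is started in the background, the frontend can come up before the API accepts connections. A minimal sketch of a readiness check, assuming the backend listens on localhost port 8081 as stated in this README; `wait_for_port` is a hypothetical helper and not part of the repository:

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling `wait_for_port("localhost", 8081)` between the two launch commands would delay the frontend until the backend socket is open. It only confirms that the port accepts connections, not that the model has finished loading.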

## 🌐 API Documentation

### REST API Endpoints

The backend runs on port **8081** by default.

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/api/status` | Health check with model and device info |
| `POST` | `/api/generate/text` | Generate video from text prompt |
| `POST` | `/api/generate/image` | Generate video from image + text prompt |
| `GET` | `/` | API root with endpoint listing |

### Text-to-Video Generation

```python
import requests

response = requests.post(
    "http://localhost:8081/api/generate/text",
    json={
        "prompt": "A cyberpunk city street at night with neon lights reflecting on wet pavement",
        "negative_prompt": "warped, distorted, flickering, jittery, low quality, blurry, artifacts",
        "num_frames": 49,
        "width": 480,
        "height": 480,
        "num_inference_steps": 50,
        "guidance_scale": 6.0,
        "watermark_strength": 0.3,
    }
)

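# Stop on an HTTP error rather than saving an error response as a video
response.raise_for_status()

with open("output.mp4", "wb") as f:
    f.write(response.content)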
```
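
The request body can be assembled and sanity-checked before it is sent. The sketch below is a hypothetical helper (`build_text_payload` is not part of the repository). The field names and numeric defaults are copied from the request example in this section; the empty default for `negative_prompt` and every validation rule other than the 0.0 to 1.0 watermark range are assumptions:

```python
import json

# Defaults mirror the values used in this README's request example;
# the empty negative_prompt default is an assumption
DEFAULTS = {
    "negative_prompt": "",
    "num_frames": 49,
    "width": 480,
    "height": 480,
    "num_inference_steps": 50,
    "guidance_scale": 6.0,
    "watermark_strength": 0.3,
}


def build_text_payload(prompt: str, **overrides) -> dict:
    """Build a JSON-serializable body for POST /api/generate/text."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    payload = {"prompt": prompt, **DEFAULTS, **overrides}
    # The QWatermark section documents strength as 0.0 (disabled) to 1.0 (full)
    if not 0.0 <= payload["watermark_strength"] <= 1.0:
        raise ValueError("watermark_strength must be between 0.0 and 1.0")
    # Assumption: frame count, size, and step count must be positive integers
    for key in ("num_frames", "width", "height", "num_inference_steps"):
        if not isinstance(payload[key], int) or payload[key] <= 0:
            raise ValueError(f"{key} must be a positive integer")
    return payload


payload = build_text_payload("A cyberpunk city street at night", watermark_strength=0.0)
print(json.dumps(payload, indent=2))
```

The returned dictionary can be passed as `json=payload` to `requests.post`, so mistakes such as a misspelled field name surface before a long generation run starts.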

### Image-to-Video Generation

```python
import requests

with open("input_image.jpg", "rb") as img:
    response = requests.post(
        "http://localhost:8081/api/generate/image",
        files={"file": img},
        data={
            "prompt": "Gentle motion, cinematic camera movement, atmospheric",
            "num_frames": 49,
            "width": 480,
            "height": 480,
            "num_inference_steps": 50,
            "guidance_scale": 6.0,
            "watermark_strength": 0.3,
        }
    )

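# Stop on an HTTP error rather than saving an error response as a video
response.raise_for_status()

with open("animated.mp4", "wb") as f:
    f.write(response.content)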
```

## 💧 QWatermark System

The QWatermark (Quality Watermark) system imprints a configurable assurance marker on every generated video.

| Parameter | Description | Default or range |
|-----------|-------------|---------|
| Text | Watermark text | "LEGION" |
| Position | Placement on frame | bottom-right |
| Font Size | Text size | 36 |
| Opacity | Transparency | 0.3 |
| Strength | Overall intensity | 0.0 (disabled) - 1.0 (full) |
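
How opacity and strength combine is not spelled out here. One plausible reading, sketched below purely as an assumption, is standard alpha compositing in which the effective alpha is `opacity * strength`, with the mark anchored a fixed margin from the bottom-right corner. The function names and the 16-pixel margin are hypothetical and are not taken from the QWatermark implementation:

```python
def effective_alpha(opacity: float, strength: float) -> float:
    """Assumed combination rule: strength scales the configured opacity."""
    return max(0.0, min(1.0, opacity * strength))


def blend_pixel(frame_rgb, mark_rgb, opacity: float, strength: float):
    """Alpha-composite one watermark pixel over one frame pixel."""
    a = effective_alpha(opacity, strength)
    return tuple(round((1.0 - a) * f + a * m) for f, m in zip(frame_rgb, mark_rgb))


def bottom_right_origin(frame_w: int, frame_h: int, mark_w: int, mark_h: int, margin: int = 16):
    """Top-left corner that places a mark_w x mark_h box in the bottom-right."""
    return frame_w - mark_w - margin, frame_h - mark_h - margin


# Strength 0.0 leaves the frame pixel unchanged, matching "0.0 (disabled)" in the table
print(blend_pixel((10, 20, 30), (255, 255, 255), opacity=0.3, strength=0.0))
# Placement of a 120x36 text box on a 480x480 frame
print(bottom_right_origin(480, 480, 120, 36))
```

Under this reading, a strength of 1.0 applies the configured opacity in full, and intermediate values fade the mark proportionally.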

## 🤗 HuggingFace

- **Model Repository**: [deathlegionteam/LEGION-Video-Gen](https://huggingface.co/deathlegionteam/LEGION-Video-Gen)
- **Space (Live Demo)**: [deathlegionteam/LEGION-Video-Gen-Space](https://huggingface.co/spaces/deathlegionteam/LEGION-Video-Gen-Space)

### Model Weights

The model is available as a complete Diffusers pipeline on HuggingFace Hub. You can load it directly using the Diffusers library:

```python
from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained(
    "deathlegionteam/LEGION-Video-Gen",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")
pipe.vae.enable_tiling()
pipe.enable_attention_slicing()

# Generate video
video_frames = pipe(
    prompt="A serene mountain lake at sunset",
    num_frames=49,
    width=480,
    height=480,
    num_inference_steps=50,
    guidance_scale=6.0,
).frames[0]
```

## 🖥️ Project Structure

```
/app/video_generation_pipeline_1006/
├── inference.py           # Core generation class (LegionVideoGenerator)
├── backend/
│   └── main.py            # FastAPI backend (port 8081)
├── frontend/
│   ├── app.py             # Gradio frontend (port 8080)
│   └── streamlit_app.py   # Streamlit frontend
├── models/
│   ├── t2v/               # T2V model weights (safetensors format)
│   └── i2v/               # I2V model directory
├── outputs/               # Generated videos
├── requirements.txt       # Python dependencies
├── README.md              # This file
└── .space/                # HuggingFace Space configuration
```

## 🎬 Example Prompts

### Text-to-Video

| Prompt | Style |
|--------|-------|
| "A serene mountain lake at sunset with colorful clouds reflecting on the water, gentle ripples, cinematic quality" | Nature |
| "A cyberpunk city street at night with neon lights reflecting on wet pavement, flying cars, cinematic, dramatic lighting" | Sci-Fi |
| "A majestic eagle soaring through misty mountain peaks, golden hour lighting, slow motion, National Geographic quality" | Wildlife |
| "An astronaut floating in space with Earth in the background, stars twinkling, cinematic, hyperrealistic" | Space |
| "A cozy medieval tavern interior with fireplace, warm lighting, people chatting, fantasy RPG aesthetic" | Fantasy |

### Image-to-Video

| Prompt | Motion Effect |
|--------|---------------|
| "Gentle motion, cinematic camera pan, atmospheric" | Camera movement |
| "Flowing water, leaves rustling in the wind, peaceful" | Nature animation |
| "Slow zoom in, dramatic reveal, cinematic lighting" | Zoom effect |
| "Character breathing gently, subtle movement, portrait" | Portrait animation |

## 📊 Performance

| Hardware | Resolution | Frames | Steps | Time |
|----------|------------|--------|-------|------|
| RTX 4090 (24GB) | 480p | 49 | 50 | ~2-3 min |
| A100 (80GB) | 480p | 49 | 50 | ~1-2 min |
| CPU (16+ cores) | N/A | Mock | N/A | ~20-30 sec |

## 📝 Notes

- **GPU Required for Real Inference:** The 8.3B parameter model requires ~16GB VRAM for FP16 inference. Without a GPU, the system runs in mock mode.
- **Disk Space:** The full T2V model weights are approximately 13GB. An I2V variant would add roughly another 13GB.
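
As a back-of-the-envelope check on the ~16GB VRAM figure, the weights of an 8.3B-parameter model at 2 bytes per parameter (FP16) can be estimated as follows. This counts weights only; activations and other runtime buffers are not modeled, so treat it as a rough estimate rather than a measurement:

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Estimate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


fp16 = weight_memory_gb(8.3e9)                     # FP16: 2 bytes per parameter
fp32 = weight_memory_gb(8.3e9, bytes_per_param=4)  # FP32: 4 bytes per parameter
print(f"FP16 weights: {fp16:.1f} GB, FP32 weights: {fp32:.1f} GB")
```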

## 📜 License

This project is licensed under **Apache 2.0**.

<p align="center">
<strong>⚔️ LEGION VIDEO GENERATION</strong><br>
Built with ❤️ for the open-source AI community
</p>