---
license: apache-2.0
library_name: diffusers
tags:
- text-to-video
- image-to-video
- video-generation
- diffusers
pipeline_tag: text-to-video
inference: true
base_model: deathlegionteam/LEGION-Video-Gen
widget:
- text: "A serene mountain lake at sunset with colorful clouds reflecting on the water"
---
# ⚔️ LEGION VIDEO GENERATION: The Ultimate AI Video Engine
<p align="center">
<strong>State-of-the-art video generation with 8.3B parameters</strong><br>
Text-to-Video · Image-to-Video · QWatermark System
</p>
<p align="center">
<img src="https://img.shields.io/badge/Params-8.3B-blue" alt="Parameters">
<img src="https://img.shields.io/badge/License-Apache%202.0-green" alt="License">
<img src="https://img.shields.io/badge/GPU-Recommended-red" alt="GPU">
<a href="https://huggingface.co/deathlegionteam/LEGION-Video-Gen"><img src="https://img.shields.io/badge/🤗%20HuggingFace-LEGION--Video--Gen-blue" alt="HuggingFace"></a>
</p>
## 📋 Table of Contents
- [✨ Features](#-features)
- [🚀 Quick Start](#-quick-start)
- [🌐 API Documentation](#-api-documentation)
- [💧 QWatermark System](#-qwatermark-system)
- [🤗 HuggingFace](#-huggingface)
- [🖥️ Project Structure](#️-project-structure)
- [🎬 Example Prompts](#-example-prompts)
- [📜 License](#-license)
## ✨ Features
- **🎬 Text-to-Video Generation**: Create videos from any text prompt with cinematic quality
- **🖼️ Image-to-Video Generation**: Animate still images with controlled motion
- **💧 QWatermark System**: Configurable semi-transparent quality assurance watermark with position, size, opacity, and text controls
- **🌐 Web Application**: Full Gradio UI with dark theme and FastAPI backend
- **📡 REST API**: Programmatic video generation via HTTP endpoints
- **🛡️ Graceful Fallback**: Mock generation mode when no GPU is available
## 🚀 Quick Start
### Prerequisites
- **GPU (Recommended):** NVIDIA GPU with 16GB+ VRAM (RTX 4090, A100, H100)
- **CPU (Fallback):** Works with mock generation mode (test pattern videos)
- **Python 3.10+**
- **~30GB free disk space** (model weights)
### Installation
```bash
# Clone the repository
git clone https://huggingface.co/deathlegionteam/LEGION-Video-Gen
cd LEGION-Video-Gen
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
# Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
# Verify installation
python3 -c "import torch, diffusers, gradio, fastapi; print('OK')"
```
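To see which mode to expect before generating anything, a quick check like the one below may help. It is a minimal sketch that assumes the fallback is decided by CUDA availability. `pick_mode` and its `"real"`/`"mock"` labels are illustrative and are not names taken from `inference.py`.

```python
import importlib.util


def cuda_available() -> bool:
    """True only if PyTorch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so the check also works without PyTorch

    return torch.cuda.is_available()


def pick_mode(has_cuda: bool) -> str:
    # Assumption: real inference needs CUDA, anything else falls back to mock.
    # The labels are illustrative, not identifiers from inference.py.
    return "real" if has_cuda else "mock"


print(f"Expected generation mode: {pick_mode(cuda_available())}")
```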
### Quick Start: Generate Your First Video
```python
from inference import LegionVideoGenerator
generator = LegionVideoGenerator()
video_path = generator.generate_from_text(
    prompt="A serene mountain lake at sunset with colorful clouds reflecting on the water, gentle ripples, cinematic quality",
    num_frames=49,
    width=480,
    height=480,
    num_inference_steps=50,
    guidance_scale=6.0,
    watermark_strength=0.3,
)
print(f"Video saved to: {video_path}")
```
### Starting the Web UI
```bash
# Start the API backend
python3 backend/main.py &
# Start the Gradio frontend
python3 frontend/app.py
# Open http://localhost:8080 in your browser
```
## 🌐 API Documentation
### REST API Endpoints
The backend runs on port **8081** by default.
| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/api/status` | Health check with model and device info |
| `POST` | `/api/generate/text` | Generate video from text prompt |
| `POST` | `/api/generate/image` | Generate video from image + text prompt |
| `GET` | `/` | API root with endpoint listing |
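Because the backend is started in the background, requests sent right away may fail while the model is still loading. The sketch below polls `/api/status` until it answers. It assumes the endpoint returns HTTP 200 once the server is ready, which this README does not state explicitly. `wait_for_backend`, its retry settings, and the injectable `probe`/`sleep` parameters are hypothetical helpers. The standard library is used so the snippet runs without extra dependencies.

```python
import time
import urllib.request

STATUS_URL = "http://localhost:8081/api/status"


def http_probe(url: str) -> bool:
    """Return True if the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


def wait_for_backend(url=STATUS_URL, attempts=30, delay_s=2.0,
                     probe=http_probe, sleep=time.sleep) -> bool:
    """Poll the status endpoint; True as soon as it responds, else False."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay_s)  # wait before the next try
    return False
```

Calling `wait_for_backend()` before the first generation request gives a clear yes/no answer instead of a connection error.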
### Text-to-Video Generation
```python
import requests

response = requests.post(
    "http://localhost:8081/api/generate/text",
    json={
        "prompt": "A cyberpunk city street at night with neon lights reflecting on wet pavement",
        "negative_prompt": "warped, distorted, flickering, jittery, low quality, blurry, artifacts",
        "num_frames": 49,
        "width": 480,
        "height": 480,
        "num_inference_steps": 50,
        "guidance_scale": 6.0,
        "watermark_strength": 0.3,
    },
)
# Fail loudly on an HTTP error instead of writing an error body to disk.
response.raise_for_status()

# The response body is the generated video.
with open("output.mp4", "wb") as f:
    f.write(response.content)
```
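Catching bad parameters before a multi-minute generation starts can save time. The sketch below mirrors the JSON body above as a small dataclass. `TextToVideoRequest` and `to_payload` are hypothetical client-side names. The only range taken from this README is `watermark_strength` (0.0 to 1.0); the positivity checks and the choice to omit an empty `negative_prompt` are assumptions about what the server accepts.

```python
from dataclasses import asdict, dataclass


@dataclass
class TextToVideoRequest:
    """Hypothetical client-side mirror of the /api/generate/text JSON body."""

    prompt: str
    negative_prompt: str = ""
    num_frames: int = 49
    width: int = 480
    height: int = 480
    num_inference_steps: int = 50
    guidance_scale: float = 6.0
    watermark_strength: float = 0.3

    def to_payload(self) -> dict:
        """Validate the fields and return a dict suitable for `json=`."""
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if not 0.0 <= self.watermark_strength <= 1.0:
            raise ValueError("watermark_strength must be between 0.0 and 1.0")
        for name in ("num_frames", "width", "height", "num_inference_steps"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")
        payload = asdict(self)
        if not payload["negative_prompt"]:
            # Assumption: the field is optional, so an empty value is dropped.
            del payload["negative_prompt"]
        return payload
```

The result can be passed straight to the request shown above, for example `requests.post(url, json=TextToVideoRequest(prompt="...").to_payload())`.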
### Image-to-Video Generation
```python
import requests

# Keep the image file open for the duration of the upload.
with open("input_image.jpg", "rb") as img:
    response = requests.post(
        "http://localhost:8081/api/generate/image",
        files={"file": img},
        data={
            "prompt": "Gentle motion, cinematic camera movement, atmospheric",
            "num_frames": 49,
            "width": 480,
            "height": 480,
            "num_inference_steps": 50,
            "guidance_scale": 6.0,
            "watermark_strength": 0.3,
        },
    )
# Fail loudly on an HTTP error instead of writing an error body to disk.
response.raise_for_status()

with open("animated.mp4", "wb") as f:
    f.write(response.content)
```
## 💧 QWatermark System
The QWatermark (Quality Watermark) system imprints a configurable assurance marker on generated videos. Setting the strength to 0.0 disables it.
| Parameter | Description | Default or range |
|-----------|-------------|------------------|
| Text | Watermark text | "LEGION" |
| Position | Placement on the frame | bottom-right |
| Font Size | Text size | 36 |
| Opacity | Transparency of the watermark | 0.3 |
| Strength | Overall intensity | 0.0 (disabled) to 1.0 (full) |
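To make the parameters concrete, here is a minimal sketch of how a semi-transparent watermark could be placed and blended. It is not the project's implementation. The rule that strength scales opacity, the set of corner positions, the 16-pixel margin, and every name in the code (`WatermarkConfig`, `effective_alpha`, `anchor_xy`, `blend`) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WatermarkConfig:
    # Defaults follow the table above; the field names are hypothetical.
    text: str = "LEGION"
    position: str = "bottom-right"
    font_size: int = 36
    opacity: float = 0.3
    strength: float = 0.3  # value used in this README's examples, not a documented default


def effective_alpha(cfg: WatermarkConfig) -> float:
    """Assumed rule: strength scales opacity, so 0.0 disables the watermark."""
    for name in ("opacity", "strength"):
        if not 0.0 <= getattr(cfg, name) <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    return cfg.opacity * cfg.strength


def anchor_xy(position, frame_w, frame_h, mark_w, mark_h, margin=16):
    """Top-left pixel of the watermark box for a corner position."""
    vertical, _, horizontal = position.partition("-")
    if vertical not in ("top", "bottom") or horizontal not in ("left", "right"):
        raise ValueError(f"unsupported position: {position!r}")
    x = margin if horizontal == "left" else frame_w - mark_w - margin
    y = margin if vertical == "top" else frame_h - mark_h - margin
    return x, y


def blend(pixel: int, mark: int, alpha: float) -> int:
    """Standard alpha compositing for one channel value (0 to 255)."""
    return round((1.0 - alpha) * pixel + alpha * mark)
```

With an alpha of 0.0 `blend` returns the original pixel unchanged, which matches the "disabled" end of the strength range.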
## 🤗 HuggingFace
- **Model Repository**: [deathlegionteam/LEGION-Video-Gen](https://huggingface.co/deathlegionteam/LEGION-Video-Gen)
- **Space (Live Demo)**: [deathlegionteam/LEGION-Video-Gen-Space](https://huggingface.co/spaces/deathlegionteam/LEGION-Video-Gen-Space)
### Model Weights
The model is available as a complete Diffusers pipeline on HuggingFace Hub. You can load it directly using the Diffusers library:
```python
from diffusers import DiffusionPipeline
import torch
pipe = DiffusionPipeline.from_pretrained(
    "deathlegionteam/LEGION-Video-Gen",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")
pipe.vae.enable_tiling()
pipe.enable_attention_slicing()

# Generate video
video_frames = pipe(
    prompt="A serene mountain lake at sunset",
    num_frames=49,
    width=480,
    height=480,
    num_inference_steps=50,
    guidance_scale=6.0,
).frames[0]
```
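The snippet above leaves the frames in memory. One way to write them to disk is the `export_to_video` helper from `diffusers.utils`, sketched below. The frame rate of 8 fps is an assumption, since this README does not state the model's native frame rate. `clip_duration_s` and `save_frames` are hypothetical helper names.

```python
def clip_duration_s(num_frames: int, fps: int) -> float:
    """Length of the resulting clip in seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


def save_frames(frames, path="output.mp4", fps=8):
    """Write frames to an MP4 file and return the output path."""
    # Imported lazily so clip_duration_s works without diffusers installed.
    from diffusers.utils import export_to_video

    return export_to_video(frames, path, fps=fps)


# 49 frames at the assumed 8 fps
print(f"Clip length: {clip_duration_s(49, 8)} s")
```

After generation, `save_frames(video_frames)` would write `output.mp4`. Depending on the installed Diffusers version, a video backend such as `imageio` or OpenCV may also be needed.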
## 🖥️ Project Structure
```
/app/video_generation_pipeline_1006/
├── inference.py              # Core generation class (LegionVideoGenerator)
├── backend/
│   └── main.py               # FastAPI backend (port 8081)
├── frontend/
│   ├── app.py                # Gradio frontend (port 8080)
│   └── streamlit_app.py      # Streamlit frontend
├── models/
│   ├── t2v/                  # T2V model weights (safetensors format)
│   └── i2v/                  # I2V model directory
├── outputs/                  # Generated videos
├── requirements.txt          # Python dependencies
├── README.md                 # This file
└── .space/                   # HuggingFace Space configuration
```
## 🎬 Example Prompts
### Text-to-Video
| Prompt | Style |
|--------|-------|
| "A serene mountain lake at sunset with colorful clouds reflecting on the water, gentle ripples, cinematic quality" | Nature |
| "A cyberpunk city street at night with neon lights reflecting on wet pavement, flying cars, cinematic, dramatic lighting" | Sci-Fi |
| "A majestic eagle soaring through misty mountain peaks, golden hour lighting, slow motion, National Geographic quality" | Wildlife |
| "An astronaut floating in space with Earth in the background, stars twinkling, cinematic, hyperrealistic" | Space |
| "A cozy medieval tavern interior with fireplace, warm lighting, people chatting, fantasy RPG aesthetic" | Fantasy |
### Image-to-Video
| Prompt | Motion Effect |
|--------|---------------|
| "Gentle motion, cinematic camera pan, atmospheric" | Camera movement |
| "Flowing water, leaves rustling in the wind, peaceful" | Nature animation |
| "Slow zoom in, dramatic reveal, cinematic lighting" | Zoom effect |
| "Character breathing gently, subtle movement, portrait" | Portrait animation |
## 📊 Performance
| Hardware | Resolution | Frames | Steps | Time |
|----------|------------|--------|-------|------|
| RTX 4090 (24GB) | 480p | 49 | 50 | ~2-3 min |
| A100 (80GB) | 480p | 49 | 50 | ~1-2 min |
| CPU (16+ cores) | N/A | Mock | N/A | ~20-30 sec |
## 📝 Notes
- **GPU Required for Real Inference:** The 8.3B-parameter model requires ~16GB of VRAM for FP16 inference. Without a GPU, the system runs in mock mode.
- **Disk Space:** The full T2V model weights are approximately 13GB. An additional I2V variant would add roughly another 13GB.
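The ~16GB VRAM figure is in line with a weights-only estimate of 8.3 billion parameters at 2 bytes each. The sketch below shows that arithmetic. It ignores activations, VAE decode buffers, and framework overhead, so real usage will differ, and on-disk size can also differ depending on which components are counted and how they are stored.

```python
def weights_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


PARAMS = 8.3e9  # parameter count stated in this README

print(f"FP16 weights: {weights_gb(PARAMS, 2):.1f} GB")
print(f"FP32 weights: {weights_gb(PARAMS, 4):.1f} GB")
```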
## πŸ“œ License
This project is licensed under **Apache 2.0**.
<p align="center">
<strong>⚔️ LEGION VIDEO GENERATION</strong><br>
Built with ❤️ for the open-source AI community
</p>