license: apache-2.0
library_name: diffusers
tags:
- text-to-video
- image-to-video
- video-generation
- diffusers
pipeline_tag: text-to-video
inference: true
base_model: deathlegionteam/LEGION-Video-Gen
widget:
- text: "A serene mountain lake at sunset with colorful clouds reflecting on the water"
LEGION VIDEO GENERATION: The Ultimate AI Video Engine
State-of-the-art video generation with 8.3B parameters
Text-to-Video · Image-to-Video · QWatermark System
Table of Contents
- ✨ Features
- Quick Start
- API Documentation
- 💧 QWatermark System
- 🤗 HuggingFace
- 🖥️ Project Structure
- 🎬 Example Prompts
- License
✨ Features
- 🎬 Text-to-Video Generation: Create videos from any text prompt with cinematic quality
- 🖼️ Image-to-Video Generation: Animate still images with controlled motion
- 💧 QWatermark System: Configurable semi-transparent quality assurance watermark with position, size, opacity, and text controls
- Web Application: Full Gradio UI with dark theme and FastAPI backend
- 📡 REST API: Programmatic video generation via HTTP endpoints
- 🛡️ Graceful Fallback: Mock generation mode when no GPU is available
Quick Start
Prerequisites
- GPU (Recommended): NVIDIA GPU with 16GB+ VRAM (RTX 4090, A100, H100)
- CPU (Fallback): Works with mock generation mode (test pattern videos)
- Python 3.10+
- ~30GB free disk space (model weights)
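The Python version and disk space requirements listed above can be checked before installing anything. The following is a minimal sketch and is not part of the repository: `check_prerequisites` is a hypothetical helper, and reading "~30GB" as 30 × 10^9 bytes is an assumption.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)         # "Python 3.10+" from the prerequisites list
MIN_FREE_BYTES = 30 * 10**9  # "~30GB free disk space", read as decimal GB (assumption)


def check_prerequisites(version_info=None, free_bytes=None, path="."):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    if version_info is None:
        version_info = sys.version_info
    if free_bytes is None:
        # Free space on the filesystem that holds `path`
        free_bytes = shutil.disk_usage(path).free

    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {version_info[0]}.{version_info[1]} found, 3.10+ required"
        )
    if free_bytes < MIN_FREE_BYTES:
        problems.append(
            f"only {free_bytes / 10**9:.1f} GB free, ~30 GB needed for model weights"
        )
    return problems
```

Calling `check_prerequisites()` with no arguments inspects the running interpreter and the current directory's filesystem. GPU detection is left out on purpose, since the project falls back to mock generation when no GPU is present.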
Installation
# Clone the repository
git clone https://huggingface.co/deathlegionteam/LEGION-Video-Gen
cd LEGION-Video-Gen
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
# Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
# Verify installation
python3 -c "import torch, diffusers, gradio, fastapi; print('OK')"
Quick Start: Generate Your First Video
from inference import LegionVideoGenerator

generator = LegionVideoGenerator()
video_path = generator.generate_from_text(
    prompt="A serene mountain lake at sunset with colorful clouds reflecting on the water, gentle ripples, cinematic quality",
    num_frames=49,
    width=480,
    height=480,
    num_inference_steps=50,
    guidance_scale=6.0,
    watermark_strength=0.3,
)
print(f"Video saved to: {video_path}")
Starting the Web UI
# Start the API backend
python3 backend/main.py &
# Start the Gradio frontend
python3 frontend/app.py
# Open http://localhost:8080 in your browser
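Because the backend is started in the background with `&`, the frontend may come up before the API is listening. A small readiness check can bridge that gap. This is a sketch under the assumption that the backend accepts TCP connections on `localhost:8081` once it is ready; `wait_for_port` is a hypothetical helper, not something shipped in the repository.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example (not executed here): block until the backend answers, then start the UI.
# if wait_for_port("localhost", 8081):
#     print("backend is up")
```

An open port only shows that the server process is accepting connections; it does not guarantee that model weights have finished loading.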
API Documentation
REST API Endpoints
The backend runs on port 8081 by default.
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/status | Health check with model and device info |
| POST | /api/generate/text | Generate video from text prompt |
| POST | /api/generate/image | Generate video from image + text prompt |
| GET | / | API root with endpoint listing |
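The health check in the table above can be queried with the standard library alone. The sketch below assumes the endpoint returns a JSON object; the README does not document the field names, so the parsed dictionary is returned unchanged. `status_url` and `fetch_status` are hypothetical helper names.

```python
import json
import urllib.request
from urllib.parse import urljoin

BASE_URL = "http://localhost:8081"  # default backend port stated in this README


def status_url(base_url: str = BASE_URL) -> str:
    """Build the full URL of the health-check endpoint."""
    return urljoin(base_url.rstrip("/") + "/", "api/status")


def fetch_status(base_url: str = BASE_URL, timeout: float = 5.0) -> dict:
    """GET /api/status and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(status_url(base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (requires a running backend):
# print(fetch_status())
```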
Text-to-Video Generation
import requests

response = requests.post(
    "http://localhost:8081/api/generate/text",
    json={
        "prompt": "A cyberpunk city street at night with neon lights reflecting on wet pavement",
        "negative_prompt": "warped, distorted, flickering, jittery, low quality, blurry, artifacts",
        "num_frames": 49,
        "width": 480,
        "height": 480,
        "num_inference_steps": 50,
        "guidance_scale": 6.0,
        "watermark_strength": 0.3,
    },
)
# Fail on an HTTP error instead of saving an error body as a video file
response.raise_for_status()

with open("output.mp4", "wb") as f:
    f.write(response.content)
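The request body above can be assembled by a small helper that rejects obviously bad values before a multi-minute generation starts. This is a sketch, not the backend's own validation: `build_text_payload` is a hypothetical name, its defaults simply mirror the example values, and the only range taken from this README is 0.0 to 1.0 for `watermark_strength`. The positivity checks are assumptions.

```python
def build_text_payload(
    prompt,
    negative_prompt="",
    num_frames=49,
    width=480,
    height=480,
    num_inference_steps=50,
    guidance_scale=6.0,
    watermark_strength=0.3,
):
    """Return a JSON-serializable dict for POST /api/generate/text."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    # Range documented in the QWatermark table: 0.0 (disabled) to 1.0 (full)
    if not 0.0 <= watermark_strength <= 1.0:
        raise ValueError("watermark_strength must be between 0.0 and 1.0")
    # Assumed sanity checks: sizes and counts must be positive integers
    for name, value in (
        ("num_frames", num_frames),
        ("width", width),
        ("height", height),
        ("num_inference_steps", num_inference_steps),
    ):
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer")

    payload = {
        "prompt": prompt,
        "num_frames": num_frames,
        "width": width,
        "height": height,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
        "watermark_strength": watermark_strength,
    }
    if negative_prompt:
        payload["negative_prompt"] = negative_prompt
    return payload
```

The returned dictionary can be passed directly as the `json=` argument of `requests.post`.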
Image-to-Video Generation
import requests

# Keep the image file open while the multipart request is sent
with open("input_image.jpg", "rb") as img:
    response = requests.post(
        "http://localhost:8081/api/generate/image",
        files={"file": img},
        data={
            "prompt": "Gentle motion, cinematic camera movement, atmospheric",
            "num_frames": 49,
            "width": 480,
            "height": 480,
            "num_inference_steps": 50,
            "guidance_scale": 6.0,
            "watermark_strength": 0.3,
        },
    )
# Fail on an HTTP error instead of saving an error body as a video file
response.raise_for_status()

with open("animated.mp4", "wb") as f:
    f.write(response.content)
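Both examples write the response body straight to disk, so an error message returned by the server could end up saved with an `.mp4` extension. A cheap guard is to check the container signature first. The sketch below assumes the backend returns a standard MP4 file, which normally begins with a box whose type field (bytes 4 to 8) is `ftyp`; `looks_like_mp4` and `save_video` are hypothetical helpers.

```python
from pathlib import Path


def looks_like_mp4(data: bytes) -> bool:
    """Heuristic check: MP4 files normally carry b'ftyp' at byte offsets 4..8."""
    return len(data) >= 8 and data[4:8] == b"ftyp"


def save_video(data: bytes, path: str) -> Path:
    """Write `data` to `path` only if it passes the MP4 signature check."""
    if not looks_like_mp4(data):
        raise ValueError("response body does not look like an MP4 file")
    out = Path(path)
    out.write_bytes(data)
    return out


# Example with the response object from the snippets above:
# save_video(response.content, "animated.mp4")
```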
💧 QWatermark System
The QWatermark (Quality Watermark) system imprints a configurable assurance marker on every generated video.
| Parameter | Description | Default |
|---|---|---|
| Text | Watermark text | "LEGION" |
| Position | Placement on frame | bottom-right |
| Font Size | Text size | 36 |
| Opacity | Transparency | 0.3 |
| Strength | Overall intensity | Range: 0.0 (disabled) to 1.0 (full) |
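The parameters in the table can be pictured as a small configuration object plus a per-pixel alpha blend. The sketch below is illustrative only and is not the project's implementation: `QWatermarkConfig`, `effective_alpha`, and `blend` are hypothetical names, the `strength` default of 0.3 is simply the value used in this README's examples, and treating strength as a linear multiplier on opacity is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QWatermarkConfig:
    text: str = "LEGION"             # default from the table
    position: str = "bottom-right"   # default from the table
    font_size: int = 36              # default from the table
    opacity: float = 0.3             # default from the table
    strength: float = 0.3            # example value; the real default is not stated

    def __post_init__(self):
        for name in ("opacity", "strength"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0.0 and 1.0")

    def effective_alpha(self) -> float:
        # Assumption: strength scales opacity linearly, so 0.0 disables the mark
        return self.opacity * self.strength


def blend(frame_value: float, mark_value: float, alpha: float) -> float:
    """Standard alpha blend of one channel value: (1 - a) * frame + a * mark."""
    return (1.0 - alpha) * frame_value + alpha * mark_value
```

With `strength=0.0` the effective alpha is zero and `blend` returns the frame value unchanged, which matches the "disabled" end of the range in the table.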
🤗 HuggingFace
- Model Repository: deathlegionteam/LEGION-Video-Gen
- Space (Live Demo): deathlegionteam/LEGION-Video-Gen-Space
Model Weights
The model is available as a complete Diffusers pipeline on HuggingFace Hub. You can load it directly using the Diffusers library:
from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained(
    "deathlegionteam/LEGION-Video-Gen",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")
pipe.vae.enable_tiling()         # decode in tiles to lower peak VRAM
pipe.enable_attention_slicing()  # trade some speed for lower attention memory

# Generate video
video_frames = pipe(
    prompt="A serene mountain lake at sunset",
    num_frames=49,
    width=480,
    height=480,
    num_inference_steps=50,
    guidance_scale=6.0,
).frames[0]
🖥️ Project Structure
/app/video_generation_pipeline_1006/
├── inference.py              # Core generation class (LegionVideoGenerator)
├── backend/
│   └── main.py               # FastAPI backend (port 8081)
├── frontend/
│   ├── app.py                # Gradio frontend (port 8080)
│   └── streamlit_app.py      # Streamlit frontend
├── models/
│   ├── t2v/                  # T2V model weights (safetensors format)
│   └── i2v/                  # I2V model directory
├── outputs/                  # Generated videos
├── requirements.txt          # Python dependencies
├── README.md                 # This file
└── .space/                   # HuggingFace Space configuration
🎬 Example Prompts
Text-to-Video
| Prompt | Style |
|---|---|
| "A serene mountain lake at sunset with colorful clouds reflecting on the water, gentle ripples, cinematic quality" | Nature |
| "A cyberpunk city street at night with neon lights reflecting on wet pavement, flying cars, cinematic, dramatic lighting" | Sci-Fi |
| "A majestic eagle soaring through misty mountain peaks, golden hour lighting, slow motion, National Geographic quality" | Wildlife |
| "An astronaut floating in space with Earth in the background, stars twinkling, cinematic, hyperrealistic" | Space |
| "A cozy medieval tavern interior with fireplace, warm lighting, people chatting, fantasy RPG aesthetic" | Fantasy |
Image-to-Video
| Prompt | Motion Effect |
|---|---|
| "Gentle motion, cinematic camera pan, atmospheric" | Camera movement |
| "Flowing water, leaves rustling in the wind, peaceful" | Nature animation |
| "Slow zoom in, dramatic reveal, cinematic lighting" | Zoom effect |
| "Character breathing gently, subtle movement, portrait" | Portrait animation |
Performance
| Hardware | Resolution | Frames | Steps | Time |
|---|---|---|---|---|
| RTX 4090 (24GB) | 480p | 49 | 50 | ~2-3 min |
| A100 (80GB) | 480p | 49 | 50 | ~1-2 min |
| CPU (16+ cores) | N/A | Mock | N/A | ~20-30 sec |
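The GPU rows of the table can be turned into a rough per-step cost, which helps when choosing a lower `num_inference_steps`. The sketch below assumes wall-clock time scales linearly with the step count and ignores fixed costs such as model loading, text encoding, and VAE decoding, so treat its output as an estimate only. Both function names are hypothetical.

```python
def seconds_per_step(minutes_range, steps):
    """Convert a (low, high) wall-clock range in minutes into seconds per denoising step."""
    low, high = minutes_range
    return (low * 60 / steps, high * 60 / steps)


def estimate_minutes(minutes_range, table_steps, new_steps):
    """Scale the table's range to another step count, assuming time is linear in steps."""
    low, high = minutes_range
    scale = new_steps / table_steps
    return (low * scale, high * scale)


# Ranges taken from the table above (49 frames, 50 steps)
rtx_4090 = seconds_per_step((2, 3), steps=50)
a100 = seconds_per_step((1, 2), steps=50)
print(f"RTX 4090: {rtx_4090[0]:.1f}-{rtx_4090[1]:.1f} s/step")
print(f"A100:     {a100[0]:.1f}-{a100[1]:.1f} s/step")
```

Under these assumptions the RTX 4090 row works out to roughly 2.4 to 3.6 seconds per step, and halving the step count to 25 would bring the estimate to about 1 to 1.5 minutes.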
Notes
- GPU Required for Real Inference: The 8.3B parameter model requires ~16GB VRAM for FP16 inference. Without a GPU, the system runs in mock mode.
- Disk Space: Full model weights (T2V) are approximately 13GB. An additional I2V variant would add another ~13GB.
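The VRAM figure in the notes above can be sanity-checked with back-of-the-envelope arithmetic: parameter count times bytes per parameter. The sketch below counts weights only, uses decimal gigabytes, and is an estimate rather than a measurement; `weight_memory_gb` is a hypothetical helper. The on-disk size quoted above depends on which components are stored and at what precision, so it need not match this number.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in decimal GB (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


PARAMS = 8.3e9  # "8.3B parameters" as stated in this README

fp16_gb = weight_memory_gb(PARAMS, 2)  # FP16 stores 2 bytes per parameter
fp32_gb = weight_memory_gb(PARAMS, 4)  # FP32 stores 4 bytes per parameter
print(f"FP16 weights: {fp16_gb:.1f} GB")
print(f"FP32 weights: {fp32_gb:.1f} GB")
```

Activations and video decoding need memory on top of the weights, which is why the Diffusers example enables VAE tiling and attention slicing.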
License
This project is licensed under Apache 2.0.
LEGION VIDEO GENERATION
Built with ❤️ for the open-source AI community