license: apache-2.0
library_name: diffusers
tags:
- text-to-video
- image-to-video
- video-generation
- diffusers
pipeline_tag: text-to-video
base_model: deathlegionteam/LEGION-Video-Gen
inference: true
widget:
- text: "A serene mountain lake at sunset with colorful clouds reflecting on the water"
LEGION Video Generation
State-of-the-art video generation with 8.3B parameters
Model Description
LEGION Video Generation is a production-ready video generation system with 8.3 billion parameters. It supports text-to-video (T2V) and image-to-video (I2V) generation with temporal enhancement and a configurable QWatermark system.
Architecture
| Component | Description |
|---|---|
| Transformer | 54-layer 3D Diffusion Transformer |
| VAE | 3D causal VAE with 32-channel latent space |
| Text Encoder | Qwen2.5-VL (7B) + T5 Encoder ensemble for rich text understanding |
| Scheduler | Flow Matching Euler Discrete Scheduler with shifting |
| Parameters | 8.3 Billion total |
| Precision | FP16 (inference) |
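The scheduler row names flow matching with Euler steps and timestep shifting. As a rough illustration only, the sketch below shows the form these commonly take in flow-matching samplers: a shift that remaps each noise level sigma, and an Euler update that moves the latent along a predicted velocity. The function names, the `shift=3.0` value, and the constant toy velocity are assumptions made for illustration; they are not taken from this model's configuration.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Remap a noise level in [0, 1]; shift > 1 pushes sigmas toward high noise."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def make_sigmas(num_steps: int, shift: float) -> list[float]:
    """Linearly spaced sigmas from 1.0 down to 0.0, each passed through the shift."""
    return [shift_sigma(1.0 - i / num_steps, shift) for i in range(num_steps + 1)]


def euler_step(x: float, velocity: float, sigma: float, sigma_next: float) -> float:
    """One Euler update: move x along the velocity by the change in sigma."""
    return x + (sigma_next - sigma) * velocity


# Toy run on a scalar "latent" with a constant velocity.
# In a real sampler the transformer predicts the velocity at every step.
sigmas = make_sigmas(num_steps=4, shift=3.0)  # shift value is an assumption
x = 1.0
for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
    x = euler_step(x, velocity=0.5, sigma=sigma, sigma_next=sigma_next)
```

With `shift=1.0` the remapping is the identity, so the shift only changes how the steps are distributed between high and low noise, not the start and end points of the schedule.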
Key Features
- Text-to-Video: Generate cinematic videos from any text prompt
- Image-to-Video: Animate still images with controlled motion
- QWatermark System: Configurable quality assurance watermark overlay
- Web UI: Gradio frontend with dark theme and FastAPI backend
- REST API: Programmatic access via HTTP endpoints
Intended Use
Direct Use
- Video Generation: Create high-quality videos from text descriptions
- Content Creation: Generate video assets for social media, marketing, and creative projects
- Animation: Animate still images with natural motion
- Prototyping: Rapid video ideation for film, game, and design workflows
Out-of-Scope Use
- Generating deceptive or misleading video content (deepfakes)
- Creating violent, hateful, or otherwise harmful content
- Misrepresenting generated content as authentic footage
- Bypassing content safety systems
How to Get Started
Installation
# Clone the repository
git clone https://huggingface.co/deathlegionteam/LEGION-Video-Gen
cd LEGION-Video-Gen
# Install dependencies
pip install -r requirements.txt
Basic Inference
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

# Load the pipeline in half precision
pipe = DiffusionPipeline.from_pretrained(
    "deathlegionteam/LEGION-Video-Gen",
    torch_dtype=torch.float16,
)

# Enable memory optimizations. Model CPU offload manages device placement
# itself, so the pipeline is not moved to "cuda" manually beforehand.
pipe.vae.enable_tiling()
pipe.enable_attention_slicing()
pipe.enable_model_cpu_offload()

# Generate a video
video_frames = pipe(
    prompt="A serene mountain lake at sunset with colorful clouds reflecting on the water, cinematic quality",
    negative_prompt="warped, distorted, flickering, jittery, low quality, blurry, artifacts",
    num_frames=49,
    width=480,
    height=480,
    num_inference_steps=50,
    guidance_scale=6.0,
).frames[0]

# Save as MP4 (export_to_video accepts PIL images or NumPy arrays)
export_to_video(video_frames, "output.mp4", fps=16)
Using the LEGION Generator Wrapper
from inference import LegionVideoGenerator

generator = LegionVideoGenerator()

video_path = generator.generate_from_text(
    prompt="A cyberpunk city street at night with neon lights reflecting on wet pavement",
    num_frames=49,
    width=480,
    height=480,
    num_inference_steps=50,
    guidance_scale=6.0,
    watermark_strength=0.3,
)
print(f"Video saved to: {video_path}")
QWatermark System
The QWatermark (Quality Watermark) system overlays a configurable quality-assurance watermark on the generated video. It exposes the following parameters:
| Parameter | Description | Default / Range |
|---|---|---|
| Text | Watermark text | "LEGION" |
| Position | Placement on frame | bottom-right |
| Font Size | Text size | 36 |
| Opacity | Transparency | 0.3 |
| Strength | Overall intensity | 0.0 (disabled) - 1.0 (full) |
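To make the table concrete, here is a minimal sketch of how these parameters could be held in a config object and turned into a blend factor. It is not the repository's implementation: the class and function names are hypothetical, the rule that strength scales opacity is an assumption, and the table gives only a range for strength, so the `1.0` used here is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class WatermarkConfig:
    """Values copied from the parameter table; field names are hypothetical."""
    text: str = "LEGION"
    position: str = "bottom-right"
    font_size: int = 36
    opacity: float = 0.3
    strength: float = 1.0  # table states only the 0.0-1.0 range; 1.0 is assumed


def effective_alpha(cfg: WatermarkConfig) -> float:
    """Assumed rule: strength scales opacity, so strength 0.0 disables the mark."""
    strength = min(max(cfg.strength, 0.0), 1.0)  # clamp to the documented range
    return cfg.opacity * strength


def blend_channel(frame_value: int, mark_value: int, alpha: float) -> int:
    """Standard alpha blend of a single 8-bit channel value."""
    return round((1.0 - alpha) * frame_value + alpha * mark_value)


def bottom_right_origin(frame_w: int, frame_h: int,
                        mark_w: int, mark_h: int, margin: int = 16) -> tuple[int, int]:
    """Top-left corner that places the mark in the bottom-right (margin is assumed)."""
    return frame_w - mark_w - margin, frame_h - mark_h - margin


cfg = WatermarkConfig(strength=0.5)
alpha = effective_alpha(cfg)  # 0.3 * 0.5 = 0.15 under the assumed rule
origin = bottom_right_origin(480, 480, mark_w=120, mark_h=36)
```

Under this assumed rule, setting strength to 0.0 leaves every frame value unchanged, which matches the "disabled" end of the range in the table.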
Limitations
- GPU Required: Real inference requires a GPU with at least 16 GB of VRAM. The CPU fallback produces mock/test patterns only.
- Resolution: Optimized for 480p. Higher resolutions (720p+) require more VRAM.
- Video Length: Generates up to 129 frames (~8 seconds at 16 FPS).
- Content Quality: Results vary with prompt quality; complex scenes may show artifacts.
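The frame limit and clip duration above are related by simple arithmetic (duration = frames / FPS). The helper below is a small sketch for checking a request before running the pipeline; the function names are hypothetical, and the 129-frame cap and 16 FPS default are the figures quoted in this section.

```python
MAX_FRAMES = 129   # upper bound stated in the limitations
DEFAULT_FPS = 16   # frame rate used in the examples


def clip_duration_seconds(num_frames: int, fps: int = DEFAULT_FPS) -> float:
    """Length of the resulting clip in seconds."""
    if num_frames < 1 or fps < 1:
        raise ValueError("num_frames and fps must be positive")
    return num_frames / fps


def check_num_frames(num_frames: int) -> int:
    """Return num_frames unchanged, or raise if it exceeds the documented cap."""
    if not 1 <= num_frames <= MAX_FRAMES:
        raise ValueError(f"num_frames must be between 1 and {MAX_FRAMES}")
    return num_frames


print(clip_duration_seconds(check_num_frames(49)))   # 3.0625
print(clip_duration_seconds(check_num_frames(129)))  # 8.0625
```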
License
This model is released under the Apache 2.0 License.
Contact
- Organization: deathlegionteam/LEGION-Video-Gen