---
license: apache-2.0
library_name: diffusers
tags:
- text-to-video
- image-to-video
- video-generation
- diffusers
pipeline_tag: text-to-video
base_model: deathlegionteam/LEGION-Video-Gen
inference: true
widget:
- text: "A serene mountain lake at sunset with colorful clouds reflecting on the water"
---
# ⚔️ LEGION Video Generation
**State-of-the-art video generation with 8.3B parameters**
<p align="center">
<img src="https://img.shields.io/badge/Parameters-8.3B-blue" alt="Parameters">
<img src="https://img.shields.io/badge/License-Apache%202.0-green" alt="License">
<img src="https://img.shields.io/badge/VRAM-16GB%2B-red" alt="VRAM">
<img src="https://img.shields.io/badge/Text--to--Video-%E2%9C%85-brightgreen" alt="T2V">
<img src="https://img.shields.io/badge/Image--to--Video-%E2%9C%85-brightgreen" alt="I2V">
</p>
## Model Description
LEGION Video Generation is a production-ready video generation system with **8.3 billion parameters**. It supports text-to-video (T2V) and image-to-video (I2V) generation with temporal enhancement and a configurable QWatermark system.
### Architecture
| Component | Description |
|-----------|-------------|
| **Transformer** | 54-layer 3D Diffusion Transformer |
| **VAE** | 3D causal VAE with 32-channel latent space |
| **Text Encoder** | Qwen2.5-VL (7B) + T5 Encoder ensemble for rich text understanding |
| **Scheduler** | Flow Matching Euler Discrete Scheduler with shifting |
| **Parameters** | 8.3 Billion total |
| **Precision** | FP16 (inference) |
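
The scheduler row refers to timestep shifting. As a rough illustration, the sketch below applies the shift formula used by flow-matching Euler schedulers (for example `FlowMatchEulerDiscreteScheduler` in diffusers) to a simplified linear sigma schedule and then takes plain Euler steps. The `shift` value of 3.0, the function names, and the constant toy velocity are assumptions for illustration; this card does not state the shift LEGION actually uses.

```python
def shifted_sigmas(num_steps: int, shift: float = 3.0) -> list[float]:
    """Linear sigmas from 1.0 downward, warped by the shift factor, plus a final 0.0."""
    sigmas = [1.0 - i / num_steps for i in range(num_steps)]
    # A shift above 1 keeps sigmas higher for longer, spending more steps at high noise.
    shifted = [shift * s / (1.0 + (shift - 1.0) * s) for s in sigmas]
    return shifted + [0.0]


def euler_step(x: float, velocity: float, sigma: float, sigma_next: float) -> float:
    """One Euler step: move x along the predicted velocity by the sigma difference."""
    return x + (sigma_next - sigma) * velocity


sigmas = shifted_sigmas(num_steps=4, shift=3.0)
x = 1.0         # toy scalar standing in for a latent tensor
velocity = 0.5  # toy constant standing in for the transformer's prediction
for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
    x = euler_step(x, velocity, sigma, sigma_next)
```

In a real pipeline the velocity comes from the transformer at every step and `x` is a latent tensor; the scalar here only makes the update rule visible.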
### Key Features
- **🎬 Text-to-Video** β€” Generate cinematic videos from any text prompt
- **πŸ–ΌοΈ Image-to-Video** β€” Animate still images with controlled motion
- **πŸ’§ QWatermark System** β€” Configurable quality assurance watermark overlay
- **🌐 Web UI** β€” Gradio frontend with dark theme and FastAPI backend
- **πŸ“‘ REST API** β€” Programmatic access via HTTP endpoints
## Intended Use
### Direct Use
- **Video Generation**: Create high-quality videos from text descriptions
- **Content Creation**: Generate video assets for social media, marketing, and creative projects
- **Animation**: Animate still images with natural motion
- **Prototyping**: Rapid video ideation for film, game, and design workflows
### Out-of-Scope Use
- Generating deceptive or misleading video content (deepfakes)
- Creating violent, hateful, or otherwise harmful content
- Misrepresenting generated content as authentic footage
- Bypassing content safety systems
## How to Get Started
### Installation
```bash
# Clone the repository
git clone https://huggingface.co/deathlegionteam/LEGION-Video-Gen
cd LEGION-Video-Gen
# Install dependencies
pip install -r requirements.txt
```
### Basic Inference
```python
import imageio
import torch
from diffusers import DiffusionPipeline

# Load the pipeline in half precision
pipe = DiffusionPipeline.from_pretrained(
    "deathlegionteam/LEGION-Video-Gen",
    torch_dtype=torch.float16,
)

# Enable memory optimizations. Model CPU offload moves each component to the
# GPU only while it runs, so the pipeline is not moved to "cuda" up front.
pipe.vae.enable_tiling()
pipe.enable_attention_slicing()
pipe.enable_model_cpu_offload()

# Generate a video
video_frames = pipe(
    prompt="A serene mountain lake at sunset with colorful clouds reflecting on the water, cinematic quality",
    negative_prompt="warped, distorted, flickering, jittery, low quality, blurry, artifacts",
    num_frames=49,
    width=480,
    height=480,
    num_inference_steps=50,
    guidance_scale=6.0,
).frames[0]

# Save as MP4
imageio.mimsave("output.mp4", video_frames, fps=16, codec="libx264")
```
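When changing `num_frames`, it may help to note that both frame counts mentioned in this card (49 and 129) have the form 4k + 1, a pattern common to video models whose causal VAE compresses time by a factor of 4. Whether LEGION enforces this is an assumption, and so is the hypothetical helper `snap_num_frames` below, which rounds a requested count to the nearest value of that form within the documented 129-frame limit.

```python
def snap_num_frames(requested: int, temporal_factor: int = 4, max_frames: int = 129) -> int:
    """Round to the nearest count of the form temporal_factor * k + 1.

    The result is clamped to [1, max_frames]; max_frames should itself have that form.
    The factor of 4 is an assumption, not a documented property of the model.
    """
    requested = max(1, min(requested, max_frames))
    k = round((requested - 1) / temporal_factor)
    return min(temporal_factor * k + 1, max_frames)


num_frames = snap_num_frames(50)  # 49
```

The returned value can then be passed as `num_frames` to the pipeline call above.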
### Using the LEGION Generator Wrapper
```python
from inference import LegionVideoGenerator

generator = LegionVideoGenerator()

video_path = generator.generate_from_text(
    prompt="A cyberpunk city street at night with neon lights reflecting on wet pavement",
    num_frames=49,
    width=480,
    height=480,
    num_inference_steps=50,
    guidance_scale=6.0,
    watermark_strength=0.3,
)
print(f"Video saved to: {video_path}")
```
## QWatermark System
The QWatermark (Quality Watermark) system imprints a configurable assurance marker:
| Parameter | Description | Default / Range |
|-----------|-------------|-----------------|
| Text | Watermark text | "LEGION" |
| Position | Placement on frame | bottom-right |
| Font Size | Text size | 36 |
| Opacity | Transparency | 0.3 |
| Strength | Overall intensity | Range: 0.0 (disabled) to 1.0 (full) |
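
As a rough sketch of how these parameters could fit together, the code below models the table as a small config object and blends a single 8-bit pixel channel. The class `WatermarkConfig`, the function `blend_channel`, and the rule that the effective alpha equals opacity times strength are all assumptions for illustration; the actual QWatermark implementation may combine them differently.

```python
from dataclasses import dataclass


@dataclass
class WatermarkConfig:
    text: str = "LEGION"
    position: str = "bottom-right"
    font_size: int = 36
    opacity: float = 0.3
    # The card gives only a range for strength; 0.3 mirrors the wrapper example.
    strength: float = 0.3

    def __post_init__(self) -> None:
        if not 0.0 <= self.strength <= 1.0:
            raise ValueError("strength must be between 0.0 and 1.0")
        if not 0.0 <= self.opacity <= 1.0:
            raise ValueError("opacity must be between 0.0 and 1.0")

    @property
    def effective_alpha(self) -> float:
        # Assumed combination rule, not confirmed by the model card.
        return self.opacity * self.strength


def blend_channel(frame_value: int, mark_value: int, config: WatermarkConfig) -> int:
    """Alpha-blend one 8-bit channel of the watermark over the frame."""
    alpha = config.effective_alpha
    return round((1.0 - alpha) * frame_value + alpha * mark_value)
```

With `strength=0.0` the frame value passes through unchanged, which matches the "disabled" end of the range in the table.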
## Limitations
1. **GPU Required**: Real inference requires a GPU with 16 GB+ of VRAM. The CPU fallback produces mock/test patterns only.
2. **Resolution**: Optimized for 480p. Higher resolutions (720p+) require more VRAM.
3. **Video Length**: Generates up to 129 frames (~8 seconds at 16 FPS).
4. **Content Quality**: Results vary with prompt quality; complex scenes may show artifacts.
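
The figures in limitations 1 and 3 follow from simple arithmetic, sketched below. The helper names are hypothetical, and the memory estimate assumes every one of the 8.3 billion parameters is stored at 2 bytes (FP16) and counts weights only, not activations or latents.

```python
def clip_seconds(num_frames: int, fps: int = 16) -> float:
    """Playback duration of a clip in seconds."""
    return num_frames / fps


def fp16_weight_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


print(clip_seconds(129))                 # 8.0625
print(clip_seconds(49))                  # 3.0625
print(round(fp16_weight_gib(8.3e9), 1))  # 15.5
```

Under these assumptions the weights alone sit close to the capacity of a 16 GB card, which is consistent with the inference example enabling CPU offload and VAE tiling.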
## License
This model is released under the **Apache 2.0** license.
## Contact
- **Organization**: [deathlegionteam/LEGION-Video-Gen](https://huggingface.co/deathlegionteam/LEGION-Video-Gen)
<p align="center">
<strong>⚔️ LEGION VIDEO GENERATION</strong>
</p>