	---
	license: apache-2.0
	library_name: diffusers
	tags:
	- text-to-video
	- image-to-video
	- video-generation
	- diffusers
	pipeline_tag: text-to-video
	base_model: deathlegionteam/LEGION-Video-Gen
	inference: true
	widget:
	- text: "A serene mountain lake at sunset with colorful clouds reflecting on the water"
	---

	# ⚔️ LEGION Video Generation

	State-of-the-art video generation with 8.3B parameters

	<p align="center">
	<img src="https://img.shields.io/badge/Parameters-8.3B-blue" alt="Parameters">
	<img src="https://img.shields.io/badge/License-Apache%202.0-green" alt="License">
	<img src="https://img.shields.io/badge/VRAM-16GB%2B-red" alt="VRAM">
	<img src="https://img.shields.io/badge/Text--to--Video-%E2%9C%85-brightgreen" alt="T2V">
	<img src="https://img.shields.io/badge/Image--to--Video-%E2%9C%85-brightgreen" alt="I2V">
	</p>

	## Model Description

	LEGION Video Generation is a production-ready video generation system with 8.3 billion parameters. It supports text-to-video (T2V) and image-to-video (I2V) generation, with temporal enhancement and a configurable QWatermark system.

	### Architecture

	| Component | Description |
	|-----------|-------------|
	| Transformer | 54-layer 3D Diffusion Transformer |
	| VAE | 3D causal VAE with 32-channel latent space |
	| Text Encoder | Qwen2.5-VL (7B) + T5 Encoder ensemble for rich text understanding |
	| Scheduler | Flow Matching Euler Discrete Scheduler with shifting |
	| Parameters | 8.3 Billion total |
	| Precision | FP16 (inference) |
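
To make the "Flow Matching Euler Discrete Scheduler with shifting" row more concrete, here is a minimal pure-Python sketch of the idea. It assumes the scheduler follows the convention used by `FlowMatchEulerDiscreteScheduler` in diffusers, where a `shift` factor warps each noise level as `shift * sigma / (1 + (shift - 1) * sigma)`. The helper names, the simplified 1.0-to-0.0 schedule, and the value `shift=3.0` are illustrative assumptions, not values taken from this model's configuration.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Warp a noise level in [0, 1]; shift > 1 keeps more steps at high noise."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def shifted_schedule(num_steps: int, shift: float) -> list[float]:
    """Noise levels from 1.0 (pure noise) down to 0.0 (clean sample), after shifting."""
    uniform = [1.0 - i / num_steps for i in range(num_steps + 1)]
    return [shift_sigma(s, shift) for s in uniform]


def euler_step(x: float, velocity: float, sigma: float, sigma_next: float) -> float:
    """One Euler update along the velocity predicted by the model."""
    return x + (sigma_next - sigma) * velocity


# Hypothetical settings, for illustration only.
sigmas = shifted_schedule(num_steps=4, shift=3.0)
print(sigmas)
```

With a shift above 1, the schedule spends more of its steps near the noisy end, which is the usual motivation for shifting when sampling large video latents.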

	### Key Features

	- 🎬 Text-to-Video — Generate cinematic videos from any text prompt
	- 🖼️ Image-to-Video — Animate still images with controlled motion
	- 💧 QWatermark System — Configurable quality assurance watermark overlay
	- 🌐 Web UI — Gradio frontend with dark theme and FastAPI backend
	- 📡 REST API — Programmatic access via HTTP endpoints

	## Intended Use

	### Direct Use

	- Video Generation: Create high-quality videos from text descriptions
	- Content Creation: Generate video assets for social media, marketing, and creative projects
	- Animation: Animate still images with natural motion
	- Prototyping: Rapid video ideation for film, game, and design workflows

	### Out-of-Scope Use

	- Generating deceptive or misleading video content (deepfakes)
	- Creating violent, hateful, or otherwise harmful content
	- Misrepresenting generated content as authentic footage
	- Bypassing content safety systems

	## How to Get Started

	### Installation

	```bash
	# Clone the repository
	git clone https://huggingface.co/deathlegionteam/LEGION-Video-Gen
	cd LEGION-Video-Gen

	# Install dependencies
	pip install -r requirements.txt
	```

	### Basic Inference

	```python
	import imageio
	import numpy as np
	import torch
	from diffusers import DiffusionPipeline

	# Load the pipeline in half precision
	pipe = DiffusionPipeline.from_pretrained(
	    "deathlegionteam/LEGION-Video-Gen",
	    torch_dtype=torch.float16,
	)

	# Enable memory optimizations. enable_model_cpu_offload() moves each
	# component to the GPU only while it runs, so the pipeline is not moved
	# to "cuda" manually beforehand.
	pipe.vae.enable_tiling()
	pipe.enable_attention_slicing()
	pipe.enable_model_cpu_offload()

	# Generate a video
	video_frames = pipe(
	    prompt="A serene mountain lake at sunset with colorful clouds reflecting on the water, cinematic quality",
	    negative_prompt="warped, distorted, flickering, jittery, low quality, blurry, artifacts",
	    num_frames=49,
	    width=480,
	    height=480,
	    num_inference_steps=50,
	    guidance_scale=6.0,
	).frames[0]

	# Save as MP4 (imageio expects arrays, so convert any PIL frames first)
	frames = [np.asarray(frame) for frame in video_frames]
	imageio.mimsave("output.mp4", frames, fps=16, codec="libx264")
	```

	### Using the LEGION Generator Wrapper

	```python
	from inference import LegionVideoGenerator

	generator = LegionVideoGenerator()
	video_path = generator.generate_from_text(
	    prompt="A cyberpunk city street at night with neon lights reflecting on wet pavement",
	    num_frames=49,
	    width=480,
	    height=480,
	    num_inference_steps=50,
	    guidance_scale=6.0,
	    watermark_strength=0.3,
	)
	print(f"Video saved to: {video_path}")
	```

	## QWatermark System

	The QWatermark (Quality Watermark) system overlays a configurable quality-assurance marker on generated video. It exposes the following parameters:

	| Parameter | Description | Default or range |
	|-----------|-------------|------------------|
	| Text | Watermark text | "LEGION" |
	| Position | Placement on frame | bottom-right |
	| Font Size | Text size | 36 |
	| Opacity | Transparency | 0.3 |
	| Strength | Overall intensity | 0.0 (disabled) to 1.0 (full) |
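
The parameters in this table could be grouped into a small validated settings object. The sketch below is illustrative only: `QWatermarkConfig` and its field names are hypothetical and are not the repository's actual API. The defaults are copied from the table, except for `strength`: the table gives only its range, so this sketch assumes 0.0 (disabled).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QWatermarkConfig:
    """Hypothetical container for the watermark settings listed in the table."""

    text: str = "LEGION"
    position: str = "bottom-right"
    font_size: int = 36
    opacity: float = 0.3
    strength: float = 0.0  # assumption: only the 0.0-1.0 range is documented

    def __post_init__(self) -> None:
        # Enforce the documented strength range: 0.0 (disabled) to 1.0 (full).
        if not 0.0 <= self.strength <= 1.0:
            raise ValueError(f"strength must be in [0.0, 1.0], got {self.strength}")

    @property
    def enabled(self) -> bool:
        """A strength of exactly 0.0 means the watermark is disabled."""
        return self.strength > 0.0


default_config = QWatermarkConfig()
visible_config = QWatermarkConfig(strength=0.3)
print(default_config.enabled, visible_config.enabled)
```

Validating the range at construction time means an out-of-range strength fails early, before any frames are generated.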

	## Limitations

	1. GPU Required: Real inference requires a GPU with at least 16 GB of VRAM. The CPU fallback produces mock/test patterns only.
	2. Resolution: Optimized for 480p. Higher resolutions (720p+) require more VRAM.
	3. Video Length: Generates up to 129 frames (~8 seconds at 16 FPS).
	4. Content Quality: Results vary with prompt quality; complex scenes may show artifacts.
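
The VRAM and video-length figures above can be sanity-checked with simple arithmetic. This is a rough sketch that assumes all 8.3 billion parameters are held in FP16 (2 bytes each) and ignores activations, latents, and framework overhead. The helper names are illustrative.

```python
def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate size of the model weights in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


def clip_seconds(num_frames: int, fps: int) -> float:
    """Playback duration of a clip with the given frame count and frame rate."""
    return num_frames / fps


fp16_weights_gb = weight_memory_gb(8_300_000_000)  # FP16 = 2 bytes per parameter
max_clip = clip_seconds(129, 16)      # longest supported clip
example_clip = clip_seconds(49, 16)   # the 49-frame setting used in the examples

print(f"{fp16_weights_gb:.1f} GB, {max_clip:.2f} s, {example_clip:.2f} s")
```

Under these assumptions the weights alone come to about 16.6 GB, close to the stated 16 GB minimum. That is consistent with the inference example enabling CPU offload and VAE tiling. The 129-frame limit works out to just over 8 seconds at 16 FPS, and the 49-frame examples to about 3 seconds.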

	## License

	This model is released under the Apache 2.0 License.

	## Contact

	- Organization: [deathlegionteam/LEGION-Video-Gen](https://huggingface.co/deathlegionteam/LEGION-Video-Gen)

	<p align="center">
	<strong>⚔️ LEGION VIDEO GENERATION</strong>
	</p>