WAN 2.5 FP8 Image-to-Video LoRA Collection

This repository contains Low-Rank Adaptation (LoRA) models for the WAN 2.5 image-to-video generation system in FP8 precision. These LoRAs provide specialized enhancements for camera control, lighting, motion, and quality in videos generated from static images.

Model Description

WAN 2.5 Image-to-Video LoRAs are adapter models that enhance the base WAN 2.5 model for animating static images into dynamic videos. They provide specialized capabilities:

  • Camera Control LoRAs: Precise control over camera movements, angles, and transitions in animated scenes
  • Motion Control LoRAs: Natural object and character motion within the scene
  • Lighting LoRAs: Enhanced lighting conditions, cinematic effects, and mood adjustment
  • Quality Enhancement LoRAs: Improved temporal coherence, detail preservation, and visual fidelity
  • Style LoRAs: Artistic motion styles and aesthetic modifications

FP8 Precision: These LoRAs use the 8-bit floating-point format, which provides:

  • ~50% reduced memory usage compared to FP16
  • Faster inference with minimal quality loss
  • Efficient deployment on consumer GPUs

Repository Contents

Current Status: Repository structure prepared, model files pending

wan25-fp8-loras/
└── loras/
    └── wan/
        ├── camera/          # Camera control LoRAs (pending)
        ├── lighting/        # Lighting enhancement LoRAs (pending)
        ├── quality/         # Quality improvement LoRAs (pending)
        └── style/           # Style transfer LoRAs (pending)

Expected File Structure (when populated):

loras/wan/
├── camera/
│   ├── camera_pan.safetensors          (~50-200 MB)
│   ├── camera_zoom.safetensors         (~50-200 MB)
│   └── camera_orbit.safetensors        (~50-200 MB)
├── lighting/
│   ├── lighting_cinematic.safetensors  (~50-200 MB)
│   ├── lighting_natural.safetensors    (~50-200 MB)
│   └── lighting_dramatic.safetensors   (~50-200 MB)
├── quality/
│   ├── quality_detail.safetensors      (~50-200 MB)
│   └── quality_coherence.safetensors   (~50-200 MB)
└── style/
    ├── style_anime.safetensors         (~50-200 MB)
    └── style_realistic.safetensors     (~50-200 MB)

Total Estimated Size: 0.5-2 GB (depending on the number of LoRAs)

Hardware Requirements

Minimum Requirements

  • VRAM: 12 GB (for base WAN model + 1-2 LoRAs)
  • RAM: 16 GB system memory
  • Disk Space: 2 GB for LoRA collection
  • GPU: NVIDIA RTX 3060 (12GB) or equivalent

Recommended Requirements

  • VRAM: 16-24 GB (for multiple LoRAs simultaneously)
  • RAM: 32 GB system memory
  • Disk Space: 5 GB (for experimentation and caching)
  • GPU: NVIDIA RTX 4070 Ti / RTX 4080 or equivalent

Memory Usage by LoRA Count

  • 1 LoRA: +200-400 MB VRAM
  • 2 LoRAs: +400-800 MB VRAM
  • 3+ LoRAs: +600-1200 MB VRAM

Usage Examples

Basic LoRA Loading with Diffusers

from diffusers import DiffusionPipeline
import torch

# Load base WAN 2.5 model
pipe = DiffusionPipeline.from_pretrained(
    "E:/huggingface/wan25-base",  # Path to base WAN model
    torch_dtype=torch.float8_e4m3fn,
    variant="fp8"
)

# Load camera control LoRA
pipe.load_lora_weights(
    "E:/huggingface/wan25-fp8-loras/loras/wan/camera",
    weight_name="camera_pan.safetensors",
    adapter_name="camera_pan"
)

# Set LoRA scale (0.0 to 1.0)
pipe.set_adapters(["camera_pan"], adapter_weights=[0.8])

# Generate video with camera pan effect (image-to-video: supply a source image)
from diffusers.utils import load_image

image = load_image("input_image.png")  # static image to animate (example path)
prompt = "A sweeping pan across a mountain landscape at sunset"
video = pipe(
    image=image,
    prompt=prompt,
    num_frames=120,
    height=720,
    width=1280,
    guidance_scale=7.5,
    num_inference_steps=50
).frames[0]

# Save video
import imageio
imageio.mimsave("output_video.mp4", video, fps=24)

Multiple LoRA Combination

# Load multiple LoRAs for combined effects
pipe.load_lora_weights(
    "E:/huggingface/wan25-fp8-loras/loras/wan/camera",
    weight_name="camera_pan.safetensors",
    adapter_name="camera"
)

pipe.load_lora_weights(
    "E:/huggingface/wan25-fp8-loras/loras/wan/lighting",
    weight_name="lighting_cinematic.safetensors",
    adapter_name="lighting"
)

pipe.load_lora_weights(
    "E:/huggingface/wan25-fp8-loras/loras/wan/quality",
    weight_name="quality_detail.safetensors",
    adapter_name="quality"
)

# Set weights for each LoRA
pipe.set_adapters(
    ["camera", "lighting", "quality"],
    adapter_weights=[0.8, 0.7, 0.6]
)

# Generate with combined effects
prompt = "Cinematic camera pan through a detailed futuristic cityscape"
video = pipe(prompt=prompt, num_frames=120).frames[0]

Dynamic LoRA Weight Adjustment

# Gradually increase LoRA effect during generation
def generate_with_dynamic_lora(pipe, prompt, lora_schedule, adapter_name="camera_pan"):
    """
    lora_schedule: dict mapping (start, end) frame ranges to LoRA weights
    Example: {(0, 40): 0.3, (40, 80): 0.7, (80, 120): 0.5}
    """
    frames = []
    for frame_range, weight in sorted(lora_schedule.items()):
        start, end = frame_range
        pipe.set_adapters([adapter_name], adapter_weights=[weight])

        frame_chunk = pipe(
            prompt=prompt,
            num_frames=end - start,
            guidance_scale=7.5
        ).frames[0]

        frames.extend(frame_chunk)

    return frames

# Example: Start subtle, peak in middle, fade out
schedule = {
    (0, 40): 0.3,    # Subtle effect
    (40, 80): 0.8,   # Strong effect
    (80, 120): 0.4   # Fade out
}

video = generate_with_dynamic_lora(pipe, "Mountain landscape pan", schedule)
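Since the schedule dict drives generation chunk by chunk, it is worth validating before spending GPU time: ranges should be contiguous, non-overlapping, and weights should stay in the 0.0-1.0 scale. A sketch; `validate_schedule` is a hypothetical helper, not part of any library:

```python
# Sanity-check a lora_schedule before generation: ranges must be contiguous,
# non-overlapping, and weights must stay in [0.0, 1.0].
def validate_schedule(schedule: dict, total_frames: int) -> bool:
    cursor = 0
    for (start, end), weight in sorted(schedule.items()):
        if start != cursor or end <= start:
            return False                 # gap, overlap, or empty range
        if not 0.0 <= weight <= 1.0:
            return False                 # weight outside valid LoRA scale
        cursor = end
    return cursor == total_frames        # schedule must cover every frame

print(validate_schedule({(0, 40): 0.3, (40, 80): 0.8, (80, 120): 0.4}, 120))  # True
print(validate_schedule({(0, 40): 0.3, (50, 120): 0.8}, 120))                 # False: gap
```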

Model Specifications

LoRA Architecture

  • Format: SafeTensors (secure, efficient)
  • Precision: FP8 (8-bit floating point)
  • Base Model: WAN 2.5 video generation model
  • Rank: Typically 8-64 (determines LoRA capacity)
  • Target Modules: Cross-attention and temporal attention layers

LoRA Categories

Camera Control

  • Purpose: Precise camera movement and positioning
  • Types: Pan, tilt, zoom, orbit, dolly, tracking
  • Weight Range: 0.5-1.0 (higher for stronger camera effects)

Lighting Enhancement

  • Purpose: Cinematic and atmospheric lighting control
  • Types: Natural, dramatic, cinematic, studio, ambient
  • Weight Range: 0.4-0.8 (balanced for natural appearance)

Quality Improvement

  • Purpose: Enhanced detail, coherence, and visual fidelity
  • Types: Detail enhancement, temporal coherence, artifact reduction
  • Weight Range: 0.3-0.7 (subtle improvements)

Style Transfer

  • Purpose: Artistic style and aesthetic modifications
  • Types: Anime, realistic, painterly, sci-fi, fantasy
  • Weight Range: 0.6-1.0 (depends on desired style strength)

Technical Details

  • Training: Fine-tuned on specific video datasets
  • Compatibility: WAN 2.5 base model required
  • Inference: ~5-20% overhead per LoRA
  • Memory: ~100-200 MB per LoRA in FP8

Performance Tips and Optimization

LoRA Weight Tuning

  1. Start Low: Begin with weights around 0.5 and adjust upward
  2. Test Combinations: Some LoRAs may conflict at high weights
  3. Per-Category Guidelines:
    • Camera: 0.7-1.0 for strong movements
    • Lighting: 0.5-0.8 for natural appearance
    • Quality: 0.4-0.7 for subtle enhancement
    • Style: 0.6-1.0 depending on desired intensity

Memory Optimization

# Enable memory-efficient attention (requires xformers)
pipe.enable_xformers_memory_efficient_attention()

# Offload whole model components to CPU when idle (moderate savings)
pipe.enable_model_cpu_offload()

# Or, for very limited VRAM, offload layer by layer (slowest, lightest);
# use one offload mode at a time, not both
pipe.enable_sequential_cpu_offload()

Inference Speed

  • Single LoRA: ~5-10% slowdown vs base model
  • 2-3 LoRAs: ~10-20% slowdown
  • 4+ LoRAs: ~20-30% slowdown
  • Tip: Merge compatible LoRAs offline for faster inference

Quality vs Speed Trade-offs

# Faster inference (reduced quality)
video = pipe(
    prompt=prompt,
    num_inference_steps=30,  # Reduced from 50
    guidance_scale=6.0       # Reduced from 7.5
)

# Higher quality (slower)
video = pipe(
    prompt=prompt,
    num_inference_steps=75,  # Increased
    guidance_scale=8.5       # Increased
)

LoRA Management Best Practices

  1. Unload Unused LoRAs: Free memory between generations

    pipe.unload_lora_weights()
    
  2. Cache Frequently Used: Keep common LoRAs loaded

  3. Test Individually: Verify each LoRA before combining

  4. Document Weights: Track successful weight combinations

  5. Version Control: Note which LoRAs work well together
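For point 4, a minimal way to document weights is to persist successful combinations as JSON presets that can be reloaded later. A sketch; the file name and helper functions are illustrative, not part of any library:

```python
# Persist a successful adapter/weight combination as a reusable JSON preset.
import json

def save_preset(path: str, adapters: list, weights: list) -> None:
    with open(path, "w") as f:
        json.dump({"adapters": adapters, "weights": weights}, f, indent=2)

def load_preset(path: str):
    with open(path) as f:
        p = json.load(f)
    return p["adapters"], p["weights"]

save_preset("cinematic_pan.json", ["camera", "lighting"], [0.8, 0.7])
adapters, weights = load_preset("cinematic_pan.json")
# then: pipe.set_adapters(adapters, adapter_weights=weights)
```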

License

License Type: Other (WAN-specific license)

This repository contains LoRA adapters for the WAN 2.5 model. Usage is governed by the WAN model license terms.

Usage Restrictions:

  • Intended for research and creative applications
  • Commercial use may require separate licensing
  • Refer to base WAN 2.5 model license for complete terms
  • LoRAs inherit base model usage restrictions

Attribution: When using these LoRAs, please credit:

  • Base WAN 2.5 model developers
  • LoRA training contributors (if applicable)
  • This repository for organization and documentation

Citation

If you use these WAN 2.5 LoRAs in your research or projects, please cite:

@software{wan25_fp8_loras,
  title={WAN 2.5 FP8 LoRA Collection},
  author={WAN Contributors},
  year={2025},
  url={https://huggingface.co/wan25-fp8-loras},
  note={Low-Rank Adaptation models for WAN 2.5 video generation}
}

For the base WAN model, cite:

@software{wan25,
  title={WAN 2.5: Video Generation Model},
  author={WAN Research Team},
  year={2025},
  url={https://huggingface.co/WAN/wan25}
}

Additional Resources

Related Repositories

  • E:/huggingface/wan25-base - Base WAN 2.5 model
  • E:/huggingface/wan25-vae - WAN 2.5 VAE components
  • E:/huggingface/flux-dev-fp8 - FLUX.1 image generation models

Tutorials and Examples

  • Video Generation Pipeline: See base WAN documentation
  • LoRA Fine-tuning: Guide for creating custom LoRAs
  • Advanced Techniques: Multi-LoRA composition strategies
  • Troubleshooting: Common issues and solutions

Changelog

Version 1.5 (2025-10-28)

  • CRITICAL FIX: Corrected pipeline_tag to image-to-video (accurate for I2V LoRAs)
  • Updated title and description to emphasize image-to-video capability
  • Added motion control LoRA category for I2V-specific functionality
  • Refined tags: removed text-to-video from previous version, kept image-to-video, lora, video-generation
  • Enhanced model description with I2V-specific features
  • Aligned metadata with image-to-video pipeline classification

Version 1.4 (2025-10-28)

  • Corrected pipeline_tag metadata (image-to-video β†’ text-to-video)
  • Updated tags to reflect primary WAN capability (text-to-video, image-generation)
  • Aligned metadata with WAN 2.5 base model classification

Version 1.3 (2025-10-14)

  • Fixed pipeline_tag metadata (text-to-video β†’ image-to-video)
  • Updated tags for better discoverability (added lora, video-generation)
  • Corrected repository classification for image-to-video task

Version 1.2 (2025-10-14)

  • Updated README documentation and metadata
  • Enhanced usage examples and specifications
  • Improved organization and clarity

Version 1.0 (2025-10-13)

  • Initial repository structure created
  • Documentation and usage examples provided
  • Organized subdirectories for LoRA categories
  • Prepared for model file deployment

Planned Additions

  • Camera control LoRA suite
  • Lighting enhancement LoRAs
  • Quality improvement adapters
  • Style transfer collections
  • Community-contributed LoRAs

Contact and Support

For questions, issues, or contributions:

  • Repository Issues: [Create an issue]
  • Email: [Contact information]
  • Community: [Discord/Forum links]
  • Contributions: Pull requests welcome for new LoRAs

Repository Status: Structure prepared, awaiting model files

Last Updated: 2025-10-28

Maintained By: WAN LoRA Collection Team
