WAN 2.5 FP8 Image-to-Video LoRA Collection

This repository contains Low-Rank Adaptation (LoRA) models for the WAN 2.5 image-to-video generation system in FP8 precision. These LoRAs provide specialized enhancements for camera control, lighting, motion, and quality in videos generated from static images.

Model Description

WAN 2.5 Image-to-Video LoRAs are adapter models that enhance the base WAN 2.5 model for animating static images into dynamic videos. They provide specialized capabilities:

  • Camera Control LoRAs: Precise control over camera movements, angles, and transitions in animated scenes
  • Motion Control LoRAs: Natural object and character motion within the scene
  • Lighting LoRAs: Enhanced lighting conditions, cinematic effects, and mood adjustment
  • Quality Enhancement LoRAs: Improved temporal coherence, detail preservation, and visual fidelity
  • Style LoRAs: Artistic motion styles and aesthetic modifications

FP8 Precision: These LoRAs use the 8-bit floating-point format, which provides:

  • ~50% reduced memory usage compared to FP16
  • Faster inference with minimal quality loss
  • Efficient deployment on consumer GPUs

Repository Contents

Current Status: Repository structure prepared, model files pending

wan25-fp8-loras/
└── loras/
    └── wan/
        ├── camera/          # Camera control LoRAs (pending)
        ├── lighting/        # Lighting enhancement LoRAs (pending)
        ├── quality/         # Quality improvement LoRAs (pending)
        └── style/           # Style transfer LoRAs (pending)

Expected File Structure (when populated):

loras/wan/
├── camera/
│   ├── camera_pan.safetensors          (~50-200 MB)
│   ├── camera_zoom.safetensors         (~50-200 MB)
│   └── camera_orbit.safetensors        (~50-200 MB)
├── lighting/
│   ├── lighting_cinematic.safetensors  (~50-200 MB)
│   ├── lighting_natural.safetensors    (~50-200 MB)
│   └── lighting_dramatic.safetensors   (~50-200 MB)
├── quality/
│   ├── quality_detail.safetensors      (~50-200 MB)
│   └── quality_coherence.safetensors   (~50-200 MB)
└── style/
    ├── style_anime.safetensors         (~50-200 MB)
    └── style_realistic.safetensors     (~50-200 MB)

Total Estimated Size: 0.5-2 GB (depending on the number of LoRAs)

Hardware Requirements

Minimum Requirements

  • VRAM: 12 GB (for base WAN model + 1-2 LoRAs)
  • RAM: 16 GB system memory
  • Disk Space: 2 GB for LoRA collection
  • GPU: NVIDIA RTX 3060 (12GB) or equivalent

Recommended Requirements

  • VRAM: 16-24 GB (for multiple LoRAs simultaneously)
  • RAM: 32 GB system memory
  • Disk Space: 5 GB (for experimentation and caching)
  • GPU: NVIDIA RTX 4070 Ti / RTX 4080 or equivalent

Memory Usage by LoRA Count

  • 1 LoRA: +200-400 MB VRAM
  • 2 LoRAs: +400-800 MB VRAM
  • 3+ LoRAs: +600-1200 MB VRAM

Usage Examples

Basic LoRA Loading with Diffusers

from diffusers import DiffusionPipeline
import torch

# Load base WAN 2.5 model
pipe = DiffusionPipeline.from_pretrained(
    "E:/huggingface/wan25-base",  # Path to base WAN model
    torch_dtype=torch.float8_e4m3fn,
    variant="fp8"
)

# Load camera control LoRA
pipe.load_lora_weights(
    "E:/huggingface/wan25-fp8-loras/loras/wan/camera",
    weight_name="camera_pan.safetensors",
    adapter_name="camera_pan"
)

# Set LoRA scale (0.0 to 1.0)
pipe.set_adapters(["camera_pan"], adapter_weights=[0.8])

# Generate video with camera pan effect (image-to-video: supply a source image)
from diffusers.utils import load_image

image = load_image("input_image.png")  # static image to animate (example path)
prompt = "A sweeping pan across a mountain landscape at sunset"
video = pipe(
    image=image,
    prompt=prompt,
    num_frames=120,
    height=720,
    width=1280,
    guidance_scale=7.5,
    num_inference_steps=50
).frames[0]

# Save video
import imageio
imageio.mimsave("output_video.mp4", video, fps=24)

Multiple LoRA Combination

# Load multiple LoRAs for combined effects
pipe.load_lora_weights(
    "E:/huggingface/wan25-fp8-loras/loras/wan/camera",
    weight_name="camera_pan.safetensors",
    adapter_name="camera"
)

pipe.load_lora_weights(
    "E:/huggingface/wan25-fp8-loras/loras/wan/lighting",
    weight_name="lighting_cinematic.safetensors",
    adapter_name="lighting"
)

pipe.load_lora_weights(
    "E:/huggingface/wan25-fp8-loras/loras/wan/quality",
    weight_name="quality_detail.safetensors",
    adapter_name="quality"
)

# Set weights for each LoRA
pipe.set_adapters(
    ["camera", "lighting", "quality"],
    adapter_weights=[0.8, 0.7, 0.6]
)

# Generate with combined effects
prompt = "Cinematic camera pan through a detailed futuristic cityscape"
video = pipe(prompt=prompt, num_frames=120).frames[0]

Dynamic LoRA Weight Adjustment

# Gradually increase LoRA effect during generation
def generate_with_dynamic_lora(pipe, prompt, lora_schedule, adapter_name="camera_pan"):
    """
    lora_schedule: dict mapping (start, end) frame ranges to LoRA weights
    Example: {(0, 40): 0.3, (40, 80): 0.7, (80, 120): 0.5}
    """
    frames = []
    for frame_range, weight in sorted(lora_schedule.items()):
        start, end = frame_range
        pipe.set_adapters([adapter_name], adapter_weights=[weight])

        frame_chunk = pipe(
            prompt=prompt,
            num_frames=end - start,
            guidance_scale=7.5
        ).frames[0]

        frames.extend(frame_chunk)

    return frames

# Example: Start subtle, peak in middle, fade out
schedule = {
    (0, 40): 0.3,    # Subtle effect
    (40, 80): 0.8,   # Strong effect
    (80, 120): 0.4   # Fade out
}

video = generate_with_dynamic_lora(pipe, "Mountain landscape pan", schedule)
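Since the schedule dict drives generation chunk by chunk, it is worth validating before spending GPU time: ranges should be contiguous, non-overlapping, and weights should stay in the 0.0-1.0 scale. A sketch; `validate_schedule` is a hypothetical helper, not part of any library:

```python
# Sanity-check a lora_schedule before generation: ranges must be contiguous,
# non-overlapping, and weights must stay in [0.0, 1.0].
def validate_schedule(schedule: dict, total_frames: int) -> bool:
    cursor = 0
    for (start, end), weight in sorted(schedule.items()):
        if start != cursor or end <= start:
            return False                 # gap, overlap, or empty range
        if not 0.0 <= weight <= 1.0:
            return False                 # weight outside valid LoRA scale
        cursor = end
    return cursor == total_frames        # schedule must cover every frame

print(validate_schedule({(0, 40): 0.3, (40, 80): 0.8, (80, 120): 0.4}, 120))  # True
print(validate_schedule({(0, 40): 0.3, (50, 120): 0.8}, 120))                 # False: gap
```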

Model Specifications

LoRA Architecture

  • Format: SafeTensors (secure, efficient)
  • Precision: FP8 (8-bit floating point)
  • Base Model: WAN 2.5 video generation model
  • Rank: Typically 8-64 (determines LoRA capacity)
  • Target Modules: Cross-attention and temporal attention layers

LoRA Categories

Camera Control

  • Purpose: Precise camera movement and positioning
  • Types: Pan, tilt, zoom, orbit, dolly, tracking
  • Weight Range: 0.5-1.0 (higher for stronger camera effects)

Lighting Enhancement

  • Purpose: Cinematic and atmospheric lighting control
  • Types: Natural, dramatic, cinematic, studio, ambient
  • Weight Range: 0.4-0.8 (balanced for natural appearance)

Quality Improvement

  • Purpose: Enhanced detail, coherence, and visual fidelity
  • Types: Detail enhancement, temporal coherence, artifact reduction
  • Weight Range: 0.3-0.7 (subtle improvements)

Style Transfer

  • Purpose: Artistic style and aesthetic modifications
  • Types: Anime, realistic, painterly, sci-fi, fantasy
  • Weight Range: 0.6-1.0 (depends on desired style strength)

Technical Details

  • Training: Fine-tuned on specific video datasets
  • Compatibility: WAN 2.5 base model required
  • Inference: ~5-20% overhead per LoRA
  • Memory: ~100-200 MB per LoRA in FP8

Performance Tips and Optimization

LoRA Weight Tuning

  1. Start Low: Begin with weights around 0.5 and adjust upward
  2. Test Combinations: Some LoRAs may conflict at high weights
  3. Per-Category Guidelines:
    • Camera: 0.7-1.0 for strong movements
    • Lighting: 0.5-0.8 for natural appearance
    • Quality: 0.4-0.7 for subtle enhancement
    • Style: 0.6-1.0 depending on desired intensity

Memory Optimization

# Enable memory-efficient attention (requires xformers)
pipe.enable_xformers_memory_efficient_attention()

# Offload whole model components to CPU when idle (moderate savings)
pipe.enable_model_cpu_offload()

# Or, for very limited VRAM, offload layer by layer (slowest, lightest);
# use one offload mode at a time, not both
pipe.enable_sequential_cpu_offload()

Inference Speed

  • Single LoRA: ~5-10% slowdown vs base model
  • 2-3 LoRAs: ~10-20% slowdown
  • 4+ LoRAs: ~20-30% slowdown
  • Tip: Merge compatible LoRAs offline for faster inference

Quality vs Speed Trade-offs

# Faster inference (reduced quality)
video = pipe(
    prompt=prompt,
    num_inference_steps=30,  # Reduced from 50
    guidance_scale=6.0       # Reduced from 7.5
)

# Higher quality (slower)
video = pipe(
    prompt=prompt,
    num_inference_steps=75,  # Increased
    guidance_scale=8.5       # Increased
)

LoRA Management Best Practices

  1. Unload Unused LoRAs: Free memory between generations

    pipe.unload_lora_weights()
    
  2. Cache Frequently Used: Keep common LoRAs loaded

  3. Test Individually: Verify each LoRA before combining

  4. Document Weights: Track successful weight combinations

  5. Version Control: Note which LoRAs work well together
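For point 4, a minimal way to document weights is to persist successful combinations as JSON presets that can be reloaded later. A sketch; the file name and helper functions are illustrative, not part of any library:

```python
# Persist a successful adapter/weight combination as a reusable JSON preset.
import json

def save_preset(path: str, adapters: list, weights: list) -> None:
    with open(path, "w") as f:
        json.dump({"adapters": adapters, "weights": weights}, f, indent=2)

def load_preset(path: str):
    with open(path) as f:
        p = json.load(f)
    return p["adapters"], p["weights"]

save_preset("cinematic_pan.json", ["camera", "lighting"], [0.8, 0.7])
adapters, weights = load_preset("cinematic_pan.json")
# then: pipe.set_adapters(adapters, adapter_weights=weights)
```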

License

License Type: Other (WAN-specific license)

This repository contains LoRA adapters for the WAN 2.5 model. Usage is governed by the WAN model license terms.

Usage Restrictions:

  • Intended for research and creative applications
  • Commercial use may require separate licensing
  • Refer to base WAN 2.5 model license for complete terms
  • LoRAs inherit base model usage restrictions

Attribution: When using these LoRAs, please credit:

  • Base WAN 2.5 model developers
  • LoRA training contributors (if applicable)
  • This repository for organization and documentation

Citation

If you use these WAN 2.5 LoRAs in your research or projects, please cite:

@software{wan25_fp8_loras,
  title={WAN 2.5 FP8 LoRA Collection},
  author={WAN Contributors},
  year={2025},
  url={https://huggingface.co/wan25-fp8-loras},
  note={Low-Rank Adaptation models for WAN 2.5 video generation}
}

For the base WAN model, cite:

@software{wan25,
  title={WAN 2.5: Video Generation Model},
  author={WAN Research Team},
  year={2025},
  url={https://huggingface.co/WAN/wan25}
}

Additional Resources

Related Repositories

  • E:/huggingface/wan25-base - Base WAN 2.5 model
  • E:/huggingface/wan25-vae - WAN 2.5 VAE components
  • E:/huggingface/flux-dev-fp8 - FLUX.1 image generation models

Tutorials and Examples

  • Video Generation Pipeline: See base WAN documentation
  • LoRA Fine-tuning: Guide for creating custom LoRAs
  • Advanced Techniques: Multi-LoRA composition strategies
  • Troubleshooting: Common issues and solutions

Changelog

Version 1.5 (2025-10-28)

  • CRITICAL FIX: Corrected pipeline_tag to image-to-video (accurate for I2V LoRAs)
  • Updated title and description to emphasize image-to-video capability
  • Added motion control LoRA category for I2V-specific functionality
  • Refined tags: removed text-to-video from previous version, kept image-to-video, lora, video-generation
  • Enhanced model description with I2V-specific features
  • Aligned metadata with image-to-video pipeline classification

Version 1.4 (2025-10-28)

  • Corrected pipeline_tag metadata (image-to-video β†’ text-to-video)
  • Updated tags to reflect primary WAN capability (text-to-video, image-generation)
  • Aligned metadata with WAN 2.5 base model classification

Version 1.3 (2025-10-14)

  • Fixed pipeline_tag metadata (text-to-video β†’ image-to-video)
  • Updated tags for better discoverability (added lora, video-generation)
  • Corrected repository classification for image-to-video task

Version 1.2 (2025-10-14)

  • Updated README documentation and metadata
  • Enhanced usage examples and specifications
  • Improved organization and clarity

Version 1.0 (2025-10-13)

  • Initial repository structure created
  • Documentation and usage examples provided
  • Organized subdirectories for LoRA categories
  • Prepared for model file deployment

Planned Additions

  • Camera control LoRA suite
  • Lighting enhancement LoRAs
  • Quality improvement adapters
  • Style transfer collections
  • Community-contributed LoRAs

Contact and Support

For questions, issues, or contributions:

  • Repository Issues: [Create an issue]
  • Email: [Contact information]
  • Community: [Discord/Forum links]
  • Contributions: Pull requests welcome for new LoRAs

Repository Status: Structure prepared, awaiting model files

Last Updated: 2025-10-28

Maintained By: WAN LoRA Collection Team
