---
license: apache-2.0
tags:
- lora
- video-generation
- wan
- wan-2.1
- wan-2.2
- training
- text-to-video
- image-to-video
- diffusion-transformer
---
# Flimmer
Video LoRA training toolkit for diffusion transformer models. Built by [Alvdansen Labs](https://github.com/alvdansen).
Full pipeline from raw footage to trained LoRA checkpoint: scene detection, captioning,
dataset validation, latent pre-encoding, and training. Currently supports WAN 2.1 and
WAN 2.2 (T2V and I2V).
Early release. Building in the open.
## What it covers
- **Video ingestion** – scene detection, clip splitting, fps/resolution normalization
- **Captioning** – Gemini and Replicate backends
- **CLIP-based triage** – find clips matching a reference person or concept in large footage sets
- **Dataset validation** – catch missing captions, resolution mismatches, and format issues before spending GPU time
- **Latent pre-encoding** – VAE + T5 outputs cached to disk so training doesn't repeat encoding every epoch
- **Training** – LoRA training with checkpoint resume, W&B logging, and in-training video sampling
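The pre-encoding step amounts to a disk cache keyed on the clip: encode once, then read the cached latents on every later epoch. A minimal sketch of that idea follows; the `encode_fn`/`load_fn`/`save_fn` hooks are illustrative placeholders, not Flimmer's actual API.

```python
import hashlib
import pathlib

def cache_path(cache_dir, clip_path):
    """Derive a stable cache filename from the clip path."""
    key = hashlib.sha256(str(clip_path).encode()).hexdigest()[:16]
    return pathlib.Path(cache_dir) / f"{key}.latents"

def get_latents(clip_path, cache_dir, encode_fn, load_fn, save_fn):
    """Return cached latents if present; otherwise encode once and cache.

    encode_fn/load_fn/save_fn are hypothetical hooks standing in for the
    real VAE/T5 encode and serialization calls.
    """
    path = cache_path(cache_dir, clip_path)
    if path.exists():
        return load_fn(path)           # cache hit: skip the expensive encode
    latents = encode_fn(clip_path)     # cache miss: encode exactly once
    save_fn(latents, path)
    return latents
```

With this shape, the GPU-heavy encode runs once per clip regardless of how many epochs the training loop makes over the dataset.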
## Phased training
The standout feature. Break a training run into sequential stages, each with its own
learning rate, epoch budget, and dataset, while the LoRA checkpoint carries forward
automatically between phases.
Use it for curriculum training (simple compositions before complex motion) or for
WAN 2.2's dual-expert MoE architecture, where the high-noise and low-noise experts
can be trained with specialized hyperparameters after a shared base phase.
MoE expert specialization is experimental – hyperparameters are still being validated.
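The carry-forward mechanic can be sketched in a few lines: each phase holds its own hyperparameters, and the LoRA state returned by one phase seeds the next. The phase fields and `train_phase` callable below are illustrative, not Flimmer's config schema.

```python
# Hypothetical phase schedule -- field names are illustrative only.
phases = [
    {"name": "base",   "lr": 1e-4, "epochs": 10, "dataset": "data/simple"},
    {"name": "motion", "lr": 5e-5, "epochs": 6,  "dataset": "data/motion"},
]

def run_phases(phases, train_phase, lora_state=None):
    """Run phases in order, threading the LoRA checkpoint through them."""
    for phase in phases:
        # Each phase resumes from the previous phase's checkpoint.
        lora_state = train_phase(phase, init_state=lora_state)
    return lora_state
```

For WAN 2.2's dual-expert setup, the later phases would simply target the high-noise and low-noise experts with their own entries in the schedule.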
## Standalone data tools
The data preparation tools output standard formats compatible with any trainer –
kohya, ai-toolkit, or anything else. You don't need to use Flimmer's training loop
to benefit from the captioning, triage, and validation tooling.
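"Standard formats" here typically means sidecar caption files: a `clip.mp4` paired with a `clip.txt` holding its caption, the convention kohya-style trainers read. A minimal sketch of writing that layout (the function name is hypothetical; Flimmer's exact output layout may differ):

```python
from pathlib import Path

def write_caption_sidecars(captions, out_dir):
    """Write captions as sidecar .txt files, one per clip (sketch).

    `captions` maps clip filenames (e.g. "dance.mp4") to caption strings.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for clip_name, caption in captions.items():
        sidecar = (out / clip_name).with_suffix(".txt")
        sidecar.write_text(caption.strip() + "\n")
```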
## Model support
| Model | T2V | I2V |
|---|---|---|
| WAN 2.1 | ✅ | ✅ |
| WAN 2.2 | ✅ | ✅ |
| LTX | Planned | Planned |
Image training is out of scope – ai-toolkit handles it thoroughly, and there's no point
duplicating it. Flimmer is video-native.
## Installation & docs
Full installation instructions, config reference, and guides are on GitHub:
**[github.com/alvdansen/flimmer-trainer](https://github.com/alvdansen/flimmer-trainer)**
Supports RunPod and local GPU (tested on A6000/48GB).