alvdansen committed on
Commit 1bc4fde · verified · 1 Parent(s): 5003c8c

Update README.md

Files changed (1): README.md (+68 -3)
---
license: apache-2.0
tags:
- lora
- video-generation
- wan
- wan-2.1
- wan-2.2
- training
- text-to-video
- image-to-video
- diffusion-transformer
---

# Flimmer

Video LoRA training toolkit for diffusion transformer models. Built by [Alvdansen Labs](https://github.com/alvdansen).

Full pipeline from raw footage to trained LoRA checkpoint — scene detection, captioning,
dataset validation, latent pre-encoding, and training. Currently supports WAN 2.1 and
WAN 2.2 (T2V and I2V).

Early release. Building in the open.

## What it covers

- **Video ingestion** — scene detection, clip splitting, fps/resolution normalization
- **Captioning** — Gemini and Replicate backends
- **CLIP-based triage** — find clips matching a reference person or concept in large footage sets
- **Dataset validation** — catch missing captions, resolution mismatches, and format issues before spending GPU time
- **Latent pre-encoding** — VAE latents and T5 text embeddings cached to disk so training doesn't repeat encoding every epoch
- **Training** — LoRA training with checkpoint resume, W&B logging, and in-training video sampling

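The dataset-validation step above amounts to a pre-flight check before any GPU time is spent. The sketch below illustrates the idea only; it is not Flimmer's actual API, and the file layout (a same-name `.txt` caption beside each clip) and the sidecar resolution file are assumptions for the sketch.

```python
from pathlib import Path

def validate_dataset(root: str, expected_res=(480, 832)):
    """Illustrative pre-flight check: every clip needs a non-empty
    caption, and resolutions must match before training starts."""
    problems = []
    for clip in sorted(Path(root).glob("*.mp4")):
        caption = clip.with_suffix(".txt")
        if not caption.exists() or not caption.read_text().strip():
            problems.append(f"{clip.name}: missing or empty caption")
        # Hypothetical: a real tool would probe the video (e.g. via ffprobe);
        # this sketch reads a sidecar "<name>.res" file instead.
        res_file = clip.with_suffix(".res")
        if res_file.exists():
            w, h = map(int, res_file.read_text().split("x"))
            if (w, h) != expected_res:
                problems.append(f"{clip.name}: resolution {w}x{h} != {expected_res}")
    return problems
```

Running a check like this on the whole dataset first is what lets format problems surface in seconds rather than mid-run.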
## Phased training

The standout feature. Break a training run into sequential stages — each with its own
learning rate, epoch budget, and dataset — while the LoRA checkpoint carries forward
automatically between phases.

Use it for curriculum training (simple compositions before complex motion) or for
WAN 2.2's dual-expert MoE architecture, where the high-noise and low-noise experts
can be trained with specialized hyperparameters after a shared base phase.
MoE expert specialization is experimental — hyperparameters are still being validated.

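The phase hand-off can be sketched as a plain loop. Everything below is hypothetical — the field names, the phase list, and `train_one_phase` are not Flimmer's actual config schema or API — but it shows the carry-forward behavior described above.

```python
# Hypothetical phase schedule; field names are illustrative,
# not Flimmer's actual config schema.
PHASES = [
    {"name": "base",       "lr": 1e-4, "epochs": 10, "dataset": "data/all"},
    {"name": "high_noise", "lr": 5e-5, "epochs": 4,  "dataset": "data/motion"},
    {"name": "low_noise",  "lr": 5e-5, "epochs": 4,  "dataset": "data/detail"},
]

def run_phases(phases, train_one_phase):
    """Run phases sequentially; each phase resumes from the previous
    phase's checkpoint, so the LoRA weights carry forward."""
    checkpoint = None  # fresh LoRA for the first phase
    history = []
    for phase in phases:
        checkpoint = train_one_phase(phase, resume_from=checkpoint)
        history.append((phase["name"], checkpoint))
    return history
```

The caller supplies `train_one_phase`, which would wrap the real training loop; the loop itself only threads the checkpoint from one stage into the next.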
## Standalone data tools

The data preparation tools output standard formats compatible with any trainer —
kohya, ai-toolkit, or anything else. You don't need to use Flimmer's training loop
to benefit from the captioning, triage, and validation tooling.

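One common interchange convention (used by kohya-style trainers) is a caption `.txt` beside each clip. The layout below is an illustration of that convention, not a documented description of Flimmer's output — check the GitHub docs for the exact format.

```text
dataset/
├── clip_0001.mp4
├── clip_0001.txt   # caption: "a woman turns toward the camera, soft light"
├── clip_0002.mp4
└── clip_0002.txt
```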
## Model support

| Model | T2V | I2V |
|---|---|---|
| WAN 2.1 | ✅ | ✅ |
| WAN 2.2 | ✅ | ✅ |
| LTX | 🔜 | 🔜 |

Image training is out of scope — ai-toolkit handles it thoroughly, and there's no point
duplicating it. Flimmer is video-native.

## Installation & docs

Full installation instructions, config reference, and guides are on GitHub:

**[github.com/alvdansen/flimmer-trainer](https://github.com/alvdansen/flimmer-trainer)**

Supports RunPod and local GPU (tested on A6000/48GB).