File size: 4,868 Bytes
c6535db | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 | # FSampler - Fast Diffusion Sampling via Epsilon Extrapolation
FSampler is a training-free, sampler-agnostic acceleration layer for diffusion sampling that reduces model calls by predicting each step's epsilon (noise) from recent real calls and feeding it into the existing integrator.
## NOTE
**Important Compatibility Information:**
- The RES family samplers will not produce 1:1 parity with the official KSampler or ClownShark KSampler implementations- despite having 1:1 code
- Even ComfyUI vs ClownShark produce different results, due to environment variables and implementation details
- Simple samplers like Euler will have 1:1 parity across implementations
## Features
- **Training-Free Acceleration**: Skips full model calls using predicted epsilon from recent steps
- **Sampler Agnostic**: Works with Euler, RES 2M/2S, DDIM, DPM++ 2M/2S, LMS, and more
- **Multiple Skip Modes**:
- Fixed modes (h2/h3/h4): Conservative, deterministic speedup
- Adaptive mode: Aggressive skipping for 40-60%+ speedup
- Skip indices: Manually pick which steps to skip (useful for low step counts)
- **Built-in Stability**: Universal learning stabilizer and validators prevent artifacts
## Skip Modes
- **none**: baseline (no skipping)
- **hN/sK**: h=history used for predictor, s=steps/calls before skip
- **h2/s2..s5**: linear predictor; common picks h2/s2 (~24%) or h2/s3 (~20%+)
- **h3/s3..s5**: Richardson; common picks h3/s3 (~16%) or h3/s4 (~12%+)
- **h4/s4..s5**: cubic; conservative, quality-sensitive; typically h4/s4
- **adaptive**: aggressive skip gate using two predictors (h3 vs h2) in predicted-state space
### Skip Indices
- For low step count workflows, use skip indices to manually pick which steps to skip
- Gives you precise control over the sampling process
## Usage
**For quick usage start with FSampler (simple) rather than FSampler Advanced** - the simple version only needs noise and skip mode to operate. Swap with your normal KSampler node.
1. Add the **FSampler** node (or **FSampler Advanced** for more control)
2. Choose your **sampler** and **scheduler** as usual
3. Set **skip_mode**:
- `none` — baseline (no skipping, use this first to validate)
- `h2` — conservative, ~20-30% speedup (recommended starting point)
- `h3` — more conservative, ~16% speedup
- `h4` — very conservative, ~12% speedup
- `adaptive` — aggressive, 40-60%+ speedup (may degrade on tough configs)
4. Adjust **protect_first_steps** / **protect_last_steps** if needed (defaults are usually fine)
## Quality & Safety
- **Validators**: finite checks, magnitude clamp vs history, cosine vs last REAL epsilon
- **Learning stabilizer L**: scales predicted epsilon by 1/L on skipped steps; updates on REAL steps only
- **Diagnostics**: per-step timing + concise line showing σ targets, h/weights (where relevant), epsilon norms, x_rms, and [RISK]
## Notes
- h2/h3/h4 are conservative and deterministic; adaptive is aggressive and may show degradation on tough configs — validators and L minimize artifacts
- Protect first/last windows guard early/late critical regions
- Anchors and max consecutive skips are internal to adaptive to bound drift
- Works with LoRAs, ControlNet, IP-Adapter
- Since all equations are deterministic, running high skips will still produce very similar results as if ran with no skips meaning you can generate a lot more tests quicker before using a single seed for production
## Troubleshooting
### Getting artifacts or weird images?
1. Use `skip_mode=none` to verify baseline quality
2. Switch to `h2` or `h3` (more conservative than adaptive)
3. Increase `protect_first_steps` and `protect_last_steps`
4. Some sampler+scheduler combos produce issues even without skipping
### Not seeing speedup?
- FSampler needs history to extrapolate - works best with 10+ steps
## FAQ
**Q: Does this work with LoRAs/ControlNet/IP-Adapter?**
A: Yes! FSampler sits between the scheduler and sampler, so it's transparent to conditioning.
**Q: Will this work on SDXL Turbo / LCM?**
A: Potentially, but low-step models (<10 steps) won't benefit much since there's less history to extrapolate from. Use explicit skip indices for precise control with low step counts.
**Q: Can I use this with custom schedulers?**
A: Yes, FSampler works with any scheduler that produces sigma values.
**Q: How does this compare to other speedup methods?**
A: FSampler is complementary to:
- **Distillation** (LCM, Turbo): Use both together
- **Quantization**: Use both together
- **Dynamic CFG**: Use both together
- FSampler specifically reduces *sampling steps*, not model inference cost
## Tested Models
- Flux (tested on 2080ti with LoRAs, f8 and f16 models)
- Wan2.2
- Qwen
Testing and feedback welcome on other models!
|