| # FSampler - Fast Diffusion Sampling via Epsilon Extrapolation | |
| FSampler is a training-free, sampler-agnostic acceleration layer for diffusion sampling that reduces model calls by predicting each step's epsilon (noise) from recent real calls and feeding it into the existing integrator. | |
| ## NOTE | |
| **Important Compatibility Information:** | |
| - The RES family samplers will not produce 1:1 parity with the official KSampler or ClownShark KSampler implementations- despite having 1:1 code | |
| - Even ComfyUI vs ClownShark produce different results, due to environment variables and implementation details | |
| - Simple samplers like Euler will have 1:1 parity across implementations | |
| ## Features | |
| - **Training-Free Acceleration**: Skips full model calls using predicted epsilon from recent steps | |
| - **Sampler Agnostic**: Works with Euler, RES 2M/2S, DDIM, DPM++ 2M/2S, LMS, and more | |
| - **Multiple Skip Modes**: | |
| - Fixed modes (h2/h3/h4): Conservative, deterministic speedup | |
| - Adaptive mode: Aggressive skipping for 40-60%+ speedup | |
| - Skip indices: Manually pick which steps to skip (useful for low step counts) | |
| - **Built-in Stability**: Universal learning stabilizer and validators prevent artifacts | |
| ## Skip Modes | |
| - **none**: baseline (no skipping) | |
| - **hN/sK**: h=history used for predictor, s=steps/calls before skip | |
| - **h2/s2..s5**: linear predictor; common picks h2/s2 (~24%) or h2/s3 (~20%+) | |
| - **h3/s3..s5**: Richardson; common picks h3/s3 (~16%) or h3/s4 (~12%+) | |
| - **h4/s4..s5**: cubic; conservative, quality-sensitive; typically h4/s4 | |
| - **adaptive**: aggressive skip gate using two predictors (h3 vs h2) in predicted-state space | |
| ### Skip Indices | |
| - For low step count workflows, use skip indices to manually pick which steps to skip | |
| - Gives you precise control over the sampling process | |
| ## Usage | |
| **For quick usage start with FSampler (simple) rather than FSampler Advanced** - the simple version only needs noise and skip mode to operate. Swap with your normal KSampler node. | |
| 1. Add the **FSampler** node (or **FSampler Advanced** for more control) | |
| 2. Choose your **sampler** and **scheduler** as usual | |
| 3. Set **skip_mode**: | |
| - `none` β baseline (no skipping, use this first to validate) | |
| - `h2` β conservative, ~20-30% speedup (recommended starting point) | |
| - `h3` β more conservative, ~16% speedup | |
| - `h4` β very conservative, ~12% speedup | |
| - `adaptive` β aggressive, 40-60%+ speedup (may degrade on tough configs) | |
| 4. Adjust **protect_first_steps** / **protect_last_steps** if needed (defaults are usually fine) | |
| ## Quality & Safety | |
| - **Validators**: finite checks, magnitude clamp vs history, cosine vs last REAL epsilon | |
| - **Learning stabilizer L**: scales predicted epsilon by 1/L on skipped steps; updates on REAL steps only | |
| - **Diagnostics**: per-step timing + concise line showing Ο targets, h/weights (where relevant), epsilon norms, x_rms, and [RISK] | |
| ## Notes | |
| - h2/h3/h4 are conservative and deterministic; adaptive is aggressive and may show degradation on tough configs β validators and L minimize artifacts | |
| - Protect first/last windows guard early/late critical regions | |
| - Anchors and max consecutive skips are internal to adaptive to bound drift | |
| - Works with LoRAs, ControlNet, IP-Adapter | |
| - Since all equations are deterministic, running high skips will still produce very similar results as if ran with no skips meaning you can generate a lot more tests quicker before using a single seed for production | |
| ## Troubleshooting | |
| ### Getting artifacts or weird images? | |
| 1. Use `skip_mode=none` to verify baseline quality | |
| 2. Switch to `h2` or `h3` (more conservative than adaptive) | |
| 3. Increase `protect_first_steps` and `protect_last_steps` | |
| 4. Some sampler+scheduler combos produce issues even without skipping | |
| ### Not seeing speedup? | |
| - FSampler needs history to extrapolate - works best with 10+ steps | |
| ## FAQ | |
| **Q: Does this work with LoRAs/ControlNet/IP-Adapter?** | |
| A: Yes! FSampler sits between the scheduler and sampler, so it's transparent to conditioning. | |
| **Q: Will this work on SDXL Turbo / LCM?** | |
| A: Potentially, but low-step models (<10 steps) won't benefit much since there's less history to extrapolate from. Use explicit skip indices for precise control with low step counts. | |
| **Q: Can I use this with custom schedulers?** | |
| A: Yes, FSampler works with any scheduler that produces sigma values. | |
| **Q: How does this compare to other speedup methods?** | |
| A: FSampler is complementary to: | |
| - **Distillation** (LCM, Turbo): Use both together | |
| - **Quantization**: Use both together | |
| - **Dynamic CFG**: Use both together | |
| - FSampler specifically reduces *sampling steps*, not model inference cost | |
| ## Tested Models | |
| - Flux (tested on 2080ti with LoRAs, f8 and f16 models) | |
| - Wan2.2 | |
| - Qwen | |
| Testing and feedback welcome on other models! | |