FSampler - Fast Diffusion Sampling via Epsilon Extrapolation

FSampler is a training-free, sampler-agnostic acceleration layer for diffusion sampling that reduces model calls by predicting each step's epsilon (noise) from recent real calls and feeding it into the existing integrator.

NOTE

Important Compatibility Information:

The RES family samplers will not produce 1:1 parity with the official KSampler or ClownShark KSampler implementations, despite having 1:1 code
Even ComfyUI and ClownShark produce different results from each other, due to environment variables and implementation details
Simple samplers like Euler will have 1:1 parity across implementations

Features

Training-Free Acceleration: Skips full model calls using predicted epsilon from recent steps
Sampler Agnostic: Works with Euler, RES 2M/2S, DDIM, DPM++ 2M/2S, LMS, and more
Multiple Skip Modes:
- Fixed modes (h2/h3/h4): Conservative, deterministic speedup
- Adaptive mode: Aggressive skipping for 40-60%+ speedup
- Skip indices: Manually pick which steps to skip (useful for low step counts)
Built-in Stability: Universal learning stabilizer and validators prevent artifacts

Skip Modes

none: baseline (no skipping)
hN/sK: N = number of history points used by the predictor, K = real model calls made before each skip
- h2/s2..s5: linear predictor; common picks h2/s2 (~24%) or h2/s3 (~20%+)
- h3/s3..s5: Richardson; common picks h3/s3 (~16%) or h3/s4 (~12%+)
- h4/s4..s5: cubic; conservative, quality-sensitive; typically h4/s4
adaptive: aggressive skip gate using two predictors (h3 vs h2) in predicted-state space
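
The predictors and the adaptive gate described above can be sketched in a few lines of plain Python. This is a minimal illustration, not FSampler's actual code: it assumes uniformly spaced steps, uses a standard quadratic extrapolation as a stand-in for the h3 (Richardson) predictor whose exact weights are not given here, uses an Euler update to reach "predicted-state space", and the names extrapolate_h2, extrapolate_h3, adaptive_should_skip and the tol value are all hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def extrapolate_h2(history: Sequence[Vector]) -> Vector:
    """Linear extrapolation from the last two real epsilons (uniform spacing assumed)."""
    e_prev, e_last = history[-2], history[-1]
    return [2.0 * a - b for a, b in zip(e_last, e_prev)]


def extrapolate_h3(history: Sequence[Vector]) -> Vector:
    """Quadratic extrapolation from the last three real epsilons.

    Stand-in for the h3 predictor; the real weights may differ.
    """
    e3, e2, e1 = history[-3], history[-2], history[-1]
    return [3.0 * a - 3.0 * b + c for a, b, c in zip(e1, e2, e3)]


def euler_state(x: Vector, eps: Vector, sigma: float, sigma_next: float) -> Vector:
    """One Euler step: move x along eps by the change in sigma."""
    return [xi + ei * (sigma_next - sigma) for xi, ei in zip(x, eps)]


def rms(v: Vector) -> float:
    return math.sqrt(sum(a * a for a in v) / len(v))


def adaptive_should_skip(x: Vector, history: Sequence[Vector],
                         sigma: float, sigma_next: float,
                         tol: float = 0.01) -> bool:
    """Skip only when the h2 and h3 predictions land on nearly the same state."""
    if len(history) < 3:
        return False  # not enough real calls to compare both predictors
    x_h2 = euler_state(x, extrapolate_h2(history), sigma, sigma_next)
    x_h3 = euler_state(x, extrapolate_h3(history), sigma, sigma_next)
    diff = rms([a - b for a, b in zip(x_h2, x_h3)])
    scale = max(rms(x_h3), 1e-12)  # avoid division by zero
    return diff / scale <= tol
```

When epsilon has been changing linearly, both predictors agree and the gate allows a skip; when the recent history is curved, the two predicted states diverge and the sketch falls back to a real model call.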

Skip Indices

For low step count workflows, use skip indices to manually pick which steps to skip
Gives you precise control over the sampling process
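
As a rough illustration of why hand-picked indices still need some history behind them, the sketch below filters a requested list of skip indices so that a step is only skipped once enough real calls have been made. The function name check_skip_indices, the zero-based indexing, and the rule "count all real calls so far" are assumptions for this example, not FSampler's documented behavior.

```python
from typing import Iterable, List


def check_skip_indices(skip_indices: Iterable[int], total_steps: int,
                       history: int = 2) -> List[int]:
    """Return the requested indices that can actually be skipped.

    A step is skippable only if at least `history` real model calls
    came before it; otherwise it is run as a real call.
    """
    requested = set(skip_indices)
    real_calls = 0
    accepted = []
    for step in range(total_steps):
        if step in requested and real_calls >= history:
            accepted.append(step)  # predicted epsilon is used here
        else:
            real_calls += 1        # real model call, adds to history
    return accepted
```

For example, requesting steps 1, 3 and 5 in an 8-step run with a two-point predictor keeps only 3 and 5 in this sketch, because step 1 has a single real call behind it.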

Usage

For a quick start, use FSampler (simple) rather than FSampler Advanced: the simple version only needs noise and a skip mode to operate. Swap it in for your normal KSampler node.

Add the FSampler node (or FSampler Advanced for more control)
Choose your sampler and scheduler as usual
Set skip_mode:
- none — baseline (no skipping, use this first to validate)
- h2 — conservative, ~20-30% speedup (recommended starting point)
- h3 — more conservative, ~16% speedup
- h4 — very conservative, ~12% speedup
- adaptive — aggressive, 40-60%+ speedup (may degrade on tough configs)
Adjust protect_first_steps / protect_last_steps if needed (defaults are usually fine)
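
To get a feel for how the skip cadence and the protect windows interact, here is a small planning sketch. It assumes a fixed mode skips one step after every K real calls and never skips inside the protected windows; the function names and the default window sizes of 2 are hypothetical, and the real defaults may differ.

```python
from typing import List


def plan_fixed_skips(total_steps: int, calls_before_skip: int,
                     protect_first: int = 2, protect_last: int = 2) -> List[str]:
    """Label each step "real" or "skip" for a fixed hN/sK-style cadence."""
    plan = []
    real_streak = 0  # real calls since the last skip
    for step in range(total_steps):
        protected = step < protect_first or step >= total_steps - protect_last
        if not protected and real_streak >= calls_before_skip:
            plan.append("skip")
            real_streak = 0
        else:
            plan.append("real")
            real_streak += 1
    return plan


def call_reduction(plan: List[str]) -> float:
    """Fraction of steps that avoid a model call."""
    return plan.count("skip") / len(plan)
```

In this sketch, 20 steps with K=2 and two protected steps at each end skips 6 of 20 calls, while 6 steps with K=3 skips only one. The exact fraction depends on step count and window sizes, which is why short runs see little benefit.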

Quality & Safety

Validators: finite checks, magnitude clamp vs history, cosine vs last REAL epsilon
Learning stabilizer L: scales predicted epsilon by 1/L on skipped steps; updates on REAL steps only
Diagnostics: per-step timing + concise line showing σ targets, h/weights (where relevant), epsilon norms, x_rms, and [RISK]
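
The checks listed above might look roughly like the following. This is a sketch under stated assumptions: the thresholds max_ratio and min_cosine, clamping against the last real epsilon's norm only, and the moving-average update rule for L are all illustrative guesses, since the text does not specify them.

```python
import math
from typing import List, Optional

Vector = List[float]


def norm(v: Vector) -> float:
    return math.sqrt(sum(a * a for a in v))


def validate_prediction(eps_pred: Vector, eps_last_real: Vector,
                        max_ratio: float = 2.0,
                        min_cosine: float = 0.5) -> Optional[Vector]:
    """Return a usable predicted epsilon, or None to force a real call."""
    # Finite check: reject NaN or inf outright.
    if not all(math.isfinite(a) for a in eps_pred):
        return None
    n_pred, n_real = norm(eps_pred), norm(eps_last_real)
    if n_pred == 0.0 or n_real == 0.0:
        return None
    # Cosine check against the last REAL epsilon.
    dot = sum(a * b for a, b in zip(eps_pred, eps_last_real))
    if dot / (n_pred * n_real) < min_cosine:
        return None
    # Magnitude clamp: rescale if the prediction is much larger than history.
    limit = max_ratio * n_real
    if n_pred > limit:
        scale = limit / n_pred
        eps_pred = [a * scale for a in eps_pred]
    return eps_pred


class LearningStabilizer:
    """Tracks L, a running ratio of predicted to real epsilon magnitude."""

    def __init__(self, rate: float = 0.1) -> None:
        self.L = 1.0
        self.rate = rate

    def update_on_real(self, eps_pred: Vector, eps_real: Vector) -> None:
        """Called on REAL steps only: compare the would-be prediction to the truth."""
        n_real = norm(eps_real)
        if n_real > 0.0:
            ratio = norm(eps_pred) / n_real
            self.L = (1.0 - self.rate) * self.L + self.rate * ratio

    def apply(self, eps_pred: Vector) -> Vector:
        """Called on skipped steps: scale the prediction by 1/L."""
        return [a / self.L for a in eps_pred]
```

Returning None rather than a patched-up value keeps the fallback simple: whenever a check fails, the caller just makes a real model call for that step.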

Notes

h2/h3/h4 are conservative and deterministic; adaptive is aggressive and may show degradation on tough configs — validators and L minimize artifacts
The protect_first_steps / protect_last_steps windows guard the critical early and late regions of sampling
Anchors and max consecutive skips are internal to adaptive to bound drift
Works with LoRAs, ControlNet, IP-Adapter
Since all equations are deterministic, a run with heavy skipping still produces results very similar to a run with no skips. This lets you generate many more test images quickly before settling on a single seed for production

Troubleshooting

Getting artifacts or weird images?

Use skip_mode=none to verify baseline quality
Switch to h2 or h3 (more conservative than adaptive)
Increase protect_first_steps and protect_last_steps
Some sampler+scheduler combos produce issues even without skipping

Not seeing speedup?

FSampler needs history to extrapolate, so it works best with 10+ steps

FAQ

Q: Does this work with LoRAs/ControlNet/IP-Adapter? A: Yes! FSampler sits between the scheduler and sampler, so it's transparent to conditioning.

Q: Will this work on SDXL Turbo / LCM? A: Potentially, but low-step models (<10 steps) won't benefit much since there's less history to extrapolate from. Use explicit skip indices for precise control with low step counts.

Q: Can I use this with custom schedulers? A: Yes, FSampler works with any scheduler that produces sigma values.

Q: How does this compare to other speedup methods? A: FSampler is complementary to:

Distillation (LCM, Turbo): Use both together
Quantization: Use both together
Dynamic CFG: Use both together
FSampler specifically reduces the number of model calls made during sampling, not the cost of each individual model call

Tested Models

Flux (tested on a 2080 Ti with LoRAs, fp8 and fp16 models)
Wan2.2
Qwen

Testing and feedback welcome on other models!