# FSampler - Fast Diffusion Sampling via Epsilon Extrapolation

FSampler is a training-free, sampler-agnostic acceleration layer for diffusion sampling that reduces model calls by predicting each step's epsilon (noise) from recent real calls and feeding it into the existing integrator.

## NOTE

**Important Compatibility Information:**
- The RES family samplers will not produce 1:1 parity with the official KSampler or ClownShark KSampler implementations- despite having 1:1 code
- Even ComfyUI vs ClownShark produce different results, due to environment variables and implementation details
- Simple samplers like Euler will have 1:1 parity across implementations

## Features

- **Training-Free Acceleration**: Skips full model calls using predicted epsilon from recent steps
- **Sampler Agnostic**: Works with Euler, RES 2M/2S, DDIM, DPM++ 2M/2S, LMS, and more
- **Multiple Skip Modes**:
  - Fixed modes (h2/h3/h4): Conservative, deterministic speedup
  - Adaptive mode: Aggressive skipping for 40-60%+ speedup
  - Skip indices: Manually pick which steps to skip (useful for low step counts)
- **Built-in Stability**: Universal learning stabilizer and validators prevent artifacts

## Skip Modes

- **none**: baseline (no skipping)
- **hN/sK**: h=history used for predictor, s=steps/calls before skip
  - **h2/s2..s5**: linear predictor; common picks h2/s2 (~24%) or h2/s3 (~20%+)
  - **h3/s3..s5**: Richardson; common picks h3/s3 (~16%) or h3/s4 (~12%+)
  - **h4/s4..s5**: cubic; conservative, quality-sensitive; typically h4/s4
- **adaptive**: aggressive skip gate using two predictors (h3 vs h2) in predicted-state space

### Skip Indices
- For low step count workflows, use skip indices to manually pick which steps to skip
- Gives you precise control over the sampling process

## Usage

**For quick usage start with FSampler (simple) rather than FSampler Advanced** - the simple version only needs noise and skip mode to operate. Swap with your normal KSampler node.

1. Add the **FSampler** node (or **FSampler Advanced** for more control)
2. Choose your **sampler** and **scheduler** as usual
3. Set **skip_mode**:
   - `none` — baseline (no skipping, use this first to validate)
   - `h2` — conservative, ~20-30% speedup (recommended starting point)
   - `h3` — more conservative, ~16% speedup
   - `h4` — very conservative, ~12% speedup
   - `adaptive` — aggressive, 40-60%+ speedup (may degrade on tough configs)
4. Adjust **protect_first_steps** / **protect_last_steps** if needed (defaults are usually fine)

## Quality & Safety

- **Validators**: finite checks, magnitude clamp vs history, cosine vs last REAL epsilon
- **Learning stabilizer L**: scales predicted epsilon by 1/L on skipped steps; updates on REAL steps only
- **Diagnostics**: per-step timing + concise line showing σ targets, h/weights (where relevant), epsilon norms, x_rms, and [RISK]

## Notes

- h2/h3/h4 are conservative and deterministic; adaptive is aggressive and may show degradation on tough configs — validators and L minimize artifacts
- Protect first/last windows guard early/late critical regions
- Anchors and max consecutive skips are internal to adaptive to bound drift
- Works with LoRAs, ControlNet, IP-Adapter
- Since all equations are deterministic, running high skips will still produce very similar results as if ran with no skips meaning you can generate a lot more tests quicker before using a single seed for production

## Troubleshooting

### Getting artifacts or weird images?
1. Use `skip_mode=none` to verify baseline quality
2. Switch to `h2` or `h3` (more conservative than adaptive)
3. Increase `protect_first_steps` and `protect_last_steps`
4. Some sampler+scheduler combos produce issues even without skipping

### Not seeing speedup?
- FSampler needs history to extrapolate - works best with 10+ steps

## FAQ

**Q: Does this work with LoRAs/ControlNet/IP-Adapter?**
A: Yes! FSampler sits between the scheduler and sampler, so it's transparent to conditioning.

**Q: Will this work on SDXL Turbo / LCM?**
A: Potentially, but low-step models (<10 steps) won't benefit much since there's less history to extrapolate from. Use explicit skip indices for precise control with low step counts.

**Q: Can I use this with custom schedulers?**
A: Yes, FSampler works with any scheduler that produces sigma values.

**Q: How does this compare to other speedup methods?**
A: FSampler is complementary to:
- **Distillation** (LCM, Turbo): Use both together
- **Quantization**: Use both together
- **Dynamic CFG**: Use both together
- FSampler specifically reduces *sampling steps*, not model inference cost

## Tested Models

- Flux (tested on 2080ti with LoRAs, f8 and f16 models)
- Wan2.2
- Qwen

Testing and feedback welcome on other models!