	# FSampler - Fast Diffusion Sampling via Epsilon Extrapolation

	FSampler is a training-free, sampler-agnostic acceleration layer for diffusion sampling that reduces model calls by predicting each step's epsilon (noise) from recent real calls and feeding it into the existing integrator.
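
The idea above can be illustrated with a minimal sketch. It is not FSampler's actual code. It assumes the k-diffusion convention `epsilon = (x - denoised) / sigma`, a plain Euler integrator, and a linear two-point extrapolation on skipped steps. The names `sample_euler_with_skips`, `skip_steps`, and `toy_model` are hypothetical.

```python
from typing import Callable, List, Sequence, Set, Tuple

Vector = List[float]


def sample_euler_with_skips(
    model_denoise: Callable[[Vector, float], Vector],
    x: Vector,
    sigmas: Sequence[float],
    skip_steps: Set[int],
) -> Tuple[Vector, int]:
    """Euler sampling where steps in skip_steps use an extrapolated epsilon.

    Returns the final latent and the number of real model calls made.
    """
    real_history: List[Vector] = []  # epsilons from real model calls only
    model_calls = 0
    for i in range(len(sigmas) - 1):
        sigma, sigma_next = sigmas[i], sigmas[i + 1]
        if i in skip_steps and len(real_history) >= 2:
            # Skipped step: linear extrapolation from the last two real epsilons.
            prev, last = real_history[-2], real_history[-1]
            eps = [2.0 * b - a for a, b in zip(prev, last)]
        else:
            # Real step: one model call, converted to epsilon.
            denoised = model_denoise(x, sigma)
            eps = [(xi - di) / sigma for xi, di in zip(x, denoised)]
            real_history.append(eps)
            model_calls += 1
        # The integrator itself is unchanged: a plain Euler update.
        x = [xi + (sigma_next - sigma) * ei for xi, ei in zip(x, eps)]
    return x, model_calls


def toy_model(x: Vector, sigma: float) -> Vector:
    """Stand-in denoiser (assumption): always predicts an all-zero clean sample."""
    return [0.0 for _ in x]
```

With the toy model, skipping one step saves one model call. The result matches the no-skip run only because the toy epsilon is constant; real models will show some deviation.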

	## NOTE

	Important Compatibility Information:
	- The RES family samplers will not produce 1:1 parity with the official KSampler or ClownShark KSampler implementations, despite having 1:1 code
	- Even ComfyUI and ClownShark produce different results from each other, due to environment variables and implementation details
	- Simple samplers like Euler will have 1:1 parity across implementations

	## Features

	- Training-Free Acceleration: Skips full model calls using predicted epsilon from recent steps
	- Sampler Agnostic: Works with Euler, RES 2M/2S, DDIM, DPM++ 2M/2S, LMS, and more
	- Multiple Skip Modes:
	  - Fixed modes (h2/h3/h4): Conservative, deterministic speedup
	  - Adaptive mode: Aggressive skipping for 40-60%+ speedup
	  - Skip indices: Manually pick which steps to skip (useful for low step counts)
	- Built-in Stability: Universal learning stabilizer and validators prevent artifacts

	## Skip Modes

	- none: baseline (no skipping)
	- hN/sK: N is the number of history points the predictor uses; K is the number of real steps (model calls) before each skip
	- h2/s2..s5: linear predictor; common picks are h2/s2 (~24% speedup) or h2/s3 (~20%+)
	- h3/s3..s5: Richardson; common picks are h3/s3 (~16% speedup) or h3/s4 (~12%+)
	- h4/s4..s5: cubic; conservative, quality-sensitive; typically h4/s4
	- adaptive: aggressive skip gate using two predictors (h3 vs h2) in predicted-state space
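
To make the h2/h3/h4 history sizes concrete, here is a hedged sketch of extrapolating the next epsilon from the last 2, 3, or 4 real epsilons. It uses textbook polynomial extrapolation weights for equally spaced steps as a stand-in. FSampler's real predictors (for example the Richardson form used by h3) and their handling of non-uniform sigma spacing may differ. `predict_epsilon` and `EXTRAPOLATION_WEIGHTS` are hypothetical names.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]

# Weights ordered oldest -> newest, assuming equally spaced steps.
# Standard polynomial extrapolation, not necessarily what FSampler uses.
EXTRAPOLATION_WEIGHTS: Dict[int, Tuple[float, ...]] = {
    2: (-1.0, 2.0),             # linear:    2*e[n] - e[n-1]
    3: (1.0, -3.0, 3.0),        # quadratic: 3*e[n] - 3*e[n-1] + e[n-2]
    4: (-1.0, 4.0, -6.0, 4.0),  # cubic
}


def predict_epsilon(history: Sequence[Vector], order: int) -> Vector:
    """Extrapolate the next epsilon from the last `order` real epsilons."""
    weights = EXTRAPOLATION_WEIGHTS[order]
    if len(history) < order:
        raise ValueError(f"need {order} real epsilons, got {len(history)}")
    recent = history[-order:]
    size = len(recent[0])
    return [
        sum(w * eps[i] for w, eps in zip(weights, recent))
        for i in range(size)
    ]
```

Higher orders track curvature better but need more real calls before the first skip, which is consistent with h4 being the most conservative choice.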

	### Skip Indices
	- For low step count workflows, use skip indices to manually pick which steps to skip
	- Gives you precise control over the sampling process
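
The difference between a fixed hN/sK cadence and explicit skip indices can be sketched as below. This is an illustration under assumptions: the helper names are hypothetical, the protect windows are modeled as simple step counts, and the real node may count steps differently.

```python
from typing import Iterable, List


def fixed_skip_schedule(
    total_steps: int,
    calls_before_skip: int,
    protect_first: int = 0,
    protect_last: int = 0,
) -> List[bool]:
    """Return one flag per step; True means the step is skipped.

    After `calls_before_skip` consecutive real steps, the next unprotected
    step is skipped and the counter restarts.
    """
    schedule = [False] * total_steps
    real_run = 0  # consecutive real steps since the last skip
    for i in range(total_steps):
        protected = i < protect_first or i >= total_steps - protect_last
        if not protected and real_run >= calls_before_skip:
            schedule[i] = True
            real_run = 0
        else:
            real_run += 1
    return schedule


def schedule_from_indices(total_steps: int, skip_indices: Iterable[int]) -> List[bool]:
    """Build the same kind of schedule from manually chosen step indices."""
    wanted = set(skip_indices)
    return [i in wanted for i in range(total_steps)]


def call_reduction(schedule: List[bool]) -> float:
    """Fraction of steps that avoid a real model call."""
    return sum(schedule) / len(schedule)
```

For example, `fixed_skip_schedule(10, 2)` skips steps 2, 5, and 8, while `schedule_from_indices(6, [2, 4])` skips exactly the two steps you name.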

	## Usage

	For a quick start, use FSampler (simple) rather than FSampler Advanced: the simple version only needs noise and skip mode to operate. Use it as a drop-in replacement for your normal KSampler node.

	1. Add the FSampler node (or FSampler Advanced for more control)
	2. Choose your sampler and scheduler as usual
	3. Set skip_mode:
	   - `none`: baseline (no skipping, use this first to validate)
	   - `h2`: conservative, ~20-30% speedup (recommended starting point)
	   - `h3`: more conservative, ~16% speedup
	   - `h4`: very conservative, ~12% speedup
	   - `adaptive`: aggressive, 40-60%+ speedup (may degrade on tough configs)
	4. Adjust protect_first_steps / protect_last_steps if needed (defaults are usually fine)

	## Quality & Safety

	- Validators: finite checks, magnitude clamp vs history, cosine vs last REAL epsilon
	- Learning stabilizer L: scales predicted epsilon by 1/L on skipped steps; updates on REAL steps only
	- Diagnostics: per-step timing + concise line showing σ targets, h/weights (where relevant), epsilon norms, x_rms, and [RISK]
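
The validators and the stabilizer listed above could look roughly like the following sketch. The thresholds (`max_norm_ratio`, `min_cosine`) and the update rule for L (a moving average of the predicted-to-real norm ratio) are assumptions made for illustration. The source only states that L scales predicted epsilon by 1/L and updates on REAL steps.

```python
import math
from typing import List, Optional

Vector = List[float]


def l2_norm(v: Vector) -> float:
    return math.sqrt(sum(x * x for x in v))


def cosine_similarity(a: Vector, b: Vector) -> float:
    denom = l2_norm(a) * l2_norm(b)
    return sum(x * y for x, y in zip(a, b)) / denom if denom > 0 else 0.0


def validate_prediction(
    predicted: Vector,
    last_real: Vector,
    max_norm_ratio: float = 2.0,  # hypothetical threshold
    min_cosine: float = 0.5,      # hypothetical threshold
) -> Optional[Vector]:
    """Return a (possibly clamped) prediction, or None to force a real call."""
    if not all(math.isfinite(x) for x in predicted):
        return None  # finite check
    if cosine_similarity(predicted, last_real) < min_cosine:
        return None  # direction drifted too far from the last REAL epsilon
    limit = max_norm_ratio * l2_norm(last_real)
    norm = l2_norm(predicted)
    if norm > limit and norm > 0:
        predicted = [x * limit / norm for x in predicted]  # magnitude clamp
    return predicted


class LearningStabilizer:
    """Tracks L as a moving average of predicted/real epsilon norm ratios."""

    def __init__(self, smoothing: float = 0.9) -> None:
        self.L = 1.0
        self.smoothing = smoothing

    def update_on_real_step(self, predicted: Vector, real: Vector) -> None:
        real_norm = l2_norm(real)
        if real_norm == 0:
            return
        ratio = l2_norm(predicted) / real_norm
        blended = self.smoothing * self.L + (1.0 - self.smoothing) * ratio
        self.L = max(blended, 1e-6)  # keep the later division safe

    def apply_on_skipped_step(self, predicted: Vector) -> Vector:
        return [x / self.L for x in predicted]
```

Returning `None` on a failed check lets the caller fall back to a real model call instead of stepping with a suspicious prediction.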

	## Notes

	- h2/h3/h4 are conservative and deterministic; adaptive is aggressive and may show degradation on tough configs, though validators and L minimize artifacts
	- Protect first/last windows guard early/late critical regions
	- Anchors and max consecutive skips are internal to adaptive to bound drift
	- Works with LoRAs, ControlNet, IP-Adapter
	- Because all equations are deterministic, a run with heavy skipping still produces results very similar to a no-skip run on the same seed, so you can generate many test images quickly before committing to a single seed for production
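
As a rough illustration of the adaptive gate (two predictors compared in predicted-state space) and the max-consecutive-skips bound mentioned above, consider the sketch below. It assumes an Euler update, uniform-step extrapolation weights, and a relative-difference tolerance. `should_skip`, `tolerance`, and `max_consecutive_skips` are hypothetical names and values, not FSampler's internals.

```python
import math
from typing import List, Sequence

Vector = List[float]


def _norm(v: Vector) -> float:
    return math.sqrt(sum(x * x for x in v))


def should_skip(
    x: Vector,
    sigma: float,
    sigma_next: float,
    history: Sequence[Vector],
    consecutive_skips: int = 0,
    tolerance: float = 0.05,         # hypothetical relative tolerance
    max_consecutive_skips: int = 2,  # hypothetical drift bound
) -> bool:
    """Skip only if the h2 and h3 predictions lead to nearly the same next state."""
    if len(history) < 3 or consecutive_skips >= max_consecutive_skips:
        return False
    e0, e1, e2 = history[-3], history[-2], history[-1]
    eps_h2 = [2.0 * c - b for b, c in zip(e1, e2)]
    eps_h3 = [3.0 * c - 3.0 * b + a for a, b, c in zip(e0, e1, e2)]
    dt = sigma_next - sigma
    # Compare the states each prediction would produce, not the epsilons.
    x_h2 = [xi + dt * e for xi, e in zip(x, eps_h2)]
    x_h3 = [xi + dt * e for xi, e in zip(x, eps_h3)]
    diff = _norm([a - b for a, b in zip(x_h3, x_h2)])
    scale = max(_norm(x_h3), 1e-12)
    return diff / scale <= tolerance
```

When the two predictors disagree, the trajectory is curving faster than a low-order fit can follow, so a real model call is the safer choice.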

	## Troubleshooting

	### Getting artifacts or weird images?
	1. Use `skip_mode=none` to verify baseline quality
	2. Switch to `h2` or `h3` (more conservative than adaptive)
	3. Increase `protect_first_steps` and `protect_last_steps`
	4. Some sampler+scheduler combos produce issues even without skipping

	### Not seeing speedup?
	- FSampler needs history to extrapolate, so it works best with 10+ steps

	## FAQ

	Q: Does this work with LoRAs/ControlNet/IP-Adapter?
	A: Yes! FSampler sits between the scheduler and sampler, so it's transparent to conditioning.

	Q: Will this work on SDXL Turbo / LCM?
	A: Potentially, but low-step models (<10 steps) won't benefit much since there's less history to extrapolate from. Use explicit skip indices for precise control with low step counts.

	Q: Can I use this with custom schedulers?
	A: Yes, FSampler works with any scheduler that produces sigma values.

	Q: How does this compare to other speedup methods?
	A: FSampler is complementary to:
	- Distillation (LCM, Turbo): Use both together
	- Quantization: Use both together
	- Dynamic CFG: Use both together
	- FSampler specifically reduces the number of model calls during sampling, not the cost of each model call

	## Tested Models

	- Flux (tested on a 2080 Ti with LoRAs, f8 and f16 models)
	- Wan2.2
	- Qwen

	Testing and feedback welcome on other models!