# TeaCache Acceleration

> **Note**: This is one of two caching strategies available in SGLang.
> For an overview of all caching options, see [caching](../index.md).

TeaCache (temporal-similarity-based caching) accelerates diffusion inference by detecting when consecutive denoising steps are similar enough that computation can be skipped entirely.

## Overview

TeaCache works by:

1. Tracking the relative L1 distance between modulated inputs across consecutive timesteps
2. Accumulating the rescaled L1 distance over steps
3. Reusing the cached residual whenever the accumulated distance stays below a threshold
4. Supporting CFG (Classifier-Free Guidance) with separate positive/negative caches

## How It Works

### L1 Distance Tracking

At each denoising step, TeaCache computes the relative L1 distance between the current and previous modulated inputs:

```
rel_l1 = |current - previous|.mean() / |previous|.mean()
```

This distance is then rescaled with a model-specific polynomial and accumulated:

```
accumulated += poly(coefficients)(rel_l1)
```

### Cache Decision

- If `accumulated >= threshold`: force computation and reset the accumulator
- If `accumulated < threshold`: skip computation and reuse the cached residual

### CFG Support

For models that support CFG cache separation (Wan, Hunyuan, Z-Image), TeaCache maintains separate caches for the positive and negative branches:

- `previous_modulated_input` / `previous_residual` for the positive branch
- `previous_modulated_input_negative` / `previous_residual_negative` for the negative branch

For models that do not support CFG separation (Flux, Qwen), TeaCache is automatically disabled when CFG is enabled.
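The tracking and decision steps above can be sketched in a few lines of Python. This is an illustrative toy, not the SGLang implementation; the class and attribute names here are hypothetical, and the polynomial is evaluated with `numpy.polyval` (highest-order coefficient first) as an assumed convention:

```python
import numpy as np


class TeaCacheSketch:
    """Toy sketch of TeaCache's skip-vs-compute decision.

    Hypothetical class: names do not match SGLang internals.
    """

    def __init__(self, threshold, coefficients):
        self.threshold = threshold
        self.coefficients = coefficients  # polynomial for rescaling rel_l1
        self.accumulated = 0.0
        self.previous = None  # previous modulated input

    def should_compute(self, modulated_input):
        """Return True to run the model, False to reuse the cached residual."""
        if self.previous is None:
            # First step: nothing cached yet, must compute.
            self.previous = modulated_input
            return True

        # Relative L1 distance between consecutive modulated inputs.
        rel_l1 = (np.abs(modulated_input - self.previous).mean()
                  / np.abs(self.previous).mean())
        # Rescale with the model-specific polynomial and accumulate.
        self.accumulated += float(np.polyval(self.coefficients, rel_l1))
        self.previous = modulated_input

        if self.accumulated >= self.threshold:
            self.accumulated = 0.0  # reset and force computation
            return True
        return False  # small accumulated drift: reuse cached residual


cache = TeaCacheSketch(threshold=0.1, coefficients=[1.0, 0.0])
x = np.ones((4, 4))
print(cache.should_compute(x))           # first step always computes
print(cache.should_compute(x * 1.001))   # tiny change: skip
print(cache.should_compute(x * 2.0))     # large change: recompute
```

With an identity rescaling (`coefficients=[1.0, 0.0]`), the near-identical second step accumulates only ~0.001 and is skipped, while the doubled third step pushes the accumulator past the threshold and forces a fresh forward pass.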
## Configuration

TeaCache is configured via `TeaCacheParams` in the sampling parameters:

```python
from sglang.multimodal_gen.configs.sample.teacache import TeaCacheParams

params = TeaCacheParams(
    teacache_thresh=0.1,           # Threshold for accumulated L1 distance
    coefficients=[1.0, 0.0, 0.0],  # Polynomial coefficients for L1 rescaling
)
```

### Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `teacache_thresh` | `float` | Threshold for the accumulated L1 distance. Higher values allow more steps to be cached, which is faster but potentially lower quality; lower values force recomputation more often. |
| `coefficients` | `list[float]` | Polynomial coefficients for L1 rescaling; model-specific tuning |

### Model-Specific Configurations

Different models may have different optimal configurations. The coefficients are typically tuned per model to balance speed and quality.

## Supported Models

TeaCache is built into the following model families:

| Model Family | CFG Cache Separation | Notes |
|--------------|----------------------|-------|
| Wan (wan2.1, wan2.2) | Yes | Full support |
| Hunyuan (HunyuanVideo) | Yes | To be supported |
| Z-Image | Yes | To be supported |
| Flux | No | To be supported |
| Qwen | No | To be supported |

## References

- [TeaCache: Accelerating Diffusion Models with Temporal Similarity](https://arxiv.org/abs/2411.14324)
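To build intuition for how `coefficients` interacts with `teacache_thresh`, the rescaling step can be checked numerically. The coefficient values below are made up for illustration, and evaluating them with `numpy.polyval` (highest-order coefficient first) is an assumption about the ordering convention, not a statement about SGLang internals:

```python
import numpy as np

# Hypothetical quadratic rescale a*x**2 + b*x + c,
# highest-order coefficient first (numpy.polyval convention).
coefficients = [2.0, 0.5, 0.0]

rel_l1 = 0.05  # raw relative L1 distance for one step
rescaled = np.polyval(coefficients, rel_l1)
print(rescaled)  # 2*0.05**2 + 0.5*0.05 ≈ 0.03
```

With a threshold of 0.1, roughly four such steps would accumulate past the threshold and trigger a full recomputation; steeper polynomials make the accumulator more sensitive to per-step drift.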