# TeaCache Acceleration
> **Note**: This is one of two caching strategies available in SGLang.
> For an overview of all caching options, see [caching](../index.md).
TeaCache (Timestep Embedding Aware Cache) accelerates diffusion inference by detecting when consecutive denoising steps are similar enough that the model computation can be skipped and the cached residual from the last computed step reused instead.
## Overview
TeaCache works by:
1. Tracking the L1 distance between modulated inputs across consecutive timesteps
2. Accumulating the rescaled L1 distance over steps
3. Reusing the cached residual while the accumulated distance stays below a threshold
4. Supporting CFG (Classifier-Free Guidance) with separate positive/negative caches
## How It Works
### L1 Distance Tracking
At each denoising step, TeaCache computes the relative L1 distance between the current and previous modulated inputs:
```
rel_l1 = |current - previous|.mean() / |previous|.mean()
```
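As a rough illustration of the formula above, here is a minimal pure-Python sketch that works on flat lists of floats. The function name `relative_l1` and the sample values are hypothetical; an actual implementation would operate on tensors.

```python
def relative_l1(current, previous):
    """Mean absolute change divided by the mean magnitude of the previous input."""
    if len(current) != len(previous):
        raise ValueError("inputs must have the same length")
    diff = sum(abs(c - p) for c, p in zip(current, previous)) / len(previous)
    scale = sum(abs(p) for p in previous) / len(previous)
    # Note: scale is zero if `previous` is all zeros; this sketch does not guard against that.
    return diff / scale


previous = [1.0, -2.0, 3.0]   # hypothetical modulated input at step t-1
current = [1.1, -2.2, 3.3]    # hypothetical modulated input at step t
rel_l1 = relative_l1(current, previous)  # roughly 0.1, i.e. a 10% relative change
```

Because the distance is normalized by the magnitude of the previous input, it is comparable across timesteps even when the overall scale of the modulated input drifts.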
This distance is then rescaled using polynomial coefficients and accumulated:
```
accumulated += poly(coefficients)(rel_l1)
```
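The rescale-and-accumulate step can be sketched as follows. This is an illustration under one assumption the text does not state: coefficients are ordered from the highest degree down, the same convention `numpy.poly1d` uses. The helper `poly_eval` and the coefficient values are hypothetical.

```python
def poly_eval(coefficients, x):
    """Evaluate a polynomial with Horner's rule, highest-degree coefficient first."""
    result = 0.0
    for c in coefficients:
        result = result * x + c
    return result


# Hypothetical coefficients: 0*x^2 + 1*x + 0, which leaves the distance unchanged.
coefficients = [0.0, 1.0, 0.0]

accumulated = 0.0
for rel_l1 in [0.02, 0.03, 0.04]:  # made-up per-step relative L1 distances
    accumulated += poly_eval(coefficients, rel_l1)
```

With an identity polynomial the accumulator is simply the running sum of the raw distances; model-specific coefficients reshape each distance before it is added.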
### Cache Decision
- If `accumulated >= threshold`: Force computation, reset accumulator
- If `accumulated < threshold`: Skip computation, use cached residual
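The decision rule can be traced with a small sketch. The function `plan_steps` is hypothetical, and it assumes the first step always computes because nothing has been cached yet.

```python
def plan_steps(rescaled_distances, threshold):
    """Return a 'compute' or 'skip' decision for each denoising step."""
    decisions = []
    accumulated = 0.0
    for step, distance in enumerate(rescaled_distances):
        if step == 0:
            # Assumption: no residual is cached yet, so the first step must compute.
            decisions.append("compute")
            continue
        accumulated += distance
        if accumulated >= threshold:
            decisions.append("compute")
            accumulated = 0.0  # reset after a forced computation
        else:
            decisions.append("skip")
    return decisions


# Made-up rescaled distances; the first entry is unused because step 0 always computes.
decisions = plan_steps([0.0, 0.04, 0.04, 0.04, 0.04], threshold=0.1)
# decisions == ['compute', 'skip', 'skip', 'compute', 'skip']
```

In this trace the accumulator reaches 0.12 on the fourth step, crosses the 0.1 threshold, and forces a fresh computation before skipping resumes.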
### CFG Support
For models that support CFG cache separation (Wan, Hunyuan, Z-Image), TeaCache maintains separate caches for positive and negative branches:
- `previous_modulated_input` / `previous_residual` for positive branch
- `previous_modulated_input_negative` / `previous_residual_negative` for negative branch
For models that don't support CFG separation (Flux, Qwen), TeaCache is automatically disabled when CFG is enabled.
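One way to picture the two behaviours is the sketch below. It is not the SGLang implementation: `BranchCache`, `CfgCaches`, and `teacache_enabled` are hypothetical names, and the per-branch grouping stands in for the flat `*_negative` attributes listed above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BranchCache:
    """Cached state for one CFG branch (hypothetical container)."""
    previous_modulated_input: Optional[List[float]] = None
    previous_residual: Optional[List[float]] = None
    accumulated: float = 0.0


@dataclass
class CfgCaches:
    """Keeps positive and negative branch state apart so they never overwrite each other."""
    positive: BranchCache = field(default_factory=BranchCache)
    negative: BranchCache = field(default_factory=BranchCache)

    def select(self, is_negative_branch: bool) -> BranchCache:
        return self.negative if is_negative_branch else self.positive


def teacache_enabled(requested: bool, cfg_enabled: bool, supports_cfg_separation: bool) -> bool:
    """Turn TeaCache off when CFG is on but the model cannot separate its caches."""
    if cfg_enabled and not supports_cfg_separation:
        return False
    return requested


caches = CfgCaches()
caches.select(is_negative_branch=False).previous_residual = [0.5]
# The negative branch is untouched by writes to the positive branch.
negative_residual = caches.select(is_negative_branch=True).previous_residual  # None
```

Keeping the branches apart matters because the positive and negative passes see different conditioning, so a residual cached for one is not a valid substitute for the other.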
## Configuration
TeaCache is configured via `TeaCacheParams` in the sampling parameters:
```python
from sglang.multimodal_gen.configs.sample.teacache import TeaCacheParams

params = TeaCacheParams(
    teacache_thresh=0.1,  # Threshold for accumulated L1 distance
    coefficients=[1.0, 0.0, 0.0],  # Polynomial coefficients for L1 rescaling
)
```
### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| `teacache_thresh` | float | Threshold for the accumulated rescaled L1 distance. Higher = more steps skipped, faster but potentially lower quality; lower = fewer skips |
| `coefficients` | list[float] | Polynomial coefficients for L1 rescaling. Model-specific tuning |
### Model-Specific Configurations
The optimal threshold and coefficients differ between models. The coefficients are typically tuned per model to balance speed and quality.
## Supported Models
TeaCache targets the following model families. Wan is fully supported today; the others are marked as to be supported:
| Model Family | CFG Cache Separation | Notes |
|--------------|---------------------|-------|
| Wan (wan2.1, wan2.2) | Yes | Full support |
| Hunyuan (HunyuanVideo) | Yes | To be supported |
| Z-Image | Yes | To be supported |
| Flux | No | To be supported |
| Qwen | No | To be supported |
## References
- [Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model (TeaCache)](https://arxiv.org/abs/2411.19108)