---
license: apache-2.0
base_model:
- Wan-AI/Wan2.2-T2V-A14B-Diffusers
base_model_relation: quantized
pipeline_tag: text-to-video
---


# Elastic model: Fastest self-hosted models. Wan 2.2

Elastic models are produced by TheStage AI ANNA (Automated Neural Networks Accelerator). ANNA lets you control model size, latency, and quality with a simple slider movement. For each model, ANNA produces a series of optimized variants:

* __S__: The fastest model, with accuracy degradation of less than 2%.


__Goals of Elastic Models:__

* Provide the fastest models and service for self-hosting.
* Provide flexibility in choosing the cost-versus-quality trade-off for inference.
* Provide clear quality and latency benchmarks.
* Provide a drop-in interface for the HF libraries transformers and diffusers, requiring a single line of code change.
* Provide models supported on a wide range of hardware, pre-compiled and requiring no JIT.

> Note that the exact quality degradation varies from model to model; an S model may show as little as 0.5% degradation.

-----
Prompt: Massive ocean waves violently crashing and shattering against jagged rocky cliffs during an intense storm with lightning flashes

Resolution: 480x480, Number of frames: 81

| S | Original |
|:-:|:-:|
| https://cdn-uploads.huggingface.co/production/uploads/6799fc8e150f5a4014b030ca/7Z1gFce9lMkOfFKrc8UUk.mp4 | https://cdn-uploads.huggingface.co/production/uploads/6799fc8e150f5a4014b030ca/b-JwMpD8LhbbvUdU2kjFE.mp4 |

## Inference

> Compiled versions are currently available only for 81-frame generations at 480x480 resolution. Other configurations are not yet available. Stay tuned for updates!
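Until more compiled configurations ship, it can help to validate generation settings up front. A minimal sketch (illustrative only; `check_config` and `SUPPORTED_CONFIGS` are hypothetical names, not part of the `elastic_models` API):

```python
# Hypothetical guard: fail fast on configurations with no compiled model yet.
SUPPORTED_CONFIGS = {(480, 480, 81)}  # (height, width, num_frames)

def check_config(height: int, width: int, num_frames: int) -> None:
    if (height, width, num_frames) not in SUPPORTED_CONFIGS:
        raise ValueError(
            f"No compiled model for {height}x{width} with {num_frames} frames; "
            "only 480x480 with 81 frames is currently supported."
        )

check_config(480, 480, 81)  # supported: passes silently
```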

To run inference with our models, simply replace the `diffusers` import with `elastic_models.diffusers`:


```python
import torch
from elastic_models.diffusers import WanPipeline
from diffusers.utils import export_to_video

model_name = "Wan-AI/Wan2.2-T2V-A14B-Diffusers"
device = torch.device("cuda")
dtype = torch.bfloat16

# mode="S" selects the fastest Elastic variant of the model.
pipe = WanPipeline.from_pretrained(
    model_name,
    torch_dtype=dtype,
    mode="S"
)
# Tiled and sliced VAE decoding reduce peak GPU memory for long videos.
pipe.vae.enable_tiling()
pipe.vae.enable_slicing()
pipe.to(device)

prompt = "A beautiful woman in a red dress dancing"

with torch.no_grad():
    output = pipe(
        prompt=prompt,
        negative_prompt="",
        height=480,
        width=480,
        num_frames=81,
        num_inference_steps=40,
        guidance_scale=3.0,
        guidance_scale_2=2.0,
        generator=torch.Generator("cuda").manual_seed(42),
    )

    video = output.frames[0]
    export_to_video(video, "wan_output.mp4", fps=16)
```
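As a quick sanity check on the parameters above, the clip duration follows directly from the frame count and the playback rate passed to `export_to_video`:

```python
# Values taken from the generation example above.
num_frames = 81
fps = 16

duration_s = num_frames / fps  # 81 / 16 = 5.0625
print(f"{duration_s:.2f} s")   # → 5.06 s
```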

### Installation


__System requirements:__
* GPUs: H100
* CPU: AMD, Intel
* Python: 3.10-3.12


To work with our models, run the following in your terminal:

```shell
pip install thestage
pip install 'thestage-elastic-models[nvidia]' --extra-index-url https://thestage.jfrog.io/artifactory/api/pypi/pypi-thestage-ai-production/simple
pip install transformers==4.52.3
pip install diffusers==0.35.1

pip install flash_attn==2.7.3 --no-build-isolation
pip uninstall -y apex
pip install tensorrt==10.11.0.33 opencv-python==4.11.0.86 imageio-ffmpeg==0.6.0
```

Then go to [app.thestage.ai](https://app.thestage.ai), log in, and generate an API token from your profile page. Configure the token as follows:

```shell
thestage config set --api-token <YOUR_API_TOKEN>
```

Congrats, now you can use accelerated models!

----

## Benchmarks

Benchmarking is a key part of model acceleration: we aim to provide clear quality and latency metrics for models optimized with our algorithms.

### Quality benchmarks

We used the [VBench](https://github.com/Vchitect/VBench) benchmark to evaluate quality.

| Metric                 | S    | Original |
|------------------------|------|----------|
| Subject Consistency    | 0.96 | 0.96     |
| Background Consistency | 0.96 | 0.96     |
| Motion Smoothness      | 0.98 | 0.98     |
| Dynamic Degree         | 0.29 | 0.29     |
| Aesthetic Quality      | 0.62 | 0.62     |
| Imaging Quality        | 0.68 | 0.68     |
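The "less than 2% degradation" claim for the S model can be checked directly against these scores (values copied from the table above; at the reported precision the S model matches the original on every metric):

```python
# VBench scores from the table above: metric -> (S, Original).
scores = {
    "Subject Consistency":    (0.96, 0.96),
    "Background Consistency": (0.96, 0.96),
    "Motion Smoothness":      (0.98, 0.98),
    "Dynamic Degree":         (0.29, 0.29),
    "Aesthetic Quality":      (0.62, 0.62),
    "Imaging Quality":        (0.68, 0.68),
}

# Relative degradation of S versus the original model, in percent.
for metric, (s, orig) in scores.items():
    degradation = (orig - s) / orig * 100
    print(f"{metric}: {degradation:.1f}%")  # 0.0% for every metric here
```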

### Latency benchmarks

Generation time in seconds for 480x480 resolution, 81 frames.


| GPU  | S   | Original |
|----------|-----|----------|
| H100     | 90  | 180      |
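From these latencies, the S model's speedup over the original works out as (numbers taken from the table above):

```python
# H100 latencies in seconds from the table above (480x480, 81 frames).
latency_s = {"S": 90, "Original": 180}

speedup = latency_s["Original"] / latency_s["S"]
print(f"{speedup:.1f}x faster")  # → 2.0x faster
```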


## Links

* __Platform__: [app.thestage.ai](https://app.thestage.ai)
* __Subscribe for updates__: [TheStageAI X](https://x.com/TheStageAI)
* __Contact email__: contact@thestage.ai