## Wan2.2 Benchmark
### Executive Summary
This report presents a comprehensive performance evaluation of the **[Athena](https://github.com/world-sim-dev/athena)** framework compared to the baseline **[LightX2V](https://github.com/ModelTC/LightX2V)** framework. The benchmarks were conducted using the **[Wan2.2-TI2V-5B](https://huggingface.co/Wan-AI)** model on NVIDIA H100 hardware.
---
### 🎯 Test Environment & Versioning
#### Hardware & Settings
| Parameter | Value |
| ------------------- | -------------- |
| Hardware | NVIDIA H100 |
| Model | Wan2.2-TI2V-5B |
| Precision | torch.bfloat16 |
| Inference Steps | 50 |
| Resolution | 704 × 1280 (720p) |
| FPS | 24 |
| CFG | Enabled |
#### Software Versioning
To ensure reproducibility, the following specific commits were used for this benchmark:
| Framework | Branch / Tag | Commit |
| --------- | ------------ | ------ |
| Athena | main | [f676ae6](https://github.com/world-sim-dev/athena/commit/f676ae64ad2fc581289d1c3ae5eb51c15ce76f1d) |
| LightX2V | main | [33f0f67](https://github.com/ModelTC/LightX2V/commit/33f0f67f4ecdff86b1db676d3e0786628cc31c7b) |
### 🏆 Performance Benchmarks
📊 We compared the iteration speed (seconds per iteration, lower is better) of Athena and LightX2V across four Context Parallel (CP) configurations.
| Configuration | Frames | LightX2V (s/it) | Athena (s/it) | Speedup |
| ------------- | ------ | -------------- | -------------- | ------- |
| CP1 | 121 | 1.928 | **1.69** | **1.14x** 🚀 |
| CP2 | 121 | 1.197 | **1.06** | **1.13x** 🚀 |
| CP4 | 241 | 1.767 | **1.32** | **1.34x** 🚀 |
| CP8 | 241 | 1.507 | **1.35** | **1.12x** 🚀 |
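
The Speedup column is the LightX2V time divided by the Athena time for the same configuration. A minimal sketch that recomputes it from the values in the table (the names `RESULTS`, `speedup`, and `SPEEDUPS` are illustrative and do not come from either framework):

```python
# Seconds per iteration copied from the table: (LightX2V, Athena).
RESULTS = {
    "CP1": (1.928, 1.69),
    "CP2": (1.197, 1.06),
    "CP4": (1.767, 1.32),
    "CP8": (1.507, 1.35),
}


def speedup(baseline_s_per_it: float, candidate_s_per_it: float) -> float:
    """Baseline time over candidate time; a value above 1.0 means the candidate is faster."""
    return baseline_s_per_it / candidate_s_per_it


# Round to two decimals, matching the precision used in the table.
SPEEDUPS = {cfg: round(speedup(lx, at), 2) for cfg, (lx, at) in RESULTS.items()}

if __name__ == "__main__":
    for cfg, value in SPEEDUPS.items():
        print(f"{cfg}: {value:.2f}x")
```

Because the metric is seconds per iteration, the ratio is taken as baseline over candidate rather than the other way around.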
---
### 💡 Reproduction Guide
To reproduce the results presented in this report, follow the steps below using the specified commit hashes.
#### Setup
```bash
# Clone and install Athena at the benchmarked commit
git clone https://github.com/world-sim-dev/athena
cd athena
git checkout f676ae6
pip install -r requirements.txt
# Return to the parent directory so LightX2V is not cloned inside athena
cd ..
# Clone and install LightX2V (for baseline comparison)
git clone https://github.com/ModelTC/LightX2V
cd LightX2V
git checkout 33f0f67
pip install -r requirements.txt
```
#### Running Benchmarks
For Athena, run the following from the root of the `athena` checkout:
```bash
bash ./scripts/run_wan2_2_ti2v_i2v.sh
```
For LightX2V, clone the scripts from [Benchmark for LightX2V](https://gist.github.com/wtr0504/629388f17ed38d1c12d5ef5c25a15197) and run them:
```bash
git clone https://gist.github.com/wtr0504/629388f17ed38d1c12d5ef5c25a15197
# git places the gist in a directory named after its ID
bash 629388f17ed38d1c12d5ef5c25a15197/run_wan.sh
```
### 🔎 MagiCompiler Optimization Methodology
**Whole Graph Compilation**
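Constant folding and dead code elimination streamline the computation graph before execution: expressions whose inputs are all known at compile time are precomputed, and operations whose results never reach an output are removed.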
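
To make the two passes concrete, here is a minimal sketch on a toy graph representation. The node format `(name, op, args)` and the functions `constant_fold` and `eliminate_dead_code` are hypothetical and written only for illustration; they are not MagiCompiler's IR or API.

```python
import operator
from typing import Dict, List, Sequence, Tuple, Union

Arg = Union[str, float]                  # a node name or a literal constant
Node = Tuple[str, str, Tuple[Arg, ...]]  # (name, op, args)

OPS = {"add": operator.add, "mul": operator.mul}


def constant_fold(graph: List[Node]) -> List[Node]:
    """Replace every node whose inputs are all literals with a 'const' node."""
    known: Dict[str, float] = {}
    folded: List[Node] = []
    for name, op, args in graph:
        # Substitute references to already-folded nodes with their values.
        args = tuple(known.get(a, a) if isinstance(a, str) else a for a in args)
        if op in OPS and all(not isinstance(a, str) for a in args):
            known[name] = OPS[op](*args)
            folded.append((name, "const", (known[name],)))
        else:
            folded.append((name, op, args))
    return folded


def eliminate_dead_code(graph: List[Node], outputs: Sequence[str]) -> List[Node]:
    """Keep only nodes that an output depends on, walking the graph backwards."""
    live = set(outputs)
    kept: List[Node] = []
    for name, op, args in reversed(graph):
        if name in live:
            kept.append((name, op, args))
            live.update(a for a in args if isinstance(a, str))
    return kept[::-1]


GRAPH: List[Node] = [
    ("x", "input", ()),
    ("scale", "mul", (2.0, 4.0)),   # all literals, so it can be folded to 8.0
    ("y", "mul", ("x", "scale")),   # depends on the runtime input x
    ("unused", "add", ("x", 1.0)),  # never reaches the output, so it is dead
]

OPTIMIZED = eliminate_dead_code(constant_fold(GRAPH), outputs=["y"])

if __name__ == "__main__":
    print(OPTIMIZED)  # [('x', 'input', ()), ('y', 'mul', ('x', 8.0))]
```

Running folding first matters here: once `scale` is inlined into `y` as a literal, the `scale` node itself has no remaining users and the dead code pass removes it as well.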
**Coarse-grained Kernel Fusion**
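MagiCompiler aggregates multiple smaller operators into larger, fused kernels. Fusion reduces the number of kernel launches and avoids writing intermediate results to GPU memory only to read them back in the next operator, which is critical for efficient execution on the GPU.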
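
The idea can be sketched in plain Python, with list comprehensions standing in for GPU kernels. This is only an analogy under that assumption: the functions `unfused` and `fused` and the chosen operators are illustrative, and the sketch does not show how MagiCompiler selects or generates fused kernels.

```python
import math
from typing import List


def unfused(x: List[float], scale: float, bias: float) -> List[float]:
    """Three 'kernels': each pass reads a full input and writes a full intermediate."""
    scaled = [v * scale for v in x]          # kernel 1: writes `scaled`
    shifted = [v + bias for v in scaled]     # kernel 2: reads `scaled`, writes `shifted`
    return [math.tanh(v) for v in shifted]   # kernel 3: reads `shifted`


def fused(x: List[float], scale: float, bias: float) -> List[float]:
    """One 'kernel': each element is read once and written once, with no intermediates."""
    return [math.tanh(v * scale + bias) for v in x]


if __name__ == "__main__":
    data = [0.0, 0.25, -1.5, 3.0]
    assert unfused(data, 2.0, 0.5) == fused(data, 2.0, 0.5)
    print("fused and unfused results match")
```

Both versions compute the same values; the fused one makes a single pass over the data instead of three.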
**All to All Communication**
MagiCompiler uses `all_to_all_single` (one communication op per sync point), while LightX2V uses three separate `all_to_all` calls (three communication ops per sync point).
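
The following single-process simulation sketches why one packed exchange can replace three separate ones. It assumes the three exchanges carry the query, key, and value tensors and that they are packed into one buffer per destination rank; the source does not state this, and the helper names (`all_to_all`, `exchange_separately`, `exchange_fused`, `make_buffers`) are hypothetical. No real collective is called, so it runs without GPUs or a process group.

```python
from typing import List, Tuple

Chunk = List[int]
Buffers = List[List[Chunk]]  # buffers[src][dst]: chunk that rank `src` sends to rank `dst`


def all_to_all(send: Buffers) -> Buffers:
    """Stand-in for one collective: recv[dst][src] is what `dst` got from `src`."""
    world = len(send)
    return [[send[src][dst] for src in range(world)] for dst in range(world)]


def exchange_separately(q: Buffers, k: Buffers, v: Buffers) -> Tuple[Buffers, Buffers, Buffers, int]:
    """Three collectives per sync point: one each for q, k, and v."""
    return all_to_all(q), all_to_all(k), all_to_all(v), 3


def exchange_fused(q: Buffers, k: Buffers, v: Buffers) -> Tuple[Buffers, Buffers, Buffers, int]:
    """One collective per sync point: pack q, k, v per destination, exchange, unpack."""
    world = len(q)
    n = len(q[0][0])  # assumes every chunk has the same length
    packed = [[q[s][d] + k[s][d] + v[s][d] for d in range(world)] for s in range(world)]
    recv = all_to_all(packed)
    q_out = [[c[:n] for c in row] for row in recv]
    k_out = [[c[n:2 * n] for c in row] for row in recv]
    v_out = [[c[2 * n:] for c in row] for row in recv]
    return q_out, k_out, v_out, 1


def make_buffers(tag: int, world: int, chunk_len: int) -> Buffers:
    """Toy data whose values encode (tensor tag, source rank, destination rank, index)."""
    return [[[tag * 1000 + src * 100 + dst * 10 + i for i in range(chunk_len)]
             for dst in range(world)] for src in range(world)]


WORLD, CHUNK = 4, 2
Q, K, V = (make_buffers(tag, WORLD, CHUNK) for tag in (1, 2, 3))
SEPARATE = exchange_separately(Q, K, V)
FUSED = exchange_fused(Q, K, V)

if __name__ == "__main__":
    assert SEPARATE[:3] == FUSED[:3]
    print(f"same result with {FUSED[3]} collective instead of {SEPARATE[3]}")
```

Each rank ends up with the same data either way; the packed variant pays the per-collective latency and synchronization cost once rather than three times.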