Wan2.2 Benchmark
Executive Summary
This report presents a comprehensive performance evaluation of the Athena framework compared to the baseline LightX2V framework. The benchmarks were conducted using the Wan2.2-TI2V-5B model on NVIDIA H100 hardware.
Test Environment & Versioning
Hardware & Settings
| Parameter | Value |
|---|---|
| Hardware | NVIDIA H100 |
| Model | Wan2.2-TI2V-5B |
| Precision | torch.bfloat16 |
| Inference Steps | 50 |
| Resolution | 704 × 1280 (720p) |
| FPS | 24 |
| CFG | Enabled |
Software Versioning
To ensure reproducibility, the following specific commits were used for this benchmark:
| Framework | Commit |
|---|---|
| Athena | f676ae6 |
| LightX2V | 33f0f67 |
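Performance Benchmarks
We compared the iteration speed (seconds per iteration, lower is better) of Athena and LightX2V across four Context Parallel (CP) configurations. CP1 and CP2 were run with 121 frames, CP4 and CP8 with 241 frames. Speedup is the LightX2V time divided by the Athena time.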
| Configuration | Frames | LightX2V (s/it) | Athena (s/it) | Speedup |
|---|---|---|---|---|
| CP1 | 121 | 1.928 | 1.69 | 1.14x |
| CP2 | 121 | 1.197 | 1.06 | 1.13x |
| CP4 | 241 | 1.767 | 1.32 | 1.34x |
| CP8 | 241 | 1.507 | 1.35 | 1.12x |
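The Speedup column can be re-derived from the two timing columns. The short sketch below does that arithmetic; the function and variable names are illustrative and not part of either framework.

```python
# Seconds per iteration copied from the table above: (LightX2V, Athena).
RESULTS = {
    "CP1": (1.928, 1.69),
    "CP2": (1.197, 1.06),
    "CP4": (1.767, 1.32),
    "CP8": (1.507, 1.35),
}


def speedup(baseline_s_per_it: float, candidate_s_per_it: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    return baseline_s_per_it / candidate_s_per_it


def speedup_table(results: dict) -> dict:
    """Map each configuration to its speedup, rounded to two decimals."""
    return {name: round(speedup(b, c), 2) for name, (b, c) in results.items()}


if __name__ == "__main__":
    for name, value in speedup_table(RESULTS).items():
        print(f"{name}: {value:.2f}x")
```

Because the metric is seconds per iteration, the baseline goes in the numerator: a value above 1 means Athena needs less time per step.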
Reproduction Guide
To reproduce the results presented in this report, follow the steps below using the specified commit hashes.
Setup
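# Clone and install Athena
git clone https://github.com/world-sim-dev/athena
cd athena
git checkout f676ae6
pip install -r requirements.txt
# Return to the parent directory so the two checkouts sit side by side
cd ..
# Clone and install LightX2V (for baseline comparison)
git clone https://github.com/ModelTC/LightX2V
# The clone directory keeps the repository's capitalization
cd LightX2V
git checkout 33f0f67
pip install -r requirements.txt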
Running Benchmarks
For Athena, run:
bash ./scripts/run_wan2_2_ti2v_i2v.sh
For LightX2V, clone the scripts from the "Benchmark for LightX2V" gist and run:
git clone https://gist.github.com/wtr0504/629388f17ed38d1c12d5ef5c25a15197
bash run_wan.sh
MagiCompiler Optimization Methodology
Whole Graph Compilation
Constant Folding & Dead Code Elimination: Streamlining the computation graph prior to execution.
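To make the two passes concrete, here is a minimal sketch on a toy graph. It is not MagiCompiler's implementation: the node format `(name, op, args)` and both function names are assumptions made purely for illustration.

```python
import operator

# Hypothetical toy IR: each node is (name, op, args), where args are either
# names of earlier nodes (str) or numeric literals.
OPS = {"add": operator.add, "mul": operator.mul}


def fold_constants(nodes):
    """Evaluate nodes whose arguments are all known at compile time."""
    known = {}   # node name -> folded constant value
    folded = []
    for name, op, args in nodes:
        # Substitute already-folded constants for references to them.
        args = tuple(known.get(a, a) if isinstance(a, str) else a for a in args)
        if op in OPS and all(not isinstance(a, str) for a in args):
            known[name] = OPS[op](*args)   # computed now; no runtime node emitted
        else:
            folded.append((name, op, args))
    return folded


def eliminate_dead_code(nodes, outputs):
    """Drop nodes that no requested output depends on."""
    live = set(outputs)
    kept = []
    for name, op, args in reversed(nodes):
        if name in live:
            kept.append((name, op, args))
            live.update(a for a in args if isinstance(a, str))
    return kept[::-1]


GRAPH = [
    ("x", "input", ()),
    ("scale", "mul", (2, 4)),       # constant expression: folds to 8
    ("y", "mul", ("x", "scale")),
    ("unused", "add", ("x", 1)),    # never reaches the output
]

optimized = eliminate_dead_code(fold_constants(GRAPH), outputs=["y"])
```

In this toy case the four-node graph shrinks to two nodes: the input `x` and a single multiply by the pre-computed constant 8.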
Coarse-grained Kernel Fusion
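MagiCompiler aggregates multiple smaller operators into larger, fused kernels. Fusion cuts kernel-launch overhead and avoids writing intermediate results to GPU global memory between operators, which matters most for memory-bound operator chains.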
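The effect of fusion can be sketched in plain Python, with list passes standing in for GPU kernels. This is only an analogy under stated assumptions: the scale, bias, and ReLU chain is a made-up example, and the report does not say which operators MagiCompiler actually fuses.

```python
def run_unfused(x, scale, bias):
    """Three separate 'kernels'; each one reads and writes a full-size buffer."""
    launches = 0
    scaled = [v * scale for v in x]
    launches += 1
    shifted = [v + bias for v in scaled]
    launches += 1
    activated = [max(v, 0.0) for v in shifted]
    launches += 1
    return activated, launches


def run_fused(x, scale, bias):
    """One 'kernel'; intermediates live in local values, not full-size buffers."""
    out = [max(v * scale + bias, 0.0) for v in x]
    return out, 1


data = [-1.0, 0.5, 2.0]
unfused_out, unfused_launches = run_unfused(data, scale=2.0, bias=0.5)
fused_out, fused_launches = run_fused(data, scale=2.0, bias=0.5)
```

Both versions compute the same values, but the fused one makes a single pass over the data and never materializes the `scaled` and `shifted` buffers.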
All to All Communication
MagiCompiler uses all_to_all_single (one communication op per sync point), while LightX2V uses three separate all_to_all calls (three communication ops per sync point).
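The difference can be illustrated with a single-process simulation of the collective. The sketch assumes the three exchanged tensors are the attention query, key, and value; the report does not name them. A tuple stands in for the contiguous buffer that a real `all_to_all_single` call would operate on. No actual `torch.distributed` call is made, and all function names are hypothetical.

```python
def all_to_all(inputs):
    """Simulated collective: inputs[src][dst] is the chunk src sends to dst.

    Returns outputs where outputs[dst][src] is the chunk dst received from src.
    """
    world = len(inputs)
    return [[inputs[src][dst] for src in range(world)] for dst in range(world)]


def exchange_separately(q, k, v):
    """Three collectives, one per tensor (the three-call pattern)."""
    return all_to_all(q), all_to_all(k), all_to_all(v), 3


def exchange_packed(q, k, v):
    """One collective: pack q, k, v per destination, exchange, then unpack."""
    world = len(q)
    packed = [
        [(q[s][d], k[s][d], v[s][d]) for d in range(world)]
        for s in range(world)
    ]
    received = all_to_all(packed)   # the only communication op
    q_out = [[chunk[0] for chunk in row] for row in received]
    k_out = [[chunk[1] for chunk in row] for row in received]
    v_out = [[chunk[2] for chunk in row] for row in received]
    return q_out, k_out, v_out, 1


WORLD = 2
# Label each chunk with its tensor, source rank, and destination rank.
q = [[f"q{s}{d}" for d in range(WORLD)] for s in range(WORLD)]
k = [[f"k{s}{d}" for d in range(WORLD)] for s in range(WORLD)]
v = [[f"v{s}{d}" for d in range(WORLD)] for s in range(WORLD)]

separate = exchange_separately(q, k, v)
packed = exchange_packed(q, k, v)
```

Both paths deliver identical data to every rank. The packed path pays the per-collective launch and synchronization cost once instead of three times, in exchange for a pack and unpack step on each side.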