Buckets:
441 MB
2 files
Updated 30 days ago
| Name | Size | Uploaded | Xet hash |
|---|---|---|---|
| README.md | 892 Bytes | | 82c0f2cb |
| flash_attn_3-3.0.0-cp39-abi3-linux_x86_64.whl | 441 MB | | e68b8e56 |
Flash Attention 3 - Precompiled Wheel
A precompiled wheel for Flash Attention 3 (v3.0.0). You can install it directly via pip.
Compatibility
To use this wheel, your environment must match the following exact specifications:
- GPU: NVIDIA Hopper Architecture ONLY (H100, H200, GH200), sm_90a
- OS: Linux (x86_64)
- Python: 3.9, 3.10, 3.11, 3.12, or newer (abi3 compatible)
- PyTorch: 2.9.x
- CUDA: 12.8
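The requirements above can be checked before installing. The following is a minimal sketch and is not part of the wheel: `check_environment` is a hypothetical helper that compares the values you pass in against the list above, and `describe_local_environment` (also hypothetical) gathers those values from the running interpreter, assuming PyTorch is already installed.

```python
import platform
import sys


def check_environment(python_version, system, machine, torch_version,
                      cuda_version, compute_capability):
    """Return a list of mismatches against the requirements listed above.

    Hypothetical helper for illustration; it is not shipped with the wheel.
    """
    problems = []
    if tuple(python_version[:2]) < (3, 9):
        problems.append("Python 3.9 or newer is required")
    if system != "Linux" or machine != "x86_64":
        problems.append("Linux on x86_64 is required")
    # torch.__version__ looks like "2.9.1+cu128"; only major.minor is compared.
    if torch_version.split("+")[0].split(".")[:2] != ["2", "9"]:
        problems.append("PyTorch 2.9.x is required")
    if cuda_version != "12.8":
        problems.append("a CUDA 12.8 build of PyTorch is required")
    # Hopper GPUs (sm_90a) report compute capability (9, 0).
    if tuple(compute_capability) != (9, 0):
        problems.append("a Hopper GPU (compute capability 9.0) is required")
    return problems


def describe_local_environment():
    """Collect the values check_environment expects from this machine."""
    import torch  # imported lazily so the checker itself works without PyTorch

    capability = (
        torch.cuda.get_device_capability(0) if torch.cuda.is_available() else (0, 0)
    )
    return dict(
        python_version=tuple(sys.version_info[:2]),
        system=platform.system(),
        machine=platform.machine(),
        torch_version=torch.__version__,
        cuda_version=torch.version.cuda or "",
        compute_capability=capability,
    )
```

Calling `check_environment(**describe_local_environment())` returns an empty list when every requirement is met, and one message per mismatch otherwise.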
How to Install
Download the .whl file from the assets below, or install it directly via URL:
# Example pip install
pip install https://github.com/aw920h/flash-attn-3-wheels/releases/download/v3.0.0-torch2.9.1-cu128/flash_attn_3-3.0.0-cp39-abi3-linux_x86_64.whl
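After installation, one way to confirm that pip registered the wheel is to query its package metadata. This is a sketch under assumptions: the distribution name `flash_attn_3` is taken from the wheel filename, and `installed_version` is a hypothetical helper. It only reads metadata, so it does not import the package or touch the GPU.

```python
from importlib import metadata


def installed_version(distribution="flash_attn_3"):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version else "flash_attn_3 is not installed")
```

A `None` result usually means pip installed into a different environment than the interpreter running the check.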
Build Environment Details
- Base Repo: Official Flash Attention hopper branch
- Torch Version: 2.9.1+cu128
- NVCC Version: Built using internal PTXAS 12.8.93
- Total size: 441 MB
- Files: 2
- Last updated: May 1
- Pre-warmed CDN: US, EU