441 MB
2 files
Updated 30 days ago

Flash Attention 3 - Precompiled Wheel

A precompiled wheel for Flash Attention 3 (v3.0.0) that you can install directly with pip.

Compatibility

To use this wheel, your environment must match the following exact specifications:

  • GPU: NVIDIA Hopper architecture only (H100, H200, GH200; sm_90a)
  • OS: Linux (x86_64)
  • Python: 3.9 or newer (the wheel is tagged cp39-abi3, so one build covers 3.9, 3.10, 3.11, 3.12, and later)
  • PyTorch: 2.9.x
  • CUDA: 12.8
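
The requirements above can be checked before installing. The following is a minimal sketch, not part of this repository: the function names check_env and collect are hypothetical, and the script assumes that PyTorch, if present, reports its CUDA build through torch.version.cuda and the GPU compute capability through torch.cuda.get_device_capability(). It treats compute capability 9.0 as the marker for a Hopper GPU.

```python
import platform
import sys


def check_env(py_version, system, machine, torch_version, cuda_version, capability):
    """Return a list of mismatches; an empty list means the environment matches."""
    problems = []
    if tuple(py_version[:2]) < (3, 9):
        problems.append("Python 3.9 or newer is required")
    if system != "Linux" or machine != "x86_64":
        problems.append("Linux x86_64 is required")
    if torch_version is None or not torch_version.startswith("2.9."):
        problems.append("PyTorch 2.9.x is required")
    if cuda_version != "12.8":
        problems.append("a CUDA 12.8 build of PyTorch is required")
    if capability != (9, 0):
        problems.append("a Hopper GPU (compute capability 9.0) is required")
    return problems


def collect():
    """Gather the values from the running interpreter; torch may be missing."""
    try:
        import torch
    except ImportError:
        torch_version, cuda_version, capability = None, None, None
    else:
        torch_version = str(torch.__version__)
        cuda_version = torch.version.cuda
        # Only query the device when a CUDA GPU is actually visible.
        capability = (
            torch.cuda.get_device_capability() if torch.cuda.is_available() else None
        )
    return (sys.version_info, platform.system(), platform.machine(),
            torch_version, cuda_version, capability)


if __name__ == "__main__":
    issues = check_env(*collect())
    print("Environment looks compatible" if not issues else "\n".join(issues))
```

An empty result only means the listed versions line up; it does not guarantee that the wheel will load, since driver and library details are not inspected.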

How to Install

Download the .whl file from the assets below, or install it directly via URL:

# Example pip install
pip install https://github.com/aw920h/flash-attn-3-wheels/releases/download/v3.0.0-torch2.9.1-cu128/flash_attn_3-3.0.0-cp39-abi3-linux_x86_64.whl
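
After the pip command finishes, one way to confirm that the package was registered is to query the installed distribution metadata. This is a small sketch under assumptions: the helper installed_version is hypothetical, and the distribution name flash_attn_3 is inferred from the wheel filename rather than stated in this README.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string for dist_name, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution name assumed from the wheel filename.
    version = installed_version("flash_attn_3")
    print(f"flash_attn_3 {version}" if version else "flash_attn_3 is not installed")
```

This checks package metadata only. It does not import the compiled extension, so a GPU or CUDA mismatch would still surface later, at import or first use.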

Build Environment Details

  • Base Repo: Official Flash Attention hopper branch
  • Torch Version: 2.9.1+cu128
  • NVCC Version: Built using internal PTXAS 12.8.93
Last updated: May 1