raylim committed
Commit e38c7e8 · unverified · 1 Parent(s): 3f232ad

Add flash-attn support for H100 acceleration


- Add flash-attn>=2.5.0 for faster attention computation
- Pre-built wheels available for CUDA 12.1 + PyTorch 2.5
- Optimized for H100 Hopper architecture on HF Spaces
- Will significantly speed up Optimus (ViT) inference
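The wiring itself is not part of this commit (only the dependency is added). A minimal sketch of how inference code might prefer flash-attn and fall back to PyTorch's built-in scaled dot-product attention when the wheel or a GPU is unavailable; the `attention` helper and tensor shapes here are illustrative, not from this repo:

```python
import torch
import torch.nn.functional as F

def attention(q, k, v, causal=False):
    """Dispatch to flash-attn when usable, else PyTorch SDPA.

    q, k, v: (batch, seqlen, nheads, headdim) tensors, the layout
    flash-attn expects.
    """
    if q.is_cuda:
        try:
            from flash_attn import flash_attn_func  # needs the CUDA wheel
            return flash_attn_func(q, k, v, causal=causal)
        except ImportError:
            pass  # fall through to the PyTorch implementation
    # F.scaled_dot_product_attention expects (batch, nheads, seqlen, headdim),
    # so transpose in and back out.
    out = F.scaled_dot_product_attention(
        q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2),
        is_causal=causal,
    )
    return out.transpose(1, 2)

# CPU tensors here, so this exercises the SDPA fallback path.
q, k, v = (torch.randn(2, 8, 4, 16) for _ in range(3))
print(attention(q, k, v).shape)  # torch.Size([2, 8, 4, 16])
```

Keeping a fallback means the Space still runs on hardware (or environments) where the flash-attn wheel cannot be installed.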

Files changed (1)
  1. requirements.txt +1 -0
requirements.txt CHANGED
@@ -1,6 +1,7 @@
 --extra-index-url https://download.pytorch.org/whl/cu121
 torch>=2.0.0,<2.6
 torchvision>=0.15.0
+flash-attn>=2.5.0
 open-clip-torch
 gradio>=5.49.0
 loguru>=0.7.3
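One install-order caveat worth noting for this requirements file: when no pre-built flash-attn wheel matches the environment, pip builds it from source, which requires torch to already be importable. A hedged sketch (assuming a CUDA 12.1 environment matching the cu121 index):

```shell
# Install torch first so flash-attn's wheel selection / source build can see it.
pip install --extra-index-url https://download.pytorch.org/whl/cu121 "torch>=2.0.0,<2.6"
# If no matching pre-built wheel exists, flash-attn compiles against the
# installed torch; that path needs --no-build-isolation and the CUDA toolkit.
pip install "flash-attn>=2.5.0" --no-build-isolation
```

On HF Spaces with the pre-built CUDA 12.1 + PyTorch 2.5 wheels the commit message mentions, a plain `pip install -r requirements.txt` should resolve without compiling.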