Add flash-attn support for H100 acceleration
- Add flash-attn>=2.5.0 for faster attention computation
- Pre-built wheels available for CUDA 12.1 + PyTorch 2.5
- Optimized for H100 Hopper architecture on HF Spaces
- Will significantly speed up Optimus (ViT) inference
requirements.txt CHANGED (+1 -0)
@@ -1,6 +1,7 @@
 --extra-index-url https://download.pytorch.org/whl/cu121
 torch>=2.0.0,<2.6
 torchvision>=0.15.0
+flash-attn>=2.5.0
 open-clip-torch
 gradio>=5.49.0
 loguru>=0.7.3
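
For reference, a minimal sketch of how the app might take advantage of the new dependency: prefer the fused flash-attn kernel when the wheel is installed and usable, and fall back to PyTorch's built-in scaled_dot_product_attention otherwise. The attention helper and the HAS_FLASH_ATTN flag are illustrative names, not the Space's actual code.

# Illustrative sketch only -- not the Space's actual code.
import torch
import torch.nn.functional as F

try:
    # Provided by the flash-attn>=2.5.0 wheel added above.
    from flash_attn import flash_attn_func
    HAS_FLASH_ATTN = torch.cuda.is_available()
except ImportError:
    HAS_FLASH_ATTN = False

def attention(q, k, v):
    # q, k, v: (batch, seqlen, nheads, headdim); flash-attn requires
    # fp16/bf16 tensors on a CUDA device.
    if HAS_FLASH_ATTN and q.is_cuda and q.dtype in (torch.float16, torch.bfloat16):
        return flash_attn_func(q, k, v)
    # Fallback: PyTorch SDPA expects (batch, nheads, seqlen, headdim),
    # so transpose in and back out.
    out = F.scaled_dot_product_attention(
        q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2)
    )
    return out.transpose(1, 2)

The try/except keeps the app working on hardware or environments where the flash-attn wheel is unavailable; on an H100 Space, the fused kernel avoids materializing the full attention matrix, which is where the speedup for the ViT inference comes from.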