Update README.md
README.md
CHANGED
@@ -79,15 +79,6 @@ print(flash_attn_func(q, q, q).shape)
 # Expected: torch.Size([1, 8, 128, 64])
 ```
 
-## Notes on Flash Attention 3 / 4 for RTX 5090
-
-Neither FA3 nor FA4 will run on consumer Blackwell GPUs (sm_120):
-
-- **FA3** is Hopper-only (sm_90), built around hardware features the RTX 5090 doesn't have.
-- **FA4** requires the TMEM (Tensor Memory) subsystem present only on datacenter Blackwell (sm_100/sm_103, e.g., B200/B300). Consumer Blackwell (sm_120) lacks TMEM.
-
-For RTX 5090 and other sm_120 cards, **Flash Attention 2 is the correct target**. If FA3/FA4 ever gain sm_120 support, wheels will be added here.
-
 ## Build Environment
 
 - **OS**: Windows 11 (64-bit)