## To compile Flash Attention for Windows
* Install Visual Studio Build Tools.
* Open a new Windows Terminal tab: click "+" -> "Developer Command Prompt for VS".
* `set CUDA_HOME=%CUDA_PATH_V13_0%` (or whatever your installed CUDA version is).
* Follow the instructions at https://huggingface.co/lldacing/flash-attention-windows-wheel
* Leave your computer on overnight to compile it. (The build may finish faster if you raise `MAX_JOBS=` in the script.)
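The steps above can be sketched as a single session in the "Developer Command Prompt for VS" tab. The CUDA version, the `MAX_JOBS` value, and the final `pip` command are assumptions here (the linked wheel page may use its own build script instead):

```shell
:: Run from the "Developer Command Prompt for VS" terminal tab
set CUDA_HOME=%CUDA_PATH_V13_0%
:: More parallel compile jobs -> faster build, but each job needs a lot of RAM
set MAX_JOBS=4
:: Standard flash-attn source build; expect this to take many hours on Windows
pip install flash-attn --no-build-isolation
```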
After upgrading a major component such as CUDA or Python, you will need to upgrade or rebuild just about everything that depends on it. If you upgrade to something released very recently, you may need the git version of the dependent packages as well.
I have included SageAttention in case you have problems installing it (https://github.com/thu-ml/SageAttention/issues/242).
You may also need to install xformers from the latest source:
```shell
pip uninstall xformers
pip install -v --no-build-isolation -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers
```
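After reinstalling, xformers ships a diagnostic entry point you can use to confirm which build and attention backends it actually picked up (it requires a working xformers install to run):

```shell
python -m xformers.info
```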