## To compile Flash Attention for Windows
* Install Visual Studio Build Tools.
* Open a new Windows Terminal tab: click "+" -> "Developer Command Prompt for VS".
* `set CUDA_HOME=%CUDA_PATH_V13_0%` (or whatever your installed CUDA version is).
* Follow the instructions at https://huggingface.co/lldacing/flash-attention-windows-wheel
* Leave your computer on overnight to compile it. (The build may finish faster if you raise `MAX_JOBS=` in the script.)
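The steps above can be sketched as a single session in the "Developer Command Prompt for VS" tab. The CUDA version, the `MAX_JOBS` value, and the final `pip` command are assumptions here (the linked wheel page may use its own build script instead):

```shell
:: Run from the "Developer Command Prompt for VS" terminal tab
set CUDA_HOME=%CUDA_PATH_V13_0%
:: More parallel compile jobs -> faster build, but each job needs a lot of RAM
set MAX_JOBS=4
:: Standard flash-attn source build; expect this to take many hours on Windows
pip install flash-attn --no-build-isolation
```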
After upgrading a major component such as CUDA or Python, you will need to upgrade or rebuild just about everything that depends on it. If you upgrade to something released very recently, you may need the git version of the dependent packages as well.
I have included SageAttention in case you have problems installing it (https://github.com/thu-ml/SageAttention/issues/242).
You may also need to install xformers from the latest source:
```shell
pip uninstall xformers
pip install -v --no-build-isolation -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers
```
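After reinstalling, xformers ships a diagnostic entry point you can use to confirm which build and attention backends it actually picked up (it requires a working xformers install to run):

```shell
python -m xformers.info
```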