
To compile Flash Attention for Windows:

  • Install the Visual Studio Build Tools.
  • Start a new Windows Terminal tab via "+" -> "Developer Command Prompt for VS".
  • Set CUDA_HOME=%CUDA_PATH_V13_0% (or whichever CUDA version you have installed).
  • Follow the instructions at https://huggingface.co/lldacing/flash-attention-windows-wheel
  • Leave your computer on overnight to compile it. (It may finish faster if you raise MAX_JOBS= in the build script.)
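Put together, the steps above might look like this in the Developer Command Prompt (a sketch only, assuming CUDA 13.0 and a standard flash-attn source build; adjust the variable names and job count for your machine):

```shell
:: Run inside "Developer Command Prompt for VS" (Windows cmd syntax).
:: Point the build at your CUDA toolkit; the 13.0 path is an assumption.
set CUDA_HOME=%CUDA_PATH_V13_0%

:: More parallel compile jobs can shorten the overnight build,
:: at the cost of much higher RAM usage per job.
set MAX_JOBS=4

:: Build flash-attn from source against the local CUDA toolchain.
pip install flash-attn --no-build-isolation
```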

After upgrading major components like CUDA or Python, you will need to rebuild or upgrade just about everything that depends on them. If you upgrade to something released very recently, you may need the git (development) version of the dependent packages.

I have included a SageAttention wheel in case you have problems installing it yourself (see https://github.com/thu-ml/SageAttention/issues/242).

You may also need to build xformers from the latest source:

pip uninstall xformers
pip install -v --no-build-isolation -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers
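Once the installs finish, a quick way to confirm which attention backends ended up importable is a sketch like the following (the import names flash_attn, sageattention, and xformers are the usual ones, not something this card specifies):

```python
import importlib.util

# Probe each backend without actually importing it, so this runs
# cleanly even on machines where some packages are missing.
for name in ("flash_attn", "sageattention", "xformers"):
    found = importlib.util.find_spec(name) is not None
    print(f"{name}: {'installed' if found else 'missing'}")
```

If any of them print "missing", repeat the corresponding build step above before launching your inference stack.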
