How to use kernels-community/vllm-flash-attn3 with Kernels:
```python
# !pip install kernels
from kernels import get_kernel

kernel = get_kernel("kernels-community/vllm-flash-attn3")
```
Update README.md
#6
by sergiopaniego HF Staff - opened
README.md
CHANGED
```diff
@@ -1,5 +1,7 @@
 ---
 license: apache-2.0
+tags:
+- kernel
 ---

 # vllm-flash-attn3
@@ -89,4 +91,4 @@ This will automatically resolve and download the appropriate code for your archi

 - [Tri Dao](https://huggingface.co/tridao) and team for Flash Attention and [Flash Attention 3](https://tridao.me/blog/2024/flash3/).
 - The [vLLM team](https://huggingface.co/vllm-project) for their implementation and their contribution of attention sinks.
-- The [transformers team](https://huggingface.co/transformers-community) for packaging, testing, building and making it available for use with the [kernels library](https://github.com/huggingface/kernels).
+- The [transformers team](https://huggingface.co/transformers-community) for packaging, testing, building and making it available for use with the [kernels library](https://github.com/huggingface/kernels).
```
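The first hunk adds a `tags:` list with a single `kernel` entry to the README's YAML front matter. One way to confirm that locally is a small standard-library check like the sketch below. The helper name `front_matter_tags` and the `OLD`/`NEW` strings are illustrative assumptions, not part of this PR, and the parser only handles the simple block-list form shown in the diff; a real YAML parser would be more robust.

```python
from typing import List


def front_matter_tags(readme_text: str) -> List[str]:
    """Return entries of a top-level `tags:` block list in YAML front matter.

    Hypothetical helper: it understands only the simple form used in this
    README (a `tags:` key followed by `- item` lines), not general YAML.
    """
    lines = readme_text.splitlines()
    # Front matter must open with `---` on the very first line.
    if not lines or lines[0].strip() != "---":
        return []

    tags: List[str] = []
    in_tags = False
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter ends the front matter
            break
        if line.startswith("tags:"):
            in_tags = True
            continue
        if in_tags:
            stripped = line.strip()
            if stripped.startswith("- "):
                tags.append(stripped[2:].strip())
            elif stripped:
                # Any other non-empty line means the tags list has ended.
                in_tags = False
    return tags


# README heads before and after the change, copied from the diff above.
OLD = "---\nlicense: apache-2.0\n---\n\n# vllm-flash-attn3\n"
NEW = "---\nlicense: apache-2.0\ntags:\n- kernel\n---\n\n# vllm-flash-attn3\n"

print(front_matter_tags(OLD))
print(front_matter_tags(NEW))
```

Running the helper on the two heads shows that the only metadata difference introduced by the first hunk is the new `kernel` tag.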