FastESM is a Huggingface compatible plug-in version of ESM2 rewritten with a new PyTorch attention implementation.
Load any ESM2 model into a FastEsm model to dramatically speed up training and inference without **ANY** cost in performance.
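As a sketch of that workflow (the repo id below is an assumption of this example, not part of this card; substitute the id from the model page), loading goes through `transformers`' `trust_remote_code` path:

```python
def load_fast_esm(checkpoint="Synthyra/FastESM2_650"):
    """Load an ESM2-family checkpoint into the FastEsm implementation it ships with.

    The default repo id is an illustrative assumption; use your model page's id.
    """
    # Imported lazily so this sketch stays self-contained.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # trust_remote_code=True loads the FastEsm modeling code bundled with the checkpoint.
    model = AutoModel.from_pretrained(checkpoint, trust_remote_code=True)
    return tokenizer, model
```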
## Attention backend defaults
Flex Attention with a block mask that ignores pad tokens is the default attention backend. If Flex Attention is unavailable, FastESM falls back to native PyTorch attention.
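That fallback can be sketched as a capability check; the function name and return values here are illustrative, not FastESM's internal API:

```python
def pick_attention_backend():
    """Prefer Flex Attention when the running PyTorch provides it, else fall back.

    Illustrative sketch only; FastESM's actual selection logic may differ.
    """
    try:
        # flex_attention and block masks live here in PyTorch >= 2.5.
        from torch.nn.attention.flex_attention import flex_attention, create_block_mask  # noqa: F401
        return "flex_attention"
    except ImportError:
        # Fall back to native PyTorch attention (scaled_dot_product_attention).
        return "sdpa"
```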
For throughput and memory efficiency, `torch.compile(...)` is strongly recommended, especially when using Flex Attention.
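A minimal sketch of that recommendation (the helper name is ours, not a FastESM API):

```python
def maybe_compile(model):
    """Wrap a model in torch.compile when available (PyTorch >= 2.0), else return it unchanged."""
    import torch

    if hasattr(torch, "compile"):
        # Compilation fuses and specializes the forward pass; first call pays a warm-up cost.
        return torch.compile(model)
    return model
```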
Outputting attention maps (or the contact prediction head) is not natively possible with the optimized attention backends (including Flex Attention). You can still pass `output_attentions` to have attention calculated manually and returned.
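A minimal sketch, assuming `model` and `tokenizer` are a loaded FastESM checkpoint and its tokenizer:

```python
def attention_maps(model, tokenizer, sequence):
    """Return per-layer attention maps for one protein sequence.

    Passing output_attentions=True forces the manual (unfused) attention path,
    since fused backends like Flex Attention do not materialize the maps.
    """
    inputs = tokenizer(sequence, return_tensors="pt")
    outputs = model(**inputs, output_attentions=True)
    # outputs.attentions: tuple of tensors, one per layer,
    # each shaped (batch, num_heads, seq_len, seq_len).
    return outputs.attentions
```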
Various other optimizations also make the base implementation slightly different from the one in transformers.
## Use with 🤗 transformers