Upload README.md with huggingface_hub
README.md
CHANGED
@@ -11,7 +11,7 @@ FastESM is a Huggingface compatible plug in version of ESM2 rewritten with a new
 
 Load any ESM2 models into a FastEsm model to dramatically speed up training and inference without **ANY** cost in performance.
 
-
+The default attention backend is `sdpa`. See the [FastPLMs README](https://github.com/Synthyra/FastPLMs) for a full breakdown of available backends (`sdpa`, `kernels_flash`, `flex`, `auto`) and how to switch between them. Attention maps (`output_attentions=True`) are supported on all backends via a separate naive computation.
 Various other optimizations also make the base implementation slightly different than the one in transformers.
 
 # FastESM2-650
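The added README line says attention maps can be requested with `output_attentions=True` on any backend. A minimal sketch of such a call follows, under several assumptions that the diff does not confirm: the stock ESM2 tokenizer `facebook/esm2_t33_650M_UR50D` is used for tokenization, the model output exposes `logits` and `attentions` fields in the usual transformers style, and the helper names `mask_residue` and `predict_with_attentions` are hypothetical. Backend switching is not shown, since the diff defers that to the FastPLMs README.

```python
MODEL_ID = "Synthyra/FastESM2_650"
# Assumption: the stock ESM2 tokenizer is compatible with this checkpoint.
TOKENIZER_ID = "facebook/esm2_t33_650M_UR50D"


def mask_residue(sequence: str, position: int, mask_token: str = "<mask>") -> str:
    """Return `sequence` with the residue at 0-based `position` replaced by the mask token."""
    if not 0 <= position < len(sequence):
        raise IndexError(f"position {position} is outside a sequence of length {len(sequence)}")
    return sequence[:position] + mask_token + sequence[position + 1:]


def predict_with_attentions(sequence: str, position: int):
    """Predict the masked residue and return (predicted_token, attentions).

    Hypothetical helper: downloads the checkpoint on first call.
    """
    # Imported lazily so that mask_residue stays usable without torch installed.
    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
    model = AutoModelForMaskedLM.from_pretrained(MODEL_ID, trust_remote_code=True).eval()

    masked = mask_residue(sequence, position, tokenizer.mask_token)
    inputs = tokenizer(masked, return_tensors="pt")
    with torch.no_grad():
        # Per the README change, attention maps come from a separate naive computation.
        outputs = model(**inputs, output_attentions=True)

    # Locate the mask token in the encoded input and take the top-scoring token there.
    mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    predicted_id = outputs.logits[0, mask_index].argmax(dim=-1)
    # Assumption: outputs.attentions holds one attention tensor per layer.
    return tokenizer.decode(predicted_id), outputs.attentions
```

Calling `predict_with_attentions("MKTAYIAKQR", 3)` would mask the fourth residue and return the predicted token together with the attention maps. Because the diff describes the maps as coming from a separate naive computation, requesting them is likely slower than a plain forward pass, so it is probably best reserved for analysis rather than training loops.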