fix-glu-mlp
#17
by michael-guenther - opened
GluMLP does not work without flash attention because the tensors are passed to it in a different shape. This PR fixes the issue. I also verified that the embeddings produced with and without flash attention are the same.
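For readers unfamiliar with the kind of shape mismatch described above, here is a minimal sketch of the idea, not the actual code from this PR. It assumes the flash-attention path passes an unpadded 2D tensor of shape `(total_tokens, hidden)` while the non-flash path passes a padded 3D tensor of shape `(batch, seq_len, hidden)`. The function `glu_mlp`, the weight names, and all sizes are hypothetical, and NumPy stands in for PyTorch to keep the example self-contained.

```python
import numpy as np


def gelu(x):
    # Tanh approximation of GELU.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def glu_mlp(hidden_states, w_gated, w_out):
    """Hypothetical GLU MLP that only operates on the last axis.

    Because nothing here depends on the number of leading dimensions,
    the same code accepts (total_tokens, hidden) and (batch, seq_len, hidden).
    """
    projected = hidden_states @ w_gated            # (..., 2 * inner)
    gate, value = np.split(projected, 2, axis=-1)  # two (..., inner) halves
    return (gelu(gate) * value) @ w_out            # (..., hidden)


rng = np.random.default_rng(0)
batch, seq_len, hidden, inner = 2, 4, 8, 16
w_gated = rng.standard_normal((hidden, 2 * inner))
w_out = rng.standard_normal((inner, hidden))

# Assumed layouts: padded 3D without flash attention, flattened 2D with it.
padded = rng.standard_normal((batch, seq_len, hidden))
unpadded = padded.reshape(batch * seq_len, hidden)

out_padded = glu_mlp(padded, w_gated, w_out)
out_unpadded = glu_mlp(unpadded, w_gated, w_out)

# Mirrors the check described above: both paths should yield the same values.
same = np.allclose(out_padded.reshape(-1, hidden), out_unpadded)
print(out_padded.shape, out_unpadded.shape, same)
```

Indexing with `axis=-1` rather than a fixed dimension number is what makes the sketch indifferent to the input layout. The comparison at the end is the same style of check as the embedding equality test mentioned in the description.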
michael-guenther changed pull request status to open
LGTM!
michael-guenther changed pull request status to merged