Use correct `gelu` function
#5
by ybelkada - opened
related discussion: https://huggingface.co/google/flan-t5-xxl/discussions/11
The previous config file was using the `gelu` function instead of `gated-gelu`, which is set automatically when `is_gated_act` is forced to True, more specifically here
This is not a breaking change, since the fix only affects inference. Users who trained a model with `gelu` instead of `gated-gelu` should not be affected by this change. Note that using `gated-gelu` instead of `gelu` can give slightly different qualitative results, but it does not affect the overall performance of the model.
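To illustrate why the two settings can give slightly different outputs, here is a minimal sketch that compares a plain GELU feed-forward with a gated one on toy scalars. It is not the library's implementation: the function names and the scalar weights `wi`, `wi_0`, `wi_1` and `wo` are hypothetical stand-ins for the real projection matrices, and the exact erf-based GELU is assumed (the model may use a different GELU variant).

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def plain_ff(x: float, wi: float, wo: float) -> float:
    """Non-gated feed-forward on scalars: wo * gelu(wi * x)."""
    return wo * gelu(wi * x)


def gated_ff(x: float, wi_0: float, wi_1: float, wo: float) -> float:
    """Gated feed-forward on scalars: the activated branch is
    multiplied by a second, linear branch before the output projection."""
    return wo * (gelu(wi_0 * x) * (wi_1 * x))


if __name__ == "__main__":
    x = 1.0  # hypothetical input value
    print("plain:", plain_ff(x, wi=1.0, wo=1.0))
    print("gated:", gated_ff(x, wi_0=1.0, wi_1=2.0, wo=1.0))
```

The two functions agree only when the linear branch `wi_1 * x` happens to equal 1, which is why a config that selects the wrong variant changes what the model computes at inference time.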
ybelkada changed pull request status to merged