How to use Siddharth63/biot5-large with Transformers:
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("Siddharth63/biot5-large")
model = AutoModelForSeq2SeqLM.from_pretrained("Siddharth63/biot5-large")
biot5-small
Model description
T5 is an encoder-decoder model and treats all NLP problems in a text-to-text format.
BioT5 is a Transformers model pretrained in a self-supervised fashion on a very large corpus of biological text (25 million abstracts). It was pretrained on the raw text only, with no human labelling of any kind, which is why it can use large amounts of publicly available data. An automatic process generates the inputs and outputs from those texts.
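To make the "automatic process" above more concrete, here is a minimal sketch of T5-style span corruption. It assumes the standard T5 objective with `<extra_id_N>` sentinel tokens; the model card does not state the exact settings used for BioT5. The function name `corrupt_spans`, the hand-picked spans, and the example sentence are illustrative only. Real pretraining samples spans at random over subword tokens.

```python
def corrupt_spans(tokens, spans):
    """Build an (inputs, targets) pair by masking the given (start, end) spans.

    Each masked span is replaced by one sentinel in the inputs; the targets
    list each sentinel followed by the tokens it hides.
    """
    inputs, targets = [], []
    cursor = 0
    for i, (start, end) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{i}>"
        inputs.extend(tokens[cursor:start])   # keep text before the span
        inputs.append(sentinel)               # hide the span in the input
        targets.append(sentinel)              # announce the span in the target
        targets.extend(tokens[start:end])     # tokens the model must predict
        cursor = end
    inputs.extend(tokens[cursor:])            # keep text after the last span
    targets.append(f"<extra_id_{len(spans)}>")  # closing sentinel
    return inputs, targets


# Illustrative sentence, split on whitespace instead of a real tokenizer.
tokens = "The BRCA1 gene encodes a tumor suppressor protein".split()
inputs, targets = corrupt_spans(tokens, [(1, 3), (6, 7)])
print(" ".join(inputs))
print(" ".join(targets))
```

No labels are needed here: both sides of the training pair come from the raw text itself.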
During pretraining, this model used the following T5 v1.1 improvements over the original T5 model:
- GEGLU activation in the feed-forward hidden layer, rather than ReLU.
- Dropout was turned off in pretraining (quality win). Dropout should be re-enabled during fine-tuning.
- Pretrained on the self-supervised objective only, without mixing in the downstream tasks.
- No parameter sharing between the embedding and classifier layers.
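The difference between the ReLU and GEGLU feed-forward layers mentioned above can be sketched in plain Python. This is an illustration under stated assumptions, not the model's actual implementation: the helper names (`gelu`, `matvec`, `ffn_relu`, `ffn_geglu`) and the tiny weight matrices are made up, the layers have no bias terms, and the tanh approximation of GELU is assumed.

```python
import math


def gelu(x):
    # Tanh approximation of GELU (an assumption; an exact erf form also exists).
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))


def matvec(W, x):
    # W is a list of rows with shape (out_dim, in_dim); returns W @ x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def ffn_relu(x, W_in, W_out):
    # Original T5 style: one input projection followed by ReLU.
    hidden = [max(0.0, h) for h in matvec(W_in, x)]
    return matvec(W_out, hidden)


def ffn_geglu(x, W_gate, W_lin, W_out):
    # T5 v1.1 style: two input projections; the GELU of one gates the other.
    gate = [gelu(h) for h in matvec(W_gate, x)]
    lin = matvec(W_lin, x)
    hidden = [g * l for g, l in zip(gate, lin)]
    return matvec(W_out, hidden)


# Hypothetical toy weights: model dimension 2, hidden dimension 3.
x = [1.0, -2.0]
W_gate = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_lin = [[0.5, 0.5], [1.0, -1.0], [0.0, 1.0]]
W_out = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
y = ffn_geglu(x, W_gate, W_lin, W_out)
print(y)
```

The gated variant uses two input projections instead of one, so for the same hidden size it has more parameters in the feed-forward block.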
Acknowledgements
This project would not have been possible without compute generously provided by Google through the Google TPU Research Cloud. Thanks to Yeb Havinga and Gabriele Sarti for helping me get started with the t5x framework.