Giving multiple inputs to model.generate()
I am new to Hugging Face and am using PyTorch for development. I have a question.
The inference example on the model card looks like this:
from transformers import T5Tokenizer, T5ForConditionalGeneration
tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-xl")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-xl", device_map="auto")
input_text = "translate English to German: How old are you?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))
If I have a large list of input texts, how can I pass them to model.generate()? Is there a way to run this inference in batches?
Can someone provide code/references for this?
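For reference, here is a minimal sketch of one way batched generation could look. It assumes the tokenizer is called on a list of strings with padding=True so that each batch becomes a single padded tensor, and that the attention mask is passed to model.generate() alongside the input IDs. The helper names chunked and generate_in_batches, the batch_size of 8, and max_new_tokens=64 are illustrative choices, not anything from the model card.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def generate_in_batches(texts, tokenizer, model, batch_size=8, max_new_tokens=64):
    """Run model.generate() over `texts` one batch at a time; return decoded strings in order."""
    results = []
    for batch in chunked(texts, batch_size):
        # Tokenize the whole batch at once; padding=True pads to the longest
        # text in the batch and also returns the matching attention mask.
        encoded = tokenizer(
            batch, return_tensors="pt", padding=True, truncation=True
        ).to(model.device)
        # Unpacking passes both input_ids and attention_mask to generate().
        outputs = model.generate(**encoded, max_new_tokens=max_new_tokens)
        results.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return results


def run_flan_t5_example():
    """Hypothetical usage with the model from the question (downloads the weights)."""
    from transformers import T5Tokenizer, T5ForConditionalGeneration

    tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-xl")
    model = T5ForConditionalGeneration.from_pretrained(
        "google/flan-t5-xl", device_map="auto"
    )
    input_texts = [
        "translate English to German: How old are you?",
        "translate English to German: Where is the train station?",
    ]
    for text, output in zip(input_texts, generate_in_batches(input_texts, tokenizer, model)):
        print(text, "->", output)
```

Calling run_flan_t5_example() runs the sketch end to end. The batch size is a memory trade-off: larger batches are usually faster, but they need more GPU memory, so it may have to be lowered for long inputs.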