Set decoder_start_token_id and output_past in config
#3
by ankrgyl - opened
Without the decoder_start_token_id parameter, you get the following ValueError while using the model:
561 elif (
562 hasattr(self.config, "decoder")
563 and hasattr(self.config.decoder, "bos_token_id")
564 and self.config.decoder.bos_token_id is not None
565 ):
566 return self.config.decoder.bos_token_id
--> 567 raise ValueError(
568 "`decoder_start_token_id` or `bos_token_id` has to be defined for encoder-decoder generation."
569 )
ValueError: `decoder_start_token_id` or `bos_token_id` has to be defined for encoder-decoder generation.
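For anyone reading the traceback, the lookup that fails can be sketched roughly as below. This is a simplified stand-in that works on a plain dict rather than the library's config object; the function name `resolve_decoder_start_token_id` and the exact order of the fallbacks are my assumptions, not the Transformers source. Only the final two steps (the `decoder.bos_token_id` check and the raise) are taken from the traceback above.

```python
def resolve_decoder_start_token_id(config: dict) -> int:
    """Hypothetical, simplified version of the fallback chain in the traceback."""
    # Nested decoder config, if the model config carries one.
    decoder = config.get("decoder") or {}

    # Assumed order: explicit start token first, then BOS token as a fallback.
    candidates = (
        config.get("decoder_start_token_id"),
        decoder.get("decoder_start_token_id"),
        config.get("bos_token_id"),
        decoder.get("bos_token_id"),
    )
    for candidate in candidates:
        # Compare against None explicitly: token id 0 is a valid value.
        if candidate is not None:
            return candidate

    raise ValueError(
        "`decoder_start_token_id` or `bos_token_id` has to be defined "
        "for encoder-decoder generation."
    )
```

With a config that defines neither key, every candidate is `None` and the function raises, which matches the error reported here.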
I compared against https://huggingface.co/google/flan-t5-large/blob/main/config.json and noticed that output_past also differs, so this PR sets it as well.
I believe this fixes https://huggingface.co/google/flan-t5-xxl/discussions/2
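As a rough illustration of the change, here is a minimal sketch that fills in the two keys on a config loaded as a plain dict. The helper name `with_generation_defaults` is hypothetical, and the values (`decoder_start_token_id` of 0 and `output_past` of true) are my assumption about what the flan-t5-large config contains; please check them against the linked config.json rather than taking them from this snippet.

```python
import json

# Assumed values, mirroring what the flan-t5-large config is believed to use.
ASSUMED_DEFAULTS = {
    "decoder_start_token_id": 0,
    "output_past": True,
}


def with_generation_defaults(config: dict) -> dict:
    """Return a copy of `config` with the missing keys filled in."""
    patched = dict(config)  # shallow copy, so the caller's dict is left untouched
    for key, value in ASSUMED_DEFAULTS.items():
        # Only fill keys that are absent or null; never overwrite existing values.
        if patched.get(key) is None:
            patched[key] = value
    return patched


if __name__ == "__main__":
    # A stripped-down stand-in for the real config.json contents.
    before = {"model_type": "t5", "pad_token_id": 0, "eos_token_id": 1}
    print(json.dumps(with_generation_defaults(before), indent=2))
```

The helper deliberately leaves existing values alone, so running it on a config that already has both keys is a no-op.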
This definitely fixes the issue, yes! Great catch! Thanks for spotting it - indeed we forgot to add it when porting the model :-)
ybelkada changed pull request status to merged