How to use bakrianoo/t5-arabic-base with Transformers:
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("bakrianoo/t5-arabic-base")
model = AutoModelForSeq2SeqLM.from_pretrained("bakrianoo/t5-arabic-base")
A customized T5 model for Arabic and English tasks. It can be used as an alternative to the google/mt5-base model, as it is much smaller and targets only Arabic and English tasks.
T5 is an encoder-decoder model pre-trained on a multi-task mixture of unsupervised and supervised tasks, each of which is converted into a text-to-text format.
The T5 model was presented in "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" by Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu.
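To make the text-to-text idea concrete, here is a minimal sketch of how different tasks can be cast as plain string pairs. The `TextToTextExample` class, the `to_text_to_text` helper, and the task prefixes are hypothetical and only illustrate the convention; this card does not state which prefixes, if any, this checkpoint was trained with.

```python
from dataclasses import dataclass


@dataclass
class TextToTextExample:
    source: str  # string fed to the encoder
    target: str  # string the decoder is trained to produce


def to_text_to_text(task_prefix: str, text: str, label: str) -> TextToTextExample:
    """Cast a task as string-in, string-out by prepending a task prefix.

    The prefix wording is an assumption for illustration, not a format
    documented for bakrianoo/t5-arabic-base.
    """
    return TextToTextExample(
        source=f"{task_prefix}: {text.strip()}",
        target=label.strip(),
    )


# Two different tasks end up in the same (source, target) shape.
examples = [
    to_text_to_text("translate English to Arabic", "Hello", "مرحبا"),
    to_text_to_text("sentiment", " The service was excellent. ", "positive"),
]

for ex in examples:
    print(f"{ex.source!r} -> {ex.target!r}")
```

Because every task shares this shape, the same tokenizer and the same sequence-to-sequence model can in principle be fine-tuned on any of them without task-specific output heads.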