How to use SzegedAI/hubertusz-tiny-wiki-seq128 with Transformers:
# Load model directly
from transformers import AutoTokenizer, AutoModelForPreTraining
tokenizer = AutoTokenizer.from_pretrained("SzegedAI/hubertusz-tiny-wiki-seq128")
model = AutoModelForPreTraining.from_pretrained("SzegedAI/hubertusz-tiny-wiki-seq128")

The fully trained model, which includes the second phase of training, is available here: SzegedAI/hubert-tiny-wiki
This model was trained from scratch on the Wikipedia subset of Hungarian Webcorpus 2.0 with the masked language modeling (MLM) and sentence order prediction (SOP) tasks.
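For readers unfamiliar with the MLM and SOP objectives mentioned above, the following is a minimal, simplified sketch of how training examples for the two tasks are commonly built. It is not the authors' preprocessing pipeline: the helper names `make_sop_example` and `mask_tokens`, the `[MASK]` symbol, the 15% masking rate, and the whitespace-tokenized sample sentences are all assumptions made for illustration.

```python
import random

MASK_TOKEN = "[MASK]"  # assumed mask symbol; the real tokenizer defines its own


def make_sop_example(segment_a, segment_b, rng):
    """Build one sentence order prediction example.

    Returns (first, second, label), where label 0 means the two
    consecutive segments are in their original order and label 1
    means they were swapped.
    """
    if rng.random() < 0.5:
        return segment_a, segment_b, 0
    return segment_b, segment_a, 1


def mask_tokens(tokens, rng, mask_prob=0.15):
    """Build one masked language modeling example.

    Each token is replaced by MASK_TOKEN with probability mask_prob
    (an assumed rate). Labels hold the original token at masked
    positions and None elsewhere. This simplified version always
    substitutes MASK_TOKEN and skips random or unchanged replacements.
    """
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


# Two consecutive segments, pre-split on whitespace for illustration only.
segment_a = ["Budapest", "Magyarország", "fővárosa", "."]
segment_b = ["A", "Duna", "partján", "fekszik", "."]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
first, second, sop_label = make_sop_example(segment_a, segment_b, rng)
masked, mlm_labels = mask_tokens(first + second, rng)
```

In this sketch the model would be asked to recover the tokens stored in `mlm_labels` and to predict `sop_label` from the order of the two segments.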