How to use SzegedAI/hubertusz-medium-wiki-seq128 with Transformers:

```python
# Load the tokenizer and model directly
from transformers import AutoTokenizer, AutoModelForPreTraining

tokenizer = AutoTokenizer.from_pretrained("SzegedAI/hubertusz-medium-wiki-seq128")
model = AutoModelForPreTraining.from_pretrained("SzegedAI/hubertusz-medium-wiki-seq128")
```

The fully trained model, including the second phase of training, is available here: SzegedAI/hubert-medium-wiki
This model was trained from scratch on the Wikipedia subset of Hungarian Webcorpus 2.0 with masked language modeling (MLM) and sentence order prediction (SOP) objectives.
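Because the model was trained with an MLM objective, it can fill in masked tokens directly. Below is a minimal sketch of querying the MLM head; the Hungarian example sentence and the top-5 cutoff are illustrative assumptions, not part of the model card:

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "SzegedAI/hubertusz-medium-wiki-seq128"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Illustrative Hungarian sentence: "Budapest is the [MASK] of Hungary."
text = f"Budapest Magyarország {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the [MASK] position and take the top-5 candidate tokens for it
mask_idx = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = logits[0, mask_idx[0]].topk(5).indices
print(tokenizer.convert_ids_to_tokens(top_ids))
```

`AutoModelForMaskedLM` loads only the language-modeling head; the `AutoModelForPreTraining` class shown above additionally exposes the sentence-level classification head (`seq_relationship_logits`) used for the SOP objective.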