Comparison of Czech Transformers on Text Classification Tasks
FERNET-CC_sk is a monolingual Slovak BERT-base model pre-trained on 29 GB of filtered Slovak Common Crawl data.

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("fav-kky/FERNET-CC_sk")
model = AutoModelForMaskedLM.from_pretrained("fav-kky/FERNET-CC_sk")
```
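Since the model is loaded with a masked-LM head, it can be queried directly for mask predictions. The following is a minimal sketch of manual fill-mask inference; the Slovak example sentence is illustrative, and `tokenizer.mask_token` is used rather than hardcoding `[MASK]`.

```python
# Sketch: manual fill-mask inference with FERNET-CC_sk (illustrative example).
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("fav-kky/FERNET-CC_sk")
model = AutoModelForMaskedLM.from_pretrained("fav-kky/FERNET-CC_sk")

# Illustrative Slovak sentence with one masked token.
text = f"Bratislava je hlavné mesto {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and take the 5 highest-scoring tokens.
mask_idx = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
top5 = torch.topk(logits[0, mask_idx], k=5, dim=-1).indices[0]
predictions = [tokenizer.decode(token_id).strip() for token_id in top5]
print(predictions)
```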
It is the Slovak version of our Czech FERNET-C5 model.
A preprint of our paper is available at https://arxiv.org/abs/2107.10042.
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("fill-mask", model="fav-kky/FERNET-CC_sk")
```
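The pipeline can then be called on any sentence containing the model's mask token; by default it returns the top 5 candidate fills as a list of dicts. The Slovak sentence below is an illustrative example, not from the original card.

```python
# Sketch: querying the fill-mask pipeline (illustrative Slovak example).
from transformers import pipeline

pipe = pipeline("fill-mask", model="fav-kky/FERNET-CC_sk")

# Use the tokenizer's own mask token instead of hardcoding "[MASK]".
results = pipe(f"Bratislava je hlavné mesto {pipe.tokenizer.mask_token}.")

# Each result carries the candidate token, its score, and the filled sentence.
for r in results:
    print(r["token_str"], round(r["score"], 4))
```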