polish-roberta-base-v2

An encoder model based on the RoBERTa architecture, pre-trained on a large corpus of Polish text. More information is available in our GitHub repository and in the publication "Pre-training Polish Transformer-Based Language Models at Scale".
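
As an encoder, the model can be used to obtain contextual token representations. The following is a minimal sketch, not an official usage recipe: it assumes the checkpoint is published on the Hugging Face Hub as `sdadas/polish-roberta-base-v2` (an identifier not stated in this card) and that the `transformers` and `torch` packages are installed. The `encode` helper and the `MODEL_ID` constant are hypothetical names introduced for illustration.

```python
# Assumed Hub identifier; this card does not state it, so adjust if it differs.
MODEL_ID = "sdadas/polish-roberta-base-v2"


def encode(texts, model_id=MODEL_ID):
    """Return per-token hidden states for a list of Polish sentences."""
    # Imported lazily so that defining this helper needs no heavy dependencies.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # disable dropout for inference

    # Pad to the longest sentence in the batch and return PyTorch tensors.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # no gradients are needed for feature extraction
        output = model(**batch)

    # Tensor of shape (batch_size, sequence_length, hidden_size).
    return output.last_hidden_state
```

Calling `encode(["Ala ma kota."])` would download the weights on first use and return one vector per token; how those vectors are pooled or fine-tuned depends on the downstream task.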

Citation

@inproceedings{dadas2020pre,
  title={Pre-training polish transformer-based language models at scale},
  author={Dadas, S{\l}awomir and Pere{\l}kiewicz, Micha{\l} and Po{\'s}wiata, Rafa{\l}},
  booktitle={International Conference on Artificial Intelligence and Soft Computing},
  pages={301--314},
  year={2020},
  organization={Springer}
}