{ "model_id": "DeepPavlov/rubert-base-cased", "downloads": 285331, "tags": [ "transformers", "pytorch", "jax", "bert", "feature-extraction", "ru", "arxiv:1905.07213", "endpoints_compatible", "region:us" ], "description": "--- language: - ru --- # rubert-base-cased RuBERT \\(Russian, cased, 12‑layer, 768‑hidden, 12‑heads, 180M parameters\\) was trained on the Russian part of Wikipedia and news data. We used this training data to build a vocabulary of Russian subtokens and took a multilingual version of BERT‑base as an initialization for RuBERT\\[1\\]. 08.11.2021: upload model with MLM and NSP heads \\[1\\]: Kuratov, Y., Arkhipov, M. \\(2019\\). Adaptation of Deep Bidirectional Multilingual Transformers for Russian Language. arXiv preprint arXiv:1905.07213.", "model_explanation_gemini": "RuBERT is a Russian-language BERT model trained on Wikipedia and news data for masked language modeling and next sentence prediction tasks." }