---
license: apache-2.0
language:
- ru
- en
library_name: transformers
---

# RoBERTa-base from deepvk

<!-- Provide a quick summary of what the model is/does. -->

A pretrained bidirectional encoder for the Russian language.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->
The model was pretrained with the standard masked language modeling (MLM) objective on a large text corpus including open social data, books, Wikipedia, web pages, etc.

- **Developed by:** VK Applied Research Team
- **Model type:** RoBERTa
- **Languages:** Mostly Russian and a small fraction of other languages
- **License:** Apache 2.0
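
For readers unfamiliar with the MLM objective mentioned above, here is a minimal sketch of how masked batches are typically produced with the Transformers library's `DataCollatorForLanguageModeling`. The 15% masking probability is the usual RoBERTa default and an assumption here; this card does not state the value used:

```python
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("deepvk/roberta-base")

# Randomly replace a fraction of tokens with <mask>; the model learns to
# reconstruct the originals. 0.15 is assumed, not confirmed by this card.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

batch = collator([tokenizer("Привет, мир!")])
print(batch["input_ids"])  # some tokens replaced by tokenizer.mask_token_id
print(batch["labels"])     # -100 everywhere except the masked positions
```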

## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("deepvk/roberta-base")
model = AutoModel.from_pretrained("deepvk/roberta-base")

text = "Привет, мир!"

inputs = tokenizer(text, return_tensors='pt')
# AutoModel returns hidden states rather than task predictions;
# outputs.last_hidden_state has shape (batch, seq_len, hidden_size).
outputs = model(**inputs)
```
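
The encoder returns token-level states; a common next step is to pool them into a single sentence embedding. A minimal sketch, continuing the snippet above and assuming mean pooling over non-padding tokens (this card does not prescribe a pooling strategy):

```python
import torch

# Continuing from `model` and `inputs` defined above.
with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool token states, ignoring padding via the attention mask.
mask = inputs["attention_mask"].unsqueeze(-1)  # (batch, seq_len, 1)
embedding = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
print(embedding.shape)  # torch.Size([1, hidden_size])
```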

## Training Details

### Training Data

<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

A mix of the sources named in the Model Description: open social data, books, Wikipedia, and web pages.

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

#### Preprocessing [optional]

[More Information Needed]

#### Training Hyperparameters

- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

#### Speeds, Sizes, Times [optional]

Standard RoBERTa-base size.
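
If you need the exact figure, it can be read off the loaded checkpoint; for a standard RoBERTa-base configuration this is on the order of 125M parameters, with the exact value depending on the vocabulary size:

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("deepvk/roberta-base")
# Total parameter count of the loaded encoder.
print(f"{model.num_parameters():,}")
```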

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data, Factors & Metrics

#### Testing Data

<!-- This should link to a Data Card if possible. -->

[More Information Needed]

#### Factors

<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

[More Information Needed]

#### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

[More Information Needed]

### Results

[More Information Needed]

#### Summary

## Compute Infrastructure

The model was trained on 8×A100 GPUs for ~22 days (roughly 8 × 22 × 24 ≈ 4,200 A100 GPU-hours).