ruRoberta-large-rucola-science
This model is a fine-tuned version of p1746-lingua/ruRoberta-large-rucola, trained on a dataset of errors in scientific texts. It predicts whether a given sentence contains errors.
Key Features
- Task: Binary classification (correct vs. contains errors)
- Training data: https://huggingface.co/datasets/p1746-lingua/correct_incorrect_sents (~2.4 labeled sentences)
- Max sequence length: 512 tokens
- Fine-tuning framework: PyTorch + Hugging Face transformers
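A minimal inference sketch, assuming the checkpoint loads through the standard transformers `text-classification` pipeline. The helper names `load_classifier` and `classify` are illustrative and not part of the model card. The card does not say which label id means "correct" and which means "contains errors", so check `model.config.id2label` before interpreting the output.

```python
MODEL_ID = "p1746-lingua/ruRoberta-large-rucola-science"
MAX_LENGTH = 512  # max sequence length stated in the card


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (downloads the weights on first use)."""
    # Imported lazily so the helper below can be used without the heavy dependency.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify(classifier, sentences):
    """Return one (label, score) pair per sentence, truncating inputs to MAX_LENGTH tokens."""
    results = classifier(list(sentences), truncation=True, max_length=MAX_LENGTH)
    return [(r["label"], r["score"]) for r in results]


# Usage (requires network access to fetch the model):
# clf = load_classifier()
# print(classify(clf, ["Пример предложения из научного текста."]))
```

The label strings returned depend on the checkpoint's configuration (they may be generic names such as `LABEL_0` and `LABEL_1`), so the mapping to "correct" and "contains errors" is left to the reader to confirm.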
Hyperparameters
| Parameter | Value |
|---|---|
| Batch size | 32 |
| Learning rate | 1e-5 |
| Epochs | 64 |
| Warmup steps | 100 |
| Optimizer | adamw_bnb_8bit |
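The table above could map onto transformers `TrainingArguments` roughly as follows. This is a sketch under assumptions, not the authors' training script: the card does not say whether the Trainer API was used, nor whether the batch size of 32 is per device or the effective total (per device is assumed here). The `output_dir` argument and the helper name `make_training_args` are hypothetical. The `adamw_bnb_8bit` optimizer requires the bitsandbytes package.

```python
# Hyperparameters from the card, keyed by TrainingArguments field names.
HYPERPARAMS = {
    "per_device_train_batch_size": 32,  # assumption: the card's batch size is per device
    "learning_rate": 1e-5,
    "num_train_epochs": 64,
    "warmup_steps": 100,
    "optim": "adamw_bnb_8bit",  # 8-bit AdamW, needs bitsandbytes installed
}


def make_training_args(output_dir: str):
    """Build TrainingArguments from the card's hyperparameters."""
    # Imported lazily so HYPERPARAMS can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)


# Usage (hypothetical output directory):
# args = make_training_args("./rucola-science-checkpoints")
```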
Model tree for p1746-lingua/ruRoberta-large-rucola-science
- Base model: ai-forever/ruRoberta-large
- Fine-tuned from the base model: p1746-lingua/ruRoberta-large-rucola