ruBert-base for Punctuation Correction

The model is based on ruBert-base and has been fine-tuned to place punctuation marks in Russian sentences. For each word, it predicts the mark that follows it.

Some additional info about the model:

Fine-Tuning Source: The model was fine-tuned on a diverse dataset of more than 20,000 paragraphs from Russian literary works.
Supported Classes: For each word, the model predicts which of the following marks comes after it: ? ! . , : ... or a space, meaning no punctuation mark (class O).
Input Format: For best results, provide the input text without punctuation marks. The model does not process changes in letter case.
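Because the model predicts one class per word, turning its output back into text comes down to appending each predicted mark to its word. The snippet below is a minimal sketch of that step only. It assumes the label strings are the marks listed above plus "O", which may not match the labels stored in the model's actual configuration, and the helper name apply_labels is hypothetical.

```python
from typing import List

# Assumed label set: the marks listed in the model card, plus "O" for "no mark".
# The real label strings in the model's config may differ.
MARKS = {"?", "!", ".", ",", ":", "..."}


def apply_labels(words: List[str], labels: List[str]) -> str:
    """Attach one predicted mark (or nothing, for "O") after each word."""
    if len(words) != len(labels):
        raise ValueError("expected exactly one label per word")
    pieces = []
    for word, label in zip(words, labels):
        if label == "O":
            pieces.append(word)          # no punctuation after this word
        elif label in MARKS:
            pieces.append(word + label)  # mark goes directly after the word
        else:
            raise ValueError(f"unknown label: {label!r}")
    return " ".join(pieces)


words = "привет как дела".split()
labels = [",", "O", "?"]  # hand-written labels, not real model output
print(apply_labels(words, labels))  # привет, как дела?
```

The labels in the example are written by hand for illustration; in practice they would come from the model's per-word predictions.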

Usage Guidelines

To use the model effectively, follow these guidelines:

Input Text: Provide the model with text from which punctuation marks have been removed.
Letter Case: The model does not recognize changes in letter case.
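The guidelines above can be sketched as a small preprocessing step that removes punctuation before the text is passed to the model. This is an illustrative assumption rather than the author's pipeline: the helper name strip_punctuation is hypothetical, and the set of characters to remove is a guess that goes slightly beyond the marks the model predicts.

```python
import re

# Characters treated as punctuation here are an assumption: the marks the
# model predicts, plus a few other common ones (semicolon, ellipsis character,
# long dash, quotes, parentheses). Hyphens are kept so that words such as
# "кто-то" stay intact.
_PUNCT_RE = re.compile(r"[?!.,:;\u2026\u2014\"()«»]")


def strip_punctuation(text: str) -> str:
    """Remove punctuation and collapse whitespace; letter case is left as-is."""
    no_punct = _PUNCT_RE.sub(" ", text)
    return " ".join(no_punct.split())


print(strip_punctuation("Привет, как дела?"))  # Привет как дела
```

Letter case is deliberately left untouched in this sketch, since the card states only that the model does not deal with case changes.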

Authors

Mark Stolyarov

Model size: 0.2B params (Safetensors, tensor type F32)