Model Description

A fine-tuned XLM-RoBERTa-Uk model that restores punctuation and letter case in Ukrainian text.

How to Use

Download the get_predictions.py script from the repository and place it where Python can import it (for example, in your working directory).

from transformers import AutoTokenizer, AutoModelForTokenClassification
from get_predictions import recover_text

tokenizer = AutoTokenizer.from_pretrained('ukr-models/uk-punctcase')
model = AutoModelForTokenClassification.from_pretrained('ukr-models/uk-punctcase')

text = "..."
recover_text(text, model, tokenizer)
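
If you want to try the model on text that already has punctuation and capitalization, one option is to strip both first and compare the recovered output with the original. The sketch below assumes the model is meant to receive lowercase, unpunctuated input, which this card does not state explicitly. The helper name strip_punct_and_case is hypothetical and is not part of get_predictions.py.

```python
import unicodedata


def strip_punct_and_case(text: str) -> str:
    """Hypothetical helper: lowercase the text and drop punctuation.

    Apostrophes and hyphens are kept when they sit between two letters,
    since both occur inside Ukrainian words.
    """
    out = []
    for i, ch in enumerate(text):
        # Unicode general categories starting with "P" are punctuation.
        if unicodedata.category(ch).startswith("P"):
            inside_word = (
                0 < i < len(text) - 1
                and text[i - 1].isalpha()
                and text[i + 1].isalpha()
            )
            if ch in "'’-" and inside_word:
                out.append(ch)
            else:
                out.append(" ")
        else:
            out.append(ch.lower())
    # Collapse the runs of whitespace left behind by removed punctuation.
    return " ".join("".join(out).split())


sample = strip_punct_and_case("Привіт, світе! Як справи?")
print(sample)
```

The result can then be passed as the text argument to recover_text in the snippet above.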

Model size: 0.1B params (Safetensors). Tensor types: I64, F32.