TUKE-KEMT/slavic-t5-base

text2text-generation

text-generation-inference


Slavic T5 Base

The aim of this model is to achieve the best possible results for Slavic languages written in Latin script.

It is suitable for tasks such as:

summarization,
extractive question answering,
machine translation between Slavic languages written in Latin script.
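
A minimal loading sketch for the tasks listed above. It assumes the checkpoint is available on the Hugging Face Hub under the ID `TUKE-KEMT/slavic-t5-base` and loads with the generic `transformers` seq2seq classes. The helper names `build_input` and `generate` and the `summarize` prefix are hypothetical and do not come from the model card. The card does not say whether the checkpoint has been fine-tuned, so fine-tuning on the target task may be needed before the output is useful.

```python
MODEL_ID = "TUKE-KEMT/slavic-t5-base"  # assumed Hub ID, taken from the page header


def build_input(task_prefix: str, text: str) -> str:
    """Format a text-to-text input as '<prefix>: <text>' (the prefix is hypothetical)."""
    return f"{task_prefix.strip()}: {text.strip()}"


def generate(text: str, max_new_tokens: int = 64) -> str:
    """Load tokenizer and model, then generate output for one input string."""
    # Imported lazily so build_input works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate(build_input("summarize", "..."))` downloads the weights on first use; in a fine-tuning setup the same `build_input` format would be applied to the training examples.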

The model was trained on selected parts of the OSCAR corpus and the MaCoCu corpus.

It supports these languages: Czech, Croatian, Polish, Slovak, and Slovenian.

The vocabulary has 120,000 tokens and includes capital letters (it is cased).
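
One way to check the vocabulary properties stated above is to inspect a loaded tokenizer. This is a sketch only: `summarize_tokenizer` is a hypothetical helper, and it assumes the object passed in behaves like a Hugging Face tokenizer (supports `len()`, `encode`, and `decode`).

```python
def summarize_tokenizer(tokenizer, sample: str) -> dict:
    """Report vocabulary size and whether capital letters survive a round trip."""
    ids = tokenizer.encode(sample, add_special_tokens=False)
    decoded = tokenizer.decode(ids, skip_special_tokens=True)
    return {
        "vocab_size": len(tokenizer),  # may include added special tokens
        "num_tokens": len(ids),
        # An uncased vocabulary would return the text in lowercase only.
        "keeps_case": any(ch.isupper() for ch in decoded),
    }
```

With a tokenizer obtained from `AutoTokenizer.from_pretrained("TUKE-KEMT/slavic-t5-base")`, the reported size should be close to the 120,000 tokens stated above if the card is accurate.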


Model size: 0.4B params (Safetensors)

Tensor type: F32
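
As a rough back-of-the-envelope sketch, the reported parameter count and tensor type give an estimate of the memory needed for the weights alone. The function name `weight_memory_bytes` is hypothetical, and because 0.4B is a rounded figure the result is only approximate. Activations and optimizer state are not included.

```python
def weight_memory_bytes(num_params: float, bytes_per_param: int = 4) -> float:
    """Estimate weight storage: parameter count times bytes per parameter."""
    return num_params * bytes_per_param


# 0.4B parameters stored as F32 (4 bytes each).
approx = weight_memory_bytes(0.4e9)
print(f"{approx / 1e9:.1f} GB")  # prints "1.6 GB"
```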
