Small Language Models for Kazakh: models, tokenizers, and datasets for Kazakh language modeling.
- stukenov/sozkz-core-llama-50m-kk-base-v2
  50.6M • Updated • 30 • 2
- stukenov/sozkz-corpus-clean-kk-pretrain-v2
  Viewer • Updated • 1.02M • 45
- stukenov/sozkz-corpus-clean-kk-text-v2
  Viewer • Updated • 19M • 21
- stukenov/sozkz-core-pythia-14m-kk-dapt-v1
  Text Generation • 14.1M • Updated • 31