Qwen3 models (123M/300M/600M) trained from scratch on 2.47B kk+ru tokens. Includes tokenizer, datasets, and checkpoints.
Saken Tukenov PRO
stukenov
AI & ML interests
None yet
Recent Activity
liked a model 6 days ago
AIDC-AI/Marco-Mini-Base
liked a model 6 days ago
AxisCommunity/OrionPaxAI_1.0V
updated a dataset 7 days ago
stukenov/sozkz-corpus-tokenized-kk-morphbpe256k-v1
Organizations
models 74
stukenov/sozkz-morphbpe-256k-kk-v1
Token Classification • Updated
stukenov/sozkz-fix-mt5-50m-kk-gec-v1
Text Generation • 50.6M • Updated • 19
stukenov/sozkz-nllb-1b-kk-pretrain-v1
Translation • 1B • Updated • 51
stukenov/sozkz-nllb-1b-kk-gec-v1
1B • Updated • 64
stukenov/sozkz-fix-mt5b-kk-gec-run13-v1
Text Generation • 0.6B • Updated • 3
stukenov/sozkz-fix-qwen-500m-kk-gec-v1
Text Generation • 0.4B • Updated • 11
stukenov/sozkz-fix-qwen-500m-kk-gec-v2
Text Generation • 0.4B • Updated • 33
stukenov/sozkz-fix-qwen-500m-kk-gec-v3
Text Generation • 0.4B • Updated • 321
stukenov/sozkz-fix-qwen-500m-kk-gec-v4
Text Generation • 0.4B • Updated • 317
stukenov/sozkz-core-llama-300m-kk-gec-v1
Text Generation • 0.3B • Updated • 2
datasets 54
stukenov/sozkz-corpus-tokenized-kk-morphbpe256k-v1
Viewer • Updated • 1.51M • 45
stukenov/sozkz-corpus-segmented-kk-v1
Viewer • Updated • 55.5M • 385
stukenov/sozkz-corpus-gec-benchmark-kk-v1
Viewer • Updated • 1.44k • 167
stukenov/sozkz-corpus-pretrain-gec-mix-v1
Viewer • Updated • 1.77M • 82
stukenov/sozkz-corpus-synthetic-kk-gec-rulebased-v1
Viewer • Updated • 1.06M • 37
stukenov/sozkz-corpus-synthetic-kk-gec-v1
Viewer • Updated • 19.3k • 63
stukenov/sozkz-gec-synthetic-gpt4o-v1
Viewer • Updated • 9.6k • 34
stukenov/sozkz-corpus-clean-v3
Viewer • Updated • 13.5M • 45
stukenov/sozkz-corpus-instruct-kk-alpaca-qwen35-v1
Viewer • Updated • 4.88k • 23 • 1
stukenov/kaznet-crawl-raw
Viewer • Updated • 1.55M • 7 • 1