Lang UK

non-profit

https://lang.org.ua/en/

AI & ML interests

NLP

Recent Activity

stefan-it submitted a paper 7 days ago

Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling

robinhad new activity 7 days ago

lang-uk/ukrainian-llm-leaderboard-results:Results for Gemma4 reasoning

dchaplinsky updated a dataset 13 days ago

lang-uk/qe-vs-human

submitted a paper to Daily Papers 7 days ago

Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling

Paper • 2604.28075 • Published 12 days ago • 19

in lang-uk/ukrainian-llm-leaderboard-results 7 days ago

Results for Gemma4 reasoning

#4 opened 7 days ago by transhumanist-already-exists

updated a dataset 13 days ago

lang-uk/qe-vs-human

Viewer • Updated 13 days ago • 20k • 28

published 2 datasets 15 days ago

lang-uk/perestoroha-ocr

Updated 15 days ago • 20

lang-uk/qe-vs-human

Viewer • Updated 13 days ago • 20k • 28

in lang-uk/ukrainian-llm-leaderboard-results about 2 months ago

Results for FP8 static-quantized MamayLM-12B-IT

#1 opened about 2 months ago by

submitted a paper to Daily Papers 3 months ago

FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale

Paper • 2601.22146 • Published Jan 29 • 11

updated a Space 5 months ago

Ukrainian LLM Leaderboard

Measuring LLM capabilities to process Ukrainian texts

published a Space 5 months ago

Ukrainian LLM Leaderboard

Measuring LLM capabilities to process Ukrainian texts

updated a dataset 5 months ago

lang-uk/ukrainian-llm-leaderboard-results

Updated 7 days ago • 67 • 1

published a dataset 5 months ago

lang-uk/ukrainian-llm-leaderboard-results

Updated 7 days ago • 67 • 1

authored a paper 7 months ago

SindBERT, the Sailor: Charting the Seas of Turkish NLP

Paper • 2510.21364 • Published Oct 24, 2025 • 2

updated a dataset 7 months ago

lang-uk/FiftyFiveShades

Viewer • Updated Oct 20, 2025 • 53.5M • 236 • 4

updated a model 7 months ago

lang-uk/ukr-clip-vit-h-14-frozen-xlm-roberta-large-laion5B-s13B-b90k

Zero-Shot Image Classification • Updated Oct 17, 2025 • 2

authored a paper 7 months ago

The German Commons - 154 Billion Tokens of Openly Licensed Text for German Language Models

Paper • 2510.13996 • Published Oct 15, 2025 • 9

published a model 7 months ago

lang-uk/ukr-clip-vit-h-14-frozen-xlm-roberta-large-laion5B-s13B-b90k

Zero-Shot Image Classification • Updated Oct 17, 2025 • 2

updated a dataset 7 months ago

lang-uk/WikiEdits-MultiGEC

Viewer • Updated Oct 6, 2025 • 61.5k • 60 • 4

authored a paper 8 months ago

Introducing OmniGEC: A Silver Multilingual Dataset for Grammatical Error Correction

Paper • 2509.14504 • Published Sep 18, 2025

updated a collection 8 months ago

OmniGEC

This is a collection of multilingual silver-standard datasets and models for the task of Grammatical Error Correction (GEC). • 9 items • Updated Sep 19, 2025 • 8