Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling
Paper • 2604.28075 • Published • 18
FiNERweb: Datasets and Artifacts for Scalable Multilingual Named Entity Recognition
FLERT: Document-Level Features for Named Entity Recognition