Fine-tune ready versions of the LLMSQL benchmark Collection This collection contains versions of the benchmark in a fine-tune-ready format • 2 items • Updated 9 days ago • 1
Beyond Language Modeling: An Exploration of Multimodal Pretraining Paper • 2603.03276 • Published 9 days ago • 87
The Million-Label NER: Breaking Scale Barriers with GLiNER bi-encoder Paper • 2602.18487 • Published 29 days ago • 5
Optimal Turkish Subword Strategies at Scale: Systematic Evaluation of Data, Vocabulary, Morphology Interplay Paper • 2602.06942 • Published Feb 6 • 3
MetricX-24 Collection A collection of MetricX-24 models (https://aclanthology.org/2024.wmt-1.35/) • 6 items • Updated about 13 hours ago • 11
FiNERweb: Datasets and Artifacts for Scalable Multilingual Named Entity Recognition Paper • 2512.13884 • Published Dec 15, 2025 • 15
Massively Multilingual Adaptation of Large Language Models Using Bilingual Translation Data Paper • 2506.00469 • Published May 31, 2025 • 4
Scaling Low-Resource MT via Synthetic Data Generation with LLMs Paper • 2505.14423 • Published May 20, 2025 • 2
Article Introducing MTEB v2: Evaluation of embedding and retrieval systems for more than just text Oct 20, 2025 • 37
Article Luth: Efficient French Specialization for Small Language Models Aug 11, 2025 • 18