Article Best Practices for Open Multilingual LLM Evaluation catherinearnett • May 7, 2025 • 8
Article An Analysis of Multilingual Models on Hugging Face catherinearnett • Sep 18, 2025 • 6
MMTEB: Massive Multilingual Text Embedding Benchmark Paper • 2502.13595 • Published Feb 19, 2025 • 49
BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation Paper • 2402.03216 • Published Feb 5, 2024 • 10
Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs Paper • 2603.09095 • Published Mar 10 • 29
ChiKhaPo: A Large-Scale Multilingual Benchmark for Evaluating Lexical Comprehension and Generation in Large Language Models Paper • 2510.16928 • Published Oct 19, 2025 • 4
SynthTextEval: Synthetic Text Data Generation and Evaluation for High-Stakes Domains Paper • 2507.07229 • Published Jul 9, 2025 • 11
Article mmBERT: ModernBERT goes Multilingual mmarone, orionweller, will-fleshman, eugene-yang, dlawrie, vandurme • Sep 9, 2025 • 148
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens Paper • 2504.07096 • Published Apr 9, 2025 • 78
Seq vs Seq: An Open Suite of Paired Encoders and Decoders Paper • 2507.11412 • Published Jul 15, 2025 • 33
The Translation Barrier Hypothesis: Multilingual Generation with Large Language Models Suffers from Implicit Translation Failure Paper • 2506.22724 • Published Jun 28, 2025 • 10