Finding Blind Spots in Evaluator LLMs with Interpretable Checklists Paper • 2406.13439 • Published Jun 19, 2024 • 1
Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages Paper • 2410.16153 • Published Oct 21, 2024 • 44
Scaling Synthetic Data Creation with 1,000,000,000 Personas Paper • 2406.20094 • Published Jun 28, 2024 • 104
MILU: A Multi-task Indic Language Understanding Benchmark Paper • 2411.02538 • Published Nov 4, 2024 • 2
IndicTrans2 Collection Models (En-Indic, Indic-En, Indic-Indic) in 2 variants (base and dist) and benchmarks (IN22-Gen and IN22-Conv) released as part of IndicTrans2. • 10 items • Updated Sep 5 • 24
Open LLM Leaderboard best models ❤️‍🔥 Collection A daily uploaded list of models with the best evaluations on the LLM leaderboard. • 65 items • Updated Mar 20 • 656