Evaluating Arabic Large Language Models: A Survey of Benchmarks, Methods, and Gaps • Paper 2510.13430 • Published Oct 15, 2025