xCOMET-lite: Bridging the Gap Between Efficiency and Quality in Learned MT Evaluation Metrics Paper • 2406.14553 • Published Jun 20, 2024 • 2
ViSTa Dataset: Do vision-language models understand sequential tasks? Paper • 2411.13211 • Published Nov 20, 2024
When Punctuation Matters: A Large-Scale Comparison of Prompt Robustness Methods for LLMs Paper • 2508.11383 • Published Aug 15, 2025 • 40
TikZero Collection Zero-Shot Text-Guided Graphics Program Synthesis • 5 items • Updated Jun 27, 2025 • 1
DeTikZify Collection Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ • 13 items • Updated Jun 27, 2025 • 30
NLLG Quarterly arXiv Report 09/24: What are the most influential current AI Papers? Paper • 2412.12121 • Published Dec 2, 2024
DeepSeek vs. o3-mini: How Well can Reasoning LLMs Evaluate MT and Summarization? Paper • 2504.08120 • Published Apr 10, 2025 • 3