ViDoRe V3: A Comprehensive Evaluation of Retrieval Augmented Generation in Complex Real-World Scenarios Paper • 2601.08620 • Published 27 days ago • 11
Does It Tie Out? Towards Autonomous Legal Agents in Venture Capital Paper • 2512.18658 • Published Dec 21, 2025 • 11
Surfer 2: The Next Generation of Cross-Platform Computer Use Agents Paper • 2510.19949 • Published Oct 22, 2025 • 39
ViDoRe Benchmark V2: Raising the Bar for Visual Retrieval Paper • 2505.17166 • Published May 22, 2025 • 1
ModernVBERT: Towards Smaller Visual Document Retrievers Paper • 2510.01149 • Published Oct 1, 2025 • 32
When Does Reasoning Matter? A Controlled Study of Reasoning's Contribution to Model Performance Paper • 2509.22193 • Published Sep 26, 2025 • 38
CometKiwi: IST-Unbabel 2022 Submission for the Quality Estimation Shared Task Paper • 2209.06243 • Published Sep 13, 2022
Looking for a Needle in a Haystack: A Comprehensive Study of Hallucinations in Neural Machine Translation Paper • 2208.05309 • Published Aug 10, 2022 • 1
Enhanced Hallucination Detection in Neural Machine Translation through Simple Detector Aggregation Paper • 2402.13331 • Published Feb 20, 2024 • 2
The Inside Story: Towards Better Understanding of Machine Translation Neural Evaluation Metrics Paper • 2305.11806 • Published May 19, 2023
Steering Large Language Models for Machine Translation with Finetuning and In-Context Learning Paper • 2310.13448 • Published Oct 20, 2023 • 1
xCOMET: Transparent Machine Translation Evaluation through Fine-grained Error Detection Paper • 2310.10482 • Published Oct 16, 2023 • 4
Scaling up COMETKIWI: Unbabel-IST 2023 Submission for the Quality Estimation Shared Task Paper • 2309.11925 • Published Sep 21, 2023
xTower: A Multilingual LLM for Explaining and Correcting Translation Errors Paper • 2406.19482 • Published Jun 27, 2024
Zero-shot Benchmarking: A Framework for Flexible and Scalable Automatic Evaluation of Language Models Paper • 2504.01001 • Published Apr 1, 2025 • 1