arxiv:2602.23184

MTRAG-UN: A Benchmark for Open Challenges in Multi-Turn RAG Conversations

Published on Feb 26

Abstract

AI-generated summary

MTRAG-UN presents a benchmark for multi-turn retrieval-augmented generation containing 666 tasks across 6 domains to evaluate challenges in conversational information retrieval.

We present MTRAG-UN, a benchmark for exploring open challenges in multi-turn retrieval augmented generation, a popular use of large language models. We release a benchmark of 666 tasks containing over 2,800 conversation turns across 6 domains with accompanying corpora. Our experiments show that retrieval and generation models continue to struggle on conversations with UNanswerable, UNderspecified, and NONstandalone questions and UNclear responses. Our benchmark is available at https://github.com/IBM/mt-rag-benchmark
