arxiv:2606.17458

ICBCBench: An Industry Consortium Benchmark for Financial Deep Research

Published on Jun 16

Abstract

A consortium-driven benchmark for financial deep research is introduced that combines objective and subjective evaluation to assess both retrieval-reasoning accuracy and report quality in real-world applications.

With the rapid advancement of Deep Research Agents (DRAs) in knowledge-intensive domains such as finance, establishing reliable, domain-aligned evaluation standards remains a critical challenge. Existing benchmarks focus on either closed-ended question answering or open-ended report evaluation, and so fail to jointly capture the retrieval-reasoning accuracy and end-to-end research quality required in real-world workflows. We introduce ICBCBench, a consortium-driven benchmark for financial deep research, developed in collaboration with over 50 domain experts from more than 40 financial institutions and academic organizations. It adopts a dual-track paradigm that integrates objective tasks with verifiable answers and subjective long-form report evaluation. The two tracks give complementary assessments: the objective track measures retrieval-reasoning accuracy, and the subjective track measures end-to-end report quality in terms of expert alignment, citation consistency, and source quality. Experiments on state-of-the-art DRAs and large language models reveal substantial gaps in complex reasoning, factual grounding, and report quality, highlighting how difficult it remains to achieve industry-level performance. Our dataset and evaluation framework are available at https://github.com/DeepFin-Intelligence/ICBCBench.
