metadata
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
CoCoReviewBench Model
This repository contains a model checkpoint from the paper CoCoReviewBench: A Completeness- and Correctness-Oriented Benchmark for AI Reviewers.
Introduction
CoCoReviewBench is a benchmark designed for reliable and fine-grained evaluation of AI reviewers. It curates 3,900 papers from ICLR and NeurIPS and focuses on two properties:
- Completeness: Reviews are evaluated category by category, so that a model is not penalized for issues that are missing from the human references.
- Correctness: Human reviews are filtered using the discussions among reviewers, authors, and meta-reviewers, so that the references are accurate.
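As a rough illustration of the completeness idea, the sketch below scores a model's review one category at a time and skips any category in which the human reference raises no issue, so the model is not marked down there. The category names, the `category_scores` function, and the exact-match set overlap are all assumptions made for this example; they are not the benchmark's actual protocol, which is defined in the paper and the official code.

```python
from typing import Dict, Set


def category_scores(
    model_issues: Dict[str, Set[str]],
    reference_issues: Dict[str, Set[str]],
) -> Dict[str, float]:
    """Hypothetical per-category recall against a human reference.

    Categories with no reference issues are skipped instead of being
    scored, so issues the model raises there are not counted against it.
    """
    scores: Dict[str, float] = {}
    for category, ref in reference_issues.items():
        if not ref:
            # The human reference is silent on this category: skip it.
            continue
        found = model_issues.get(category, set())
        scores[category] = len(found & ref) / len(ref)
    return scores


# Toy data, invented for illustration only.
reference = {
    "novelty": {"overlaps with prior work"},
    "experiments": {"missing ablation", "no error bars"},
    "clarity": set(),  # no human reference issue in this category
}
model = {
    "experiments": {"missing ablation"},
    "clarity": {"notation undefined"},
}

scores = category_scores(model, reference)
print(scores)
```

In this toy case the "clarity" category never appears in the result, because the reference has nothing to compare against there. A real evaluation would also need a way to decide when two differently worded issues match, which the exact string comparison here does not attempt.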
Links
- Paper: CoCoReviewBench: A Completeness- and Correctness-Oriented Benchmark for AI Reviewers
- Code: Official GitHub Repository
Citation
@inproceedings{deng2026cocoreviewbench,
title = {{CoCoReviewBench}: A Completeness- and Correctness-Oriented Benchmark for {AI} Reviewers},
author = {Deng, Hexuan and Li, Yichen and Ke, Xiaopeng and Hu, Ruina and Wong, Derek F. and Wang, Yue and Liu, Xuebo and Huang, Dehao and Zhang, Min},
booktitle = {Proceedings of the 43rd International Conference on Machine Learning},
series = {Proceedings of Machine Learning Research},
publisher = {PMLR},
year = {2026},
note = {To appear},
url = {https://github.com/hexuandeng/CoCoReviewBench}
}