---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
---

# CoCoReviewBench Model

This repository contains a model checkpoint from the paper [CoCoReviewBench: A Completeness- and Correctness-Oriented Benchmark for AI Reviewers](https://huggingface.co/papers/2605.07905).

## Introduction
CoCoReviewBench is a benchmark designed for reliable and fine-grained evaluation of AI reviewers. It curates 3,900 papers from ICLR and NeurIPS, focusing on:
- **Completeness**: Evaluating by category to avoid penalizing models for issues missing in human references.
- **Correctness**: Filtering human reviews using reviewer-author-meta-reviewer discussions to ensure accuracy.

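The completeness idea above can be illustrated with a minimal sketch. This is not the benchmark's actual metric: it assumes each review has already been reduced to a mapping from category to a set of issue labels, and the category names, the `per_category_recall` helper, and the exact-match comparison are all hypothetical. Only categories that the human reference covers are scored, so issues the model raises in other categories do not count against it.

```python
from typing import Dict, Set

# Hypothetical review format: category name -> set of issue labels.
Review = Dict[str, Set[str]]


def per_category_recall(model: Review, reference: Review) -> Dict[str, float]:
    """Score the model only on categories the human reference covers."""
    scores: Dict[str, float] = {}
    for category, ref_issues in reference.items():
        if not ref_issues:
            continue  # nothing to compare against in this category
        found = model.get(category, set()) & ref_issues
        scores[category] = len(found) / len(ref_issues)
    return scores


# Toy data, invented for illustration only.
reference = {
    "novelty": {"overlaps prior work"},
    "experiments": {"missing baseline", "no ablation"},
}
model = {
    "experiments": {"missing baseline"},
    "clarity": {"undefined notation"},  # absent from the reference, so not scored
}

scores = per_category_recall(model, reference)
print(scores)  # {'novelty': 0.0, 'experiments': 0.5}
```

The `clarity` issue is ignored rather than treated as a false positive, which is one way to read "avoid penalizing models for issues missing in human references". For the real evaluation procedure, see the official repository.
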
## Links
- **Paper**: [CoCoReviewBench: A Completeness- and Correctness-Oriented Benchmark for AI Reviewers](https://huggingface.co/papers/2605.07905)
- **Code**: [Official GitHub Repository](https://github.com/hexuandeng/CoCoReviewBench)

## Citation

```bibtex
@inproceedings{deng2026cocoreviewbench,
  title = {{CoCoReviewBench}: A Completeness- and Correctness-Oriented Benchmark for {AI} Reviewers},
  author = {Deng, Hexuan and Li, Yichen and Ke, Xiaopeng and Hu, Ruina and Wong, Derek F. and Wang, Yue and Liu, Xuebo and Huang, Dehao and Zhang, Min},
  booktitle = {Proceedings of the 43rd International Conference on Machine Learning},
  series = {Proceedings of Machine Learning Research},
  publisher = {PMLR},
  year = {2026},
  note = {To appear},
  url = {https://github.com/hexuandeng/CoCoReviewBench}
}
```