---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
base_model: qwen/Qwen3-8B
tags:
- academic-rebuttal
- agentic-framework
- rl
---

# DRPG Judge Model

This repository contains the Judge Model for the **DRPG (Decompose, Retrieve, Plan, Generate)** framework, as introduced in the paper [DRPG (Decompose, Retrieve, Plan, Generate): An Agentic Framework for Academic Rebuttal](https://huggingface.co/papers/2601.18081).

The model is specifically designed to evaluate the quality of academic rebuttals. It was trained from **Qwen3-8B** using Reinforcement Learning (RL) to provide accurate and persuasive assessment scores.

## Links
- **Paper:** [DRPG: An Agentic Framework for Academic Rebuttal](https://huggingface.co/papers/2601.18081)
- **Repository:** [ulab-uiuc/DRPG-RebuttalAgent](https://github.com/ulab-uiuc/DRPG-RebuttalAgent)

## About DRPG
DRPG is an agentic framework for automatic academic rebuttal generation that operates through four steps:
1. **Decompose**: Breaking reviews into atomic concerns.
2. **Retrieve**: Finding relevant evidence from the paper.
3. **Plan**: Identifying feasible rebuttal strategies.
4. **Generate**: Creating targeted responses.

The Judge Model is used within this pipeline to assess rebuttal quality, achieving performance beyond the average human level in experimental evaluations.

## Usage
Refer to the official [GitHub repository](https://github.com/ulab-uiuc/DRPG-RebuttalAgent) for instructions on running the evaluation scripts and using the model within the DRPG pipeline.

## Citation
If you find this model useful in your research, please cite:
```bibtex
@article{han2025drpg,
  title={DRPG (Decompose, Retrieve, Plan, Generate): An Agentic Framework for Academic Rebuttal},
  author={Han, Peixuan and Yu, Yingjie and Xu, Jingjun and You, Jiaxuan},
  journal={arXiv preprint arXiv:2601.18081},
  url={https://arxiv.org/pdf/2601.18081},
  year={2026}
}
```