---
license: apache-2.0
model_type:
  - reward-model
domain:
  - nlp
language:
  - en
  - zh
metrics:
  - spearman
tags:
  - evaluator
  - reward-model
  - rebuttal
tools:
  - vllm
  - openai api
---

# Rebuttal-RM 🏅

## 1 Introduction

Rebuttal-RM is an open-source scorer that emulates human reviewers when judging an author’s reply to a peer-review comment.

### 1.1 Purpose

- Produce four rubric-aligned 0-to-10 scores:
  1. **Attitude** (tone & professionalism)
  2. **Clarity** (logical flow)
  3. **Persuasiveness** (strength of evidence)
  4. **Constructiveness** (actionable improvement)
- Return a short textual explanation for transparency.
- Serve both as (i) a public benchmark and (ii) the reward signal for RebuttalAgent RL.

### 1.2 Model I/O

```text
Input : {relevant paper chunks} + {full review R_i} + {target comment} + {candidate response}
Output: { "score": {Attitude, Clarity, Persuasiveness, Constructiveness}, "explanation": "..." }
```

### 1.3 Model Recipe

- Backbone: Qwen-3-8B-Chat
- Supervised fine-tuning on the RM_Bench dataset

The resulting model offers reproducible, high-fidelity evaluation of rebuttal quality.


## 2 Performance (agreement with human ratings)

**Table – Consistency between automatic scores and human ratings** (higher = better)

| Scoring Model | Attitude r | Attitude β | Attitude f | Clarity r | Clarity β | Clarity f | Persuasiveness r | Persuasiveness β | Persuasiveness f | Constructiveness r | Constructiveness β | Constructiveness f | Avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Qwen-3-8B | 0.718 | 0.672 | 0.620 | 0.609 | 0.568 | 0.710 | 0.622 | 0.577 | 0.690 | 0.718 | 0.745 | 0.720 | 0.664 |
| Llama-3-8B | 0.297 | 0.347 | 0.540 | 0.158 | 0.047 | 0.380 | 0.272 | 0.245 | 0.560 | 0.424 | 0.457 | 0.460 | 0.349 |
| GLM-4-9B | 0.420 | 0.475 | 0.460 | 0.467 | 0.436 | 0.730 | 0.369 | 0.361 | 0.700 | 0.561 | 0.519 | 0.570 | 0.506 |
| GPT-4.1 | 0.743 | 0.712 | 0.800 | 0.739 | 0.671 | 0.750 | 0.779 | 0.763 | 0.740 | 0.804 | 0.756 | 0.680 | 0.745 |
| DeepSeek-R1 | 0.646 | 0.633 | 0.790 | 0.708 | 0.615 | 0.760 | 0.710 | 0.664 | 0.720 | 0.742 | 0.701 | 0.620 | 0.705 |
| DeepSeek-V3 | 0.699 | 0.733 | 0.710 | 0.687 | 0.578 | 0.740 | 0.697 | 0.652 | 0.770 | 0.771 | 0.719 | 0.750 | 0.692 |
| Gemini-2.5 | 0.620 | 0.509 | 0.750 | 0.605 | 0.593 | 0.540 | 0.627 | 0.607 | 0.520 | 0.711 | 0.705 | 0.610 | 0.616 |
| Claude-3.5 | 0.569 | 0.635 | 0.720 | 0.704 | 0.670 | 0.680 | 0.706 | 0.686 | 0.670 | 0.753 | 0.738 | 0.630 | 0.680 |
| **Rebuttal-RM** | **0.839** | **0.828** | **0.910** | **0.753** | **0.677** | **0.790** | **0.821** | **0.801** | **0.820** | **0.839** | **0.835** | **0.810** | **0.812** |
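The card lists Spearman correlation as its agreement metric. For readers who want to reproduce this kind of agreement check against their own human ratings, here is a minimal pure-Python sketch of Spearman's ρ with average ranks for ties (the helper names are my own, not part of the released code):

```python
def average_ranks(xs):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the group of values tied with xs[order[i]]
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(a, b):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5

# Perfectly concordant model/human scores give rho = 1.0
print(spearman([7, 8, 6, 9], [6, 9, 5, 10]))  # → 1.0
```

In practice you would pass the model's per-dimension scores and the corresponding human ratings over the benchmark set.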

## 3 Deployment / Usage

### 3.1 Run with vLLM (OpenAI protocol)

```bash
pip install "vllm>=0.4.2"
python -m vllm.entrypoints.openai.api_server \
  --model Zhitao-He/Rebuttal-RM \
  --dtype auto \
  --port 8000
```

### 3.2 Query via the official `openai` SDK

```python
import openai

openai.api_key  = "EMPTY"  # vLLM does not check the key
openai.base_url = "http://localhost:8000/v1"

# Fill in the four placeholders with your own data;
# system_prompt is the default prompt given in Section 3.3.
user_prompt = f"""
The whole review content is: {Full_Review_Content}
The target comment content is: {Target_Comment}.
The best relevant paper fragment is: {Relevant_Paper_Fragment}.
The response is: {response}
"""

resp = openai.chat.completions.create(
    model="Zhitao-He/Rebuttal-RM",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user",   "content": user_prompt},
    ],
    temperature=0.7,
)
print(resp.choices[0].message.content)
```
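Because the system prompt mandates strictly valid JSON, the reply can be parsed directly. A minimal sketch of turning one reply into per-dimension scores and a scalar reward (the hard-coded `raw` string stands in for `resp.choices[0].message.content`, and the equal-weight average is my own illustrative choice, not the documented RebuttalAgent aggregation):

```python
import json

# Illustrative model output; in practice use resp.choices[0].message.content.
raw = ('{"score": {"Attitude": 9, "Clarity": 8, '
       '"Persuasiveness": 7, "Constructiveness": 8}, '
       '"score_explanation": "..."}')

result = json.loads(raw)  # the prompt requires strictly valid JSON
dims = ("Attitude", "Clarity", "Persuasiveness", "Constructiveness")
scores = {d: int(result["score"][d]) for d in dims}

# One possible scalar reward: the unweighted mean of the four rubric scores.
reward = sum(scores.values()) / len(dims)
print(scores, reward)  # reward = 8.0
```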

### 3.3 Default System Prompt (`system_prompt`)

```text
You are a seasoned academic reviewer and response-optimization expert. Your task is to evaluate the quality of the response based on the review comments, paper fragments, and the authors' responses. Please strictly follow the requirements below, and output only the score and score explanation.

Input variables:
- review_content: Complete content of the review comments.
- similar_paper_fragments: Best paper fragment most relevant to the comment.
- comment: Specific segment of the review comments.
- original_response: The authors' original response text to the comment.

Your task: Based on the input information, output only a JSON object containing the following two items.

Scoring standard (score range 0-10):
- 0: Wholly Ineffective
- 1-2: Perfunctory
- 3-4: Unconvincing
- 5-6: Addresses Some Concerns
- 7-8: Exceptional
- 9-10: Outstanding

score: Four-dimensional score breakdown, each ranging from 0-10:
- Attitude: The tone and professionalism of the response.
- Clarity: The logic, structure, and focus of the response.
- Persuasiveness: The effectiveness of the argumentation and evidence support.
- Constructiveness: The commitment to revisions and specific actions taken.

score_explanation: A brief explanation of each score, specifically citing key points from the response text that reflect the scores and any shortcomings.

Output requirements:
- Output only the JSON object; do not include any other characters or explanations.
- The scoring must be reasonable, and the score explanation must clearly reference the original text that reflects the score.
- All output must be in formal, polite academic English.
- Your output must be strictly valid JSON that can be loaded directly with json.loads().

Output format example:
{ "score": { "Attitude": <int>,
             "Clarity": <int>,
             "Persuasiveness": <int>,
             "Constructiveness": <int> },
  "score_explanation": <explanation for your given score> }
```

## 4 Citation

```bibtex
@inproceedings{he2025rebuttalagent,
  title       = {RebuttalAgent: Strategic Persuasion in Academic Rebuttal via Theory of Mind},
  author      = {Zhitao He and Zongwei Lyu and Wuzhenhai Dai and Yi R. (May) Fung},
  year        = {2025},
  institution = {Hong Kong University of Science and Technology},
  url         = {https://arxiv.org/abs/YYMM.NNNNN}
}
```