Elvis-t9 committed on
Commit
9050771
·
verified ·
1 Parent(s): 75eae10

update model card

add model intro, quick start, citation, etc.

Files changed (1)
  1. README.md +221 -1
README.md CHANGED
@@ -9,4 +9,224 @@ library_name: transformers
  tags:
  - rm
  - cr
- ---
+ ---
+
+ # SWE-CARE-RM
+
+ This model is a custom reward model built on top of **Qwen3-8B** with:
+
+ - a merged **LoRA** adapter
+ - an additional **projector head**
+ - a scalar reward output in **[0, 1]**
+
+ The model is designed to score the quality of a review conditioned on:
+
+ 1. an issue / problem statement
+ 2. a code patch
+ 3. a candidate review
+
+ A higher score means the model judges the review to be better for the given issue and patch.
+
+ ## Model Architecture
+
+ The model consists of:
+
+ - base model: **Qwen3-8B**
+ - adaptation: **LoRA** (merged into the decoder weights)
+ - reward head: a custom **MLP projector**
+ - final score: `sigmoid(projector(last_hidden_state[:, -1]))`
+
+ This repository contains the **merged decoder weights** together with `projector.pth`.
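
The final-score formula above can be illustrated in plain Python. This is only a sketch: it assumes the projector has already produced one scalar per sequence position, and `reward_from_logits` and the toy values are hypothetical names and numbers, not part of the repository.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def reward_from_logits(per_position_logits: Sequence[float]) -> float:
    """Pick the projector output at the last position and squash it."""
    return sigmoid(per_position_logits[-1])


# Hypothetical projector outputs for a three-token sequence.
toy_logits = [0.3, -1.2, 2.0]
score = reward_from_logits(toy_logits)
print(score)
```

Only the last position contributes to the score; the earlier positions influence it indirectly through the decoder's attention.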
+
+ ## Input Format
+
+ The model expects three text fields:
+
+ - `issue`
+ - `patch`
+ - `review`
+
+ During inference, the input is formatted as:
+
+ ```text
+ <issue>{issue}</issue><patch>{patch}</patch><review>{review}<review>
+ ```
+
+ The score is computed from the hidden state of the last token.
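
In the Quick Start script, the three fields are tokenized as separate segments rather than as one string: the issue and the patch are wrapped in their tags, and the review text is passed as-is. A small sketch of that preparation step, where `build_segments` is a hypothetical helper name used only for illustration:

```python
def build_segments(issue: str, patch: str, review: str) -> dict:
    """Prepare the three text segments the way the Quick Start script does."""
    return {
        "issue": f"<issue>{issue}</issue>",
        "patch": f"<patch>{patch}</patch>",
        "review": review,  # the script tokenizes the review without tags
    }


segments = build_segments(
    issue="Crash on empty input",
    patch="- return x[0]\n+ return x[0] if x else None",
    review="Guarding the empty case looks right; please add a test.",
)
print(segments["issue"])
```

Keeping the segments separate is what lets the script truncate the patch and the review independently before concatenating the token ids.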
+
+ ## Quick Start
+
+ ```python
+ from pathlib import Path
+ import json
+
+ import torch
+ import torch.nn as nn
+ from huggingface_hub import snapshot_download
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+
+ MODEL_DIR = "codefuse-ai/SWE-CARE-RM"
+ MAX_SEQ_LEN = 51200
+ MIN_REVIEW_LEN = 4096  # token budget reserved for the tail of the review
+ TRUST_REMOTE_CODE = True
+
+ # MODEL_DIR may be a local checkout or a Hub repo id; in the latter case,
+ # download a snapshot so the extra files below can be opened from disk.
+ local_dir = Path(MODEL_DIR)
+ if not local_dir.is_dir():
+     local_dir = Path(snapshot_download(MODEL_DIR))
+
+ # Use the first record of the bundled sample file as the example input.
+ with open(local_dir / "data_sample.jsonl", "r", encoding="utf-8") as fr:
+     json_data = json.loads(fr.readline())
+
+ SAMPLE = {
+     "issue": json_data["problem_statement"],
+     "patch": json_data["patch_to_review"],
+     "review": json_data["pos_review"][0],
+ }
+
+
+ class Projector(nn.Module):
+     """MLP reward head; `arch` has the form "mlp{N}x_relu" (N linear layers)."""
+
+     def __init__(self, arch, input_size, hidden_size, use_bf16):
+         super().__init__()
+         depth = int(arch[len("mlp"): arch.index("x_relu")])
+
+         def linear(in_features, out_features):
+             layer = nn.Linear(in_features, out_features)
+             return layer.bfloat16() if use_bf16 else layer
+
+         layers = [linear(input_size, hidden_size)]
+         for _ in range(1, depth):
+             layers.append(nn.ReLU())
+             layers.append(linear(hidden_size, 1))
+         self.model = nn.Sequential(*layers)
+
+     def forward(self, x):
+         return self.model(x)
+
+
+ def resolve_dtype(dtype_name):
+     if dtype_name in {"bf16", "bfloat16"}:
+         return torch.bfloat16
+     if dtype_name in {"fp16", "float16"}:
+         return torch.float16
+     return torch.float32
+
+
+ def infer_proj_arch(projector_state_dict):
+     # Count the linear layers stored in the checkpoint to recover the depth.
+     linear_weight_keys = [
+         k for k in projector_state_dict
+         if k.startswith("model.") and k.endswith(".weight")
+     ]
+     return f"mlp{len(linear_weight_keys)}x_relu"
+
+
+ def process_one(issue_ids, issue_masks, patch_ids, patch_masks, review_ids,
+                 review_masks, max_len, min_review_len):
+     # Keep the whole issue, the tail of the review, and as much of the
+     # head of the patch as still fits into max_len.
+     review_keep = min(min_review_len, len(review_ids))
+     remain_for_patch = max(max_len - len(issue_ids) - review_keep, 0)
+     patch_keep = min(len(patch_ids), remain_for_patch)
+
+     ids_all = issue_ids + patch_ids[:patch_keep] + review_ids[-review_keep:]
+     masks_all = issue_masks + patch_masks[:patch_keep] + review_masks[-review_keep:]
+
+     # Left-pad so that the last position is always a real token.
+     if len(ids_all) < max_len:
+         pad_len = max_len - len(ids_all)
+         ids_all = [0] * pad_len + ids_all
+         masks_all = [0] * pad_len + masks_all
+
+     return ids_all[:max_len], masks_all[:max_len]
+
+
+ # Optional settings stored next to the weights.
+ reward_config = {}
+ reward_config_path = local_dir / "reward_config.json"
+ if reward_config_path.exists():
+     with open(reward_config_path, "r", encoding="utf-8") as f:
+         reward_config = json.load(f)
+
+ projector_path = local_dir / "projector.pth"
+ projector_state_dict = torch.load(projector_path, map_location="cpu")
+ proj_arch = reward_config.get("proj_arch") or infer_proj_arch(projector_state_dict)
+ torch_dtype = resolve_dtype(reward_config.get("torch_dtype") or "bfloat16")
+ attn_implementation = reward_config.get("attn_implementation")
+
+ tokenizer = AutoTokenizer.from_pretrained(
+     local_dir, trust_remote_code=TRUST_REMOTE_CODE, padding_side="left"
+ )
+
+ model_kwargs = {"trust_remote_code": TRUST_REMOTE_CODE, "torch_dtype": torch_dtype}
+ if attn_implementation:
+     model_kwargs["attn_implementation"] = attn_implementation
+ decoder = AutoModelForCausalLM.from_pretrained(local_dir, **model_kwargs)
+
+ projector = Projector(proj_arch, decoder.config.hidden_size,
+                       decoder.config.hidden_size, torch_dtype == torch.bfloat16)
+ projector.load_state_dict(projector_state_dict)
+
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+ decoder.to(device).eval()
+ # Match the decoder dtype so the head also works for fp16 / fp32 runs.
+ projector.to(device=device, dtype=torch_dtype).eval()
+
+ issue_inputs = tokenizer(f"<issue>{SAMPLE['issue']}</issue>", padding=False,
+                          truncation="longest_first")
+ patch_inputs = tokenizer(f"<patch>{SAMPLE['patch']}</patch>", padding=False,
+                          truncation="longest_first")
+ review_inputs = tokenizer(SAMPLE["review"], padding=False, truncation="longest_first")
+
+ input_ids, attention_mask = process_one(
+     issue_inputs["input_ids"],
+     issue_inputs["attention_mask"],
+     patch_inputs["input_ids"],
+     patch_inputs["attention_mask"],
+     review_inputs["input_ids"],
+     review_inputs["attention_mask"],
+     max_len=MAX_SEQ_LEN,
+     min_review_len=MIN_REVIEW_LEN,
+ )
+
+ inputs = {
+     "input_ids": torch.tensor([input_ids], dtype=torch.long, device=device),
+     "attention_mask": torch.tensor([attention_mask], dtype=torch.long, device=device),
+ }
+
+ with torch.no_grad():
+     hidden_state = decoder(**inputs, output_hidden_states=True).hidden_states[-1]
+     # Score the hidden state at the last position only.
+     reward = torch.sigmoid(projector(hidden_state[:, -1]).squeeze(-1)).item()
+
+ print(reward)
+ ```
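
The truncation and padding step in `process_one` is easier to follow on toy token lists. The sketch below is a simplified restatement without attention-mask inputs; `pack_tokens`, `review_budget`, and the token ids are hypothetical and chosen only for illustration.

```python
def pack_tokens(issue_ids, patch_ids, review_ids, max_len, review_budget):
    """Keep the issue, the head of the patch, and the tail of the review."""
    review_keep = min(review_budget, len(review_ids))
    remain_for_patch = max(max_len - len(issue_ids) - review_keep, 0)
    patch_keep = min(len(patch_ids), remain_for_patch)

    ids = (
        issue_ids
        + patch_ids[:patch_keep]
        + review_ids[len(review_ids) - review_keep:]
    )
    ids = ids[:max_len]

    # Left-pad with id 0 so the last position always holds a real token.
    pad_len = max_len - len(ids)
    return [0] * pad_len + ids, [0] * pad_len + [1] * len(ids)


# Long input: the patch is cut to 3 tokens, the review to its last 4.
long_ids, long_mask = pack_tokens(
    [1, 2, 3], list(range(10, 18)), list(range(20, 26)),
    max_len=10, review_budget=4,
)

# Short input: nothing is cut, and two padding tokens go on the left.
short_ids, short_mask = pack_tokens([1], [10], [20], max_len=5, review_budget=4)

print(long_ids, short_ids, short_mask)
```

The issue is kept in full and the review budget is reserved first, so the patch is the segment that shrinks when the input exceeds the limit.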
+
+ ## Output
+
+ The model outputs a single scalar reward score in [0, 1].
+
+ Typical interpretation:
+
+ - higher score: better review quality
+ - lower score: worse review quality
+
+ This score is best used for:
+
+ - ranking candidate reviews
+ - pairwise comparison
+ - reward modeling in downstream training or reranking
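
Ranking candidates with the score could look like the sketch below. It assumes a `score_fn` callable that wraps the model call for a fixed issue and patch; `rank_reviews`, the example reviews, and the scores in `toy_scores` are invented placeholders, not model outputs.

```python
from typing import Callable, List, Tuple


def rank_reviews(
    candidates: List[str], score_fn: Callable[[str], float]
) -> List[Tuple[float, str]]:
    """Return (score, review) pairs sorted from best to worst."""
    scored = [(score_fn(review), review) for review in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Placeholder scores standing in for real reward-model calls.
toy_scores = {
    "LGTM": 0.21,
    "The null check is missing in parse(); add a test for empty input.": 0.87,
    "Rename the variable.": 0.45,
}

ranked = rank_reviews(list(toy_scores), toy_scores.__getitem__)
best_score, best_review = ranked[0]
print(best_review)
```

Because the score is relative, comparisons are most meaningful between reviews of the same issue and patch.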
+
+ ## Intended Use
+
+ This model is intended for:
+
+ - code review quality scoring
+ - reward modeling for review generation
+ - reranking multiple candidate reviews for the same issue and patch
+
+ ## Limitations
+
+ - The score is relative, not an absolute guarantee of correctness.
+ - Long inputs are truncated to fit the sequence limit (`MAX_SEQ_LEN` in the Quick Start), which may affect results.
+ - The model should not be used as the only signal for production-critical review decisions.
+
+ ## Citation
+
+ If you use this model, please cite SWE-CARE as appropriate.
+
+ ```bibtex
+ @misc{guo2025codefusecrbenchcomprehensivenessawarebenchmarkendtoend,
+ title={CodeFuse-CR-Bench: A Comprehensiveness-aware Benchmark for End-to-End Code Review Evaluation in Python Projects},
+ author={Hanyang Guo and Xunjin Zheng and Zihan Liao and Hang Yu and Peng DI and Ziyin Zhang and Hong-Ning Dai},
+ year={2025},
+ eprint={2509.14856},
+ archivePrefix={arXiv},
+ primaryClass={cs.SE},
+ url={https://arxiv.org/abs/2509.14856},
+ }
+ ```
+