Update SciJudge-30B model card and link 2605 release

#2
by Mubuky - opened
Files changed (1)
  1. README.md +30 -20
README.md CHANGED
@@ -1,32 +1,40 @@
  ---
  language:
- - en
+ - en
  license: apache-2.0
  base_model: Qwen/Qwen3-30B-A3B-Instruct-2507
+ datasets:
+ - OpenMOSS-Team/SciJudgeBench
  tags:
- - scientific-evaluation
- - citation-prediction
- - preference-learning
- - GRPO
- - moe
+ - scientific-taste
+ - GRPO
  pipeline_tag: text-generation
  library_name: transformers
  ---

- # SciJudge-Qwen3-30B
+ # SciJudge-30B

- SciJudge-Qwen3-30B is a fine-tuned language model for **scientific paper evaluation**. Given two academic papers' metadata (title, abstract, publication date), it predicts which paper has a higher citation count — serving as a proxy for assessing research impact and "scientific taste."
+ > **Update:** A newer release is available at [SciJudge-30B-2605](https://huggingface.co/OpenMOSS-Team/SciJudge-30B-2605). We recommend using the newer release for current experiments and comparisons.

- This model is part of the paper: **[AI Can Learn Scientific Taste](https://arxiv.org/abs/2603.14473)**.
+ SciJudge-30B is a Qwen3-30B-A3B-Instruct-2507 MoE model fine-tuned for scientific paper evaluation. Given two papers' titles, abstracts, and publication dates, it predicts which paper has higher citation impact.
+
+ This model is part of [AI Can Learn Scientific Taste](https://arxiv.org/abs/2603.14473). The benchmark dataset is [SciJudgeBench](https://huggingface.co/datasets/OpenMOSS-Team/SciJudgeBench).
+
+ Resources: [Project page](https://tongjingqi.github.io/AI-Can-Learn-Scientific-Taste/) and [GitHub repository](https://github.com/tongjingqi/AI-Can-Learn-Scientific-Taste).

  ## Usage

  ```python
+ import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_name = "OpenMOSS-Team/SciJudge-30B"
  tokenizer = AutoTokenizer.from_pretrained(model_name)
- model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="bfloat16", device_map="auto")
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )

  messages = [
      {"role": "system", "content": "You are a helpful assistant. You first think about the reasoning process in your mind and then provide the user with the answer."},
@@ -42,20 +50,22 @@ print(response)

  ## Training Details

- - **Base model:** Qwen3-30B-A3B-Instruct-2507 (MoE, 30B total / 3B active)
- - **Training method:** GRPO (Generative Reward Policy Optimization) with DAPO loss
- - **Training data:** 720,341 preference pairs from arXiv papers
- - **Learning rate:** 8e-7 (cosine schedule, 5% warmup)
- - **Micro batch size:** 8, global batch size: 1024
- - **Optimizer:** Adam (with CPU offload)
+ - **Base model:** [Qwen/Qwen3-30B-A3B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-30B-A3B-Instruct-2507)
+ - **Training method:** GRPO with DAPO loss
+ - **Reward:** external preference reward for citation-based pairwise judgment
  - **Precision:** bfloat16
- - **KL coefficient (β):** 0.03
+ - **KL coefficient:** 0.03

  ## Citation

  ```bibtex
- @article{scijudge2025,
- title={AI Can Learn Scientific Taste},
- year={2025}
+ @misc{tong2026ailearnscientifictaste,
+ title={AI Can Learn Scientific Taste},
+ author={Jingqi Tong and Mingzhe Li and Hangcheng Li and Yongzhuo Yang and Yurong Mou and Weijie Ma and Zhiheng Xi and Hongji Chen and Xiaoran Liu and Qinyuan Cheng and Ming Zhang and Qiguang Chen and Weifeng Ge and Qipeng Guo and Tianlei Ying and Tianxiang Sun and Yining Zheng and Xinchi Chen and Jun Zhao and Ning Ding and Xuanjing Huang and Yugang Jiang and Xipeng Qiu},
+ year={2026},
+ eprint={2603.14473},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2603.14473},
  }
  ```
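
The updated card describes the reward only as an "external preference reward for citation-based pairwise judgment". As a rough illustration of what a citation-based pairwise signal could look like, here is a minimal sketch. It is not the authors' reward implementation: the `Paper` record, its field names, the `"A"`/`"B"` answer format, and the choice to score ties as 0.0 are all assumptions made for this example, and none of them are taken from SciJudgeBench or the training code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Paper:
    """Hypothetical record; field names are illustrative, not the dataset schema."""
    title: str
    abstract: str
    published: str  # ISO date string, e.g. "2024-05-01"
    citations: int


def preference_label(a: Paper, b: Paper) -> Optional[str]:
    """Return "A" or "B" for the more-cited paper, or None when counts tie."""
    if a.citations == b.citations:
        return None
    return "A" if a.citations > b.citations else "B"


def pairwise_reward(predicted: str, a: Paper, b: Paper) -> float:
    """Binary reward: 1.0 if the prediction matches the citation-based label.

    Ties carry no preference signal, so this sketch scores them as 0.0
    (an assumption; a real pipeline might drop tied pairs instead).
    """
    label = preference_label(a, b)
    if label is None:
        return 0.0
    return 1.0 if predicted.strip().upper() == label else 0.0


# Usage with made-up papers and citation counts.
paper_a = Paper("Paper A", "First abstract.", "2024-01-01", 120)
paper_b = Paper("Paper B", "Second abstract.", "2024-02-01", 45)
print(pairwise_reward("A", paper_a, paper_b))  # 1.0
print(pairwise_reward("B", paper_a, paper_b))  # 0.0
```

Averaging `pairwise_reward` over a set of held-out pairs gives a plain pairwise accuracy, which is one simple way to compare this release with the newer 2605 checkpoint.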