PKU-ML committed on
Commit
bdf0762
·
verified ·
1 Parent(s): eb89784

Update README.md

  1. README.md +128 -3
README.md CHANGED
@@ -1,3 +1,128 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ base_model:
6
+ - Qwen/Qwen3-4B-Thinking-2507
7
+ library_name: transformers
8
+ ---
9
+
10
+
11
+
12
+
13
+ <p align="center">
14
+ <img src="https://raw.githubusercontent.com/PKU-ML/GRASP/main/logo-new.png" width="15%"/>
15
+ </p>
16
+
17
+ # PKU-ML/GRASP-base-4B
18
+
19
+ ## 📊 Overview
20
+
21
+ Integrating graph knowledge into Large Language Models (LLMs) via passive representation faces critical bottlenecks: limited context windows, unreliable numerical computation, and structural hallucinations.
22
+ To solve this, we propose **GRASP** (Graph Reasoning via Agentic Solving and Probing), shifting the paradigm from passive ingestion to proactive agentic exploration.
23
+ By interleaving **Neighbor Retrieval** for on-demand probing with **Code Interpreter** as a deterministic solver, GRASP enables LLMs to autonomously navigate and compute over complex topologies.
24
+ We employ a staged reinforcement learning strategy (GRPO) that transitions from visible tuning to a structure-blind environment, forcing the agent to develop genuine topological awareness.
25
+ Evaluated on multi-domain graph reasoning benchmarks, our 4B model achieves a 53.06% average performance boost, surpassing SOTA baselines like DeepSeek-V3.2 and successfully generalizing to unseen tasks,
26
+ with high potential for tackling sampling on million-node graphs and solving Hard-level LeetCode graph problems.
27
+
28
+
29
+
30
+ ## 📌 Key Takeaways
31
+
32
+ 1️⃣ **Agentic Probing over Passive Ingestion**.
33
+ We propose GRASP (Graph Reasoning via Agentic Solving and Probing), shifting the paradigm from passive ingestion to proactive agentic exploration. By interleaving Neighbor Retrieval (Eyes 👀) for on-demand probing with Code Interpreter (Hands 🙌) as a deterministic solver, GRASP enables LLMs to autonomously navigate and compute over complex topologies.
34
+
35
+ 2️⃣ **Structure-Blind RL Training**.
36
+ We employ a staged reinforcement learning strategy (GRPO) that transitions from visible tuning to a structure-blind environment, forcing the agent to develop genuine topological awareness.
37
+
38
+ 3️⃣ **From Million-Node Graphs to Hard LeetCode**.
39
+ Evaluated on multi-domain graph reasoning benchmarks, our 4B model achieves a 53.06% average performance boost, surpassing SOTA baselines like DeepSeek-V3.2 and successfully generalizing to unseen tasks, with high potential for tackling sampling on million-node graphs and solving Hard-level LeetCode graph problems.
40
+
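The probing idea in takeaway 1 can be illustrated with a toy sketch. This is not the GRASP implementation: the graph, the `retrieve_neighbors` tool, and the `probe_shortest_path_length` routine are hypothetical names introduced here. The sketch only assumes that the agent reaches the graph through a neighbor-lookup tool and that the actual computation (here a breadth-first search) runs as ordinary code, in the role the Code Interpreter plays.

```python
from collections import deque

# Hypothetical toy graph. In a structure-blind setting the agent would not
# see this dict directly, only the answers returned by `retrieve_neighbors`.
GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}


def retrieve_neighbors(node):
    """Hypothetical 'Neighbor Retrieval' tool: return one node's neighbors on demand."""
    return list(GRAPH.get(node, []))


def probe_shortest_path_length(source, target):
    """Breadth-first probing that touches the graph only through the tool.

    Returns (hop count or None if unreachable, number of tool calls made).
    """
    if source == target:
        return 0, 0
    seen = {source}
    queue = deque([(source, 0)])
    calls = 0
    while queue:
        node, dist = queue.popleft()
        calls += 1  # one tool call per expanded node
        for nxt in retrieve_neighbors(node):
            if nxt == target:
                return dist + 1, calls
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None, calls


hops, calls = probe_shortest_path_length("A", "E")
print(f"hops={hops}, tool_calls={calls}")
```

Only the nodes that are actually expanded cost a tool call, so the full edge list never needs to fit in the context window.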
41
+
42
+
43
+
44
+ ## 🌊 Evaluation on Graph Reasoning Benchmarks
45
+
46
+
47
+ | Model | Arxiv | PubMed | Products | WikiCS | fb15k237 | wn18rr | TSG-Bench | ExplaGraphs | Erdős | RealErdős | Average |
48
+ |------------------|-----------|-----------|-----------|-----------|------------|-----------|------------|------------|------------|------------|------------|
49
+ | Qwen3-4B-Thinking|51.00 |25.00 |21.00 |29.00 |16.00 |13.00 |62.00 |45.00 |38.80 |7.11 |30.79 |
50
+ | GPT-4o |52.00 |43.00 |72.00 |24.00 |52.00 |24.00 |72.00 |77.00 |40.60 |18.07 |47.46 |
51
+ | DeepSeek-V3.2 |65.00 |47.00 |70.00 |79.00 |65.00 |26.00 |**88.00** |**99.00** |83.60 |66.44 |68.90 |
52
+ | GRASP-base-4B |**69.00** |**91.00** |**78.00** |**88.00** |**86.00** |**68.00** |85.00 |95.00 |**89.40** |**86.22** |**83.56** |
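
As a reading aid for the table, the sketch below recomputes the Average column. It assumes, since the README does not say so, that Average is the unweighted mean of the ten benchmark columns. Under that assumption the recomputed values agree with the reported ones to within 0.01.

```python
# Per-benchmark scores copied from the table above, in column order:
# Arxiv, PubMed, Products, WikiCS, fb15k237, wn18rr, TSG-Bench,
# ExplaGraphs, Erdős, RealErdős.
scores = {
    "Qwen3-4B-Thinking": [51.00, 25.00, 21.00, 29.00, 16.00, 13.00, 62.00, 45.00, 38.80, 7.11],
    "GPT-4o": [52.00, 43.00, 72.00, 24.00, 52.00, 24.00, 72.00, 77.00, 40.60, 18.07],
    "DeepSeek-V3.2": [65.00, 47.00, 70.00, 79.00, 65.00, 26.00, 88.00, 99.00, 83.60, 66.44],
    "GRASP-base-4B": [69.00, 91.00, 78.00, 88.00, 86.00, 68.00, 85.00, 95.00, 89.40, 86.22],
}

# Average column as reported in the table.
reported_average = {
    "Qwen3-4B-Thinking": 30.79,
    "GPT-4o": 47.46,
    "DeepSeek-V3.2": 68.90,
    "GRASP-base-4B": 83.56,
}


def mean(values):
    """Unweighted arithmetic mean (assumed aggregation, not stated in the README)."""
    return sum(values) / len(values)


for model, row in scores.items():
    print(f"{model}: recomputed {mean(row):.2f}, reported {reported_average[model]:.2f}")
```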
53
+
54
+
55
+
56
+
57
+
58
+
59
+ ## Quickstart
60
+
61
+ Qwen3 support is included in recent releases of Hugging Face `transformers`, and we advise you to use the latest version.
62
+
63
+ With `transformers<4.51.0`, you will encounter the following error:
64
+ ```
65
+ KeyError: 'qwen3'
66
+ ```
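
To catch this before loading the model, a version guard can help. The helper names below (`parse_version`, `supports_qwen3`) are hypothetical, and the only fact taken from this README is the `4.51.0` threshold. The parser is a simplification that ignores pre-release suffixes such as `.dev0`.

```python
import re

# Threshold taken from the note above: releases below 4.51.0 raise KeyError: 'qwen3'.
MIN_QWEN3_VERSION = (4, 51, 0)


def parse_version(version_string):
    """Extract the leading (major, minor, patch) integers from a version string.

    Simplification: suffixes such as '.dev0' or 'rc1' are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def supports_qwen3(version_string):
    """Return True if the given transformers version meets the 4.51.0 threshold."""
    return parse_version(version_string) >= MIN_QWEN3_VERSION


# Usage (requires transformers to be installed):
#   import transformers
#   if not supports_qwen3(transformers.__version__):
#       raise RuntimeError("Please upgrade: pip install -U transformers")
```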
67
+
68
+ The following code snippet illustrates how to use the model to generate content from a given input.
69
+ ```python
70
+ from transformers import AutoModelForCausalLM, AutoTokenizer
71
+
72
+ model_name = "PKU-ML/GRASP-base-4B"
73
+
74
+ # load the tokenizer and the model
75
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
76
+ model = AutoModelForCausalLM.from_pretrained(
77
+     model_name,
78
+     torch_dtype="auto",
79
+     device_map="auto"
80
+ )
81
+
82
+ # prepare the model input
83
+ prompt = "Give me a short introduction to large language models."
84
+ messages = [
85
+     {"role": "user", "content": prompt}
86
+ ]
87
+ text = tokenizer.apply_chat_template(
88
+     messages,
89
+     tokenize=False,
90
+     add_generation_prompt=True,
91
+ )
92
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
93
+
94
+ # conduct text completion
95
+ generated_ids = model.generate(
96
+     **model_inputs,
97
+     max_new_tokens=8192
98
+ )
99
+ output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
100
+
101
+ # parsing thinking content
102
+ try:
103
+     # rindex finding 151668 (</think>)
104
+     index = len(output_ids) - output_ids[::-1].index(151668)
105
+ except ValueError:
106
+     index = 0
107
+
108
+ thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
109
+ content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")
110
+
111
+ print("thinking content:", thinking_content) # no opening <think> tag
112
+ print("content:", content)
113
+
114
+ ```
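
The thinking/answer split in the snippet above can be factored into a small pure function, which makes the "last occurrence" behavior easy to check without loading the model. The function name `split_at_last` is hypothetical. Rather than hard-coding `151668`, the id could be looked up with `tokenizer.convert_tokens_to_ids("</think>")`, assuming `</think>` is a single token in the vocabulary, as the snippet's comment suggests.

```python
def split_at_last(token_ids, end_think_id):
    """Split token_ids just after the last occurrence of end_think_id.

    Returns (thinking_ids, answer_ids). The marker itself stays at the end of
    thinking_ids. If the marker is absent, thinking_ids is empty and every
    token is treated as answer content.
    """
    try:
        # Search the reversed list so that the LAST occurrence is found.
        index = len(token_ids) - token_ids[::-1].index(end_think_id)
    except ValueError:
        index = 0
    return token_ids[:index], token_ids[index:]


# Toy ids for illustration only; 9 stands in for the </think> token id.
thinking_ids, answer_ids = split_at_last([1, 2, 9, 3, 4], end_think_id=9)
print(thinking_ids, answer_ids)
```

With the real model this would be called as `split_at_last(output_ids, 151668)`, and each half decoded with `tokenizer.decode(..., skip_special_tokens=True)`.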
115
+
116
+ ## Agentic Use
117
+
118
+ For the specific tool configuration and agentic usage of GRASP, please refer to our [example](https://github.com/PKU-ML/GRASP/blob/main/evaluation/example.py) on GitHub.
119
+
120
+
121
+
122
+ ## Citation
123
+
124
+ If you find our work helpful, please consider citing it.
125
+
126
+ ```
127
+
128
+ ```