PKU-ML/GRASP-4B

📊 Overview

Integrating graph knowledge into Large Language Models (LLMs) via passive representation faces critical bottlenecks: limited context windows, unreliable numerical computation, and structural hallucinations. To address these, we propose GRASP (Graph Reasoning via Agentic Solving and Probing), which shifts the paradigm from passive ingestion to proactive agentic exploration. By interleaving Neighbor Retrieval for on-demand probing with a Code Interpreter as a deterministic solver, GRASP enables LLMs to autonomously navigate and compute over complex topologies. We employ a staged reinforcement learning strategy (GRPO) that transitions from visible tuning to a structure-blind environment, forcing the agent to develop genuine topological awareness. Evaluated on multi-domain graph reasoning benchmarks, our 4B model improves the average score by 53.06 points over Qwen3-4B-Thinking (30.79 to 83.85), surpasses strong baselines such as DeepSeek-V3.2 on average, and generalizes to unseen tasks. It also shows high potential for sampling on million-node graphs and solving Hard-level LeetCode graph problems.
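To make the probing idea concrete, here is a minimal sketch, not GRASP's actual tooling: a shortest-path search that never sees the full edge list and learns the graph only through a hypothetical `get_neighbors` callback standing in for Neighbor Retrieval. The toy graph, the function names, and the probe counter are all illustrative assumptions.

```python
from collections import deque


def make_neighbor_tool(adjacency):
    """Wrap a hidden adjacency dict in a probe function that counts its calls."""
    calls = {"count": 0}

    def get_neighbors(node):
        calls["count"] += 1
        return sorted(adjacency.get(node, ()))

    return get_neighbors, calls


def shortest_path_by_probing(get_neighbors, source, target):
    """Breadth-first search that discovers edges only through get_neighbors."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent pointers back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in get_neighbors(node):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical toy graph; the searcher never reads this dict directly.
hidden_graph = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
    "F": {"G"},
    "G": {"F"},
}
probe, stats = make_neighbor_tool(hidden_graph)
path = shortest_path_by_probing(probe, "A", "E")
print(path, "probes used:", stats["count"])
```

Only the nodes reached from the source are ever probed, so the component containing `F` and `G` is never touched. That is the property that lets on-demand probing scale to graphs too large to fit in a context window.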

📌 Key Takeaways

1️⃣ Agentic Probing over Passive Ingestion. We propose GRASP (Graph Reasoning via Agentic Solving and Probing), which shifts the paradigm from passive ingestion to proactive agentic exploration. By interleaving Neighbor Retrieval (Eyes 👀) for on-demand probing with a Code Interpreter (Hands 🙌) as a deterministic solver, GRASP enables LLMs to autonomously navigate and compute over complex topologies.

2️⃣ Structure-Blind RL Training. We employ a staged reinforcement learning strategy (GRPO) that transitions from visible tuning to a structure-blind environment, forcing the agent to develop genuine topological awareness.

3️⃣ From Million-Node Graphs to Hard LeetCode. Evaluated on multi-domain graph reasoning benchmarks, our 4B model improves the average score by 53.06 points over Qwen3-4B-Thinking (30.79 to 83.85), surpasses strong baselines such as DeepSeek-V3.2 on average, and generalizes to unseen tasks. It also shows high potential for sampling on million-node graphs and solving Hard-level LeetCode graph problems.
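The difference between a visible and a structure-blind episode (takeaway 2) can be illustrated with a small sketch. This is an assumption about how such episodes might be built, not the authors' training code: the prompt wording, the `build_episode` helper, and the choice to expose the neighbor tool in both stages are all hypothetical.

```python
def build_episode(question, edges, stage):
    """Return (prompt, tools) for one hypothetical training episode.

    stage="visible": the edge list is written into the prompt.
    stage="blind":   the edge list is withheld; the graph is reachable
                     only through the get_neighbors tool.
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    # Assumption: the tool is available in both stages.
    tools = {"get_neighbors": lambda node: sorted(adjacency.get(node, ()))}

    if stage == "visible":
        edge_text = ", ".join(f"{u}-{v}" for u, v in edges)
        prompt = f"Graph edges: {edge_text}\nQuestion: {question}"
    elif stage == "blind":
        prompt = (
            "The graph is hidden; call get_neighbors(node) to explore it.\n"
            f"Question: {question}"
        )
    else:
        raise ValueError(f"unknown stage: {stage!r}")
    return prompt, tools


edges = [(1, 2), (2, 3), (3, 4)]
visible_prompt, _ = build_episode("Is node 1 connected to node 4?", edges, "visible")
blind_prompt, blind_tools = build_episode("Is node 1 connected to node 4?", edges, "blind")
print(visible_prompt)
print(blind_prompt)
print(blind_tools["get_neighbors"](2))
```

In the blind stage the answer cannot be read off the prompt, so any reward the agent earns has to come from actually probing the structure.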

🌊 Evaluation on Graph Reasoning Benchmarks

| Model | Arxiv | PubMed | Products | WikiCS | fb15k237 | wn18rr | TSG-Bench | ExplaGraphs | Erdős | RealErdős | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Qwen3-4B-Thinking | 51.00 | 25.00 | 21.00 | 29.00 | 16.00 | 13.00 | 62.00 | 45.00 | 38.80 | 7.11 | 30.79 |
| GPT-4o | 52.00 | 43.00 | 72.00 | 24.00 | 52.00 | 24.00 | 72.00 | 77.00 | 40.60 | 18.07 | 47.46 |
| DeepSeek-V3.2 | 65.00 | 47.00 | 70.00 | 79.00 | 65.00 | 26.00 | 88.00 | 99.00 | 83.60 | 66.44 | 68.90 |
| GRASP-4B | 73.00 | 90.00 | 77.00 | 88.00 | 82.00 | 67.00 | 85.00 | 97.00 | 91.00 | 88.57 | 83.85 |
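As a quick sanity check on the headline number, the sketch below recomputes per-model means from the per-benchmark scores in the table and counts the benchmarks on which GRASP-4B beats DeepSeek-V3.2. The scores are copied from the table; the variable names are illustrative. Means recomputed from the two-decimal entries can differ from the Average column in the last digit, presumably because of rounding.

```python
from statistics import mean

benchmarks = [
    "Arxiv", "PubMed", "Products", "WikiCS", "fb15k237",
    "wn18rr", "TSG-Bench", "ExplaGraphs", "Erdős", "RealErdős",
]

# Per-benchmark scores copied from the table above.
scores = {
    "Qwen3-4B-Thinking": [51.00, 25.00, 21.00, 29.00, 16.00, 13.00, 62.00, 45.00, 38.80, 7.11],
    "GPT-4o": [52.00, 43.00, 72.00, 24.00, 52.00, 24.00, 72.00, 77.00, 40.60, 18.07],
    "DeepSeek-V3.2": [65.00, 47.00, 70.00, 79.00, 65.00, 26.00, 88.00, 99.00, 83.60, 66.44],
    "GRASP-4B": [73.00, 90.00, 77.00, 88.00, 82.00, 67.00, 85.00, 97.00, 91.00, 88.57],
}

averages = {model: mean(values) for model, values in scores.items()}

# Absolute gain (in points) of GRASP-4B over Qwen3-4B-Thinking.
gain = averages["GRASP-4B"] - averages["Qwen3-4B-Thinking"]

# Benchmarks where GRASP-4B scores strictly higher than DeepSeek-V3.2.
wins = [
    name
    for name, grasp, deepseek in zip(benchmarks, scores["GRASP-4B"], scores["DeepSeek-V3.2"])
    if grasp > deepseek
]

for model, avg in averages.items():
    print(f"{model}: {avg:.3f}")
print(f"gain over Qwen3-4B-Thinking: {gain:.3f} points")
print(f"GRASP-4B beats DeepSeek-V3.2 on {len(wins)} of {len(benchmarks)} benchmarks")
```

The per-benchmark comparison shows that GRASP-4B leads DeepSeek-V3.2 on average but not on every benchmark: TSG-Bench and ExplaGraphs are the two exceptions.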

Quickstart

Qwen3 support is included in recent releases of Hugging Face transformers, and we advise you to use the latest version.

With transformers<4.51.0, you will encounter the following error:

KeyError: 'qwen3'
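A small pre-flight check can surface this problem before any model loading is attempted. The sketch below is one way to do it with the standard library only; the helper names are illustrative, and pre-release suffixes such as `.dev0` are simply ignored, which is an assumption rather than strict version semantics.

```python
import re
from importlib.metadata import PackageNotFoundError, version

MIN_TRANSFORMERS = (4, 51, 0)  # threshold quoted above


def release_tuple(version_string):
    """Parse the leading 'X.Y' or 'X.Y.Z' of a version string into ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version_string)
    if match is None:
        raise ValueError(f"unrecognized version: {version_string!r}")
    return tuple(int(part or 0) for part in match.groups())


def supports_qwen3(version_string):
    """True if the given transformers version meets the minimum release."""
    return release_tuple(version_string) >= MIN_TRANSFORMERS


try:
    installed = version("transformers")
    status = "OK" if supports_qwen3(installed) else "too old, expect KeyError: 'qwen3'"
    print(f"transformers {installed}: {status}")
except PackageNotFoundError:
    print("transformers is not installed")
```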

The following code snippet illustrates how to use the model to generate content from a given input.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "PKU-ML/GRASP-4B"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=8192
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist() 

# parsing thinking content
try:
    # rindex finding 151668 (</think>)
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("thinking content:", thinking_content) # no opening <think> tag
print("content:", content)
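The `</think>` split in the snippet above can be factored into a small helper that is easy to test without loading the model. This is a restructuring sketch of the same logic; the function name is illustrative, and the token id 151668 is taken from the snippet.

```python
THINK_END_ID = 151668  # token id of </think>, as used in the snippet above


def split_thinking(output_ids, end_id=THINK_END_ID):
    """Split generated ids at the LAST occurrence of end_id.

    The first part ends with the end_id token itself (decoding with
    skip_special_tokens=True drops it); the second part is the answer.
    If end_id never appears, everything is treated as the answer.
    """
    try:
        # Searching the reversed list finds the last occurrence.
        index = len(output_ids) - output_ids[::-1].index(end_id)
    except ValueError:
        index = 0
    return output_ids[:index], output_ids[index:]


thinking_ids, answer_ids = split_thinking([11, 12, 151668, 21, 22])
print(thinking_ids, answer_ids)
```

Each half can then be passed to `tokenizer.decode(..., skip_special_tokens=True)` exactly as in the snippet.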

Agentic Use

For the specific tool configuration and agentic usage of GRASP, please refer to our example on GitHub.
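Until you have the official example in hand, the following sketch shows the general shape of a two-tool dispatcher. It is not the GRASP configuration: the tool names, the JSON call format, and the helper functions are hypothetical, and the `exec`-based interpreter is unsandboxed and suitable only for trusted input.

```python
import contextlib
import io
import json


def neighbor_retrieval(graph, node):
    """Return the sorted neighbors of a node (empty list if unknown)."""
    return sorted(graph.get(node, ()))


def code_interpreter(source):
    """Run Python source and return what it printed. NOT sandboxed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {"__name__": "__tool__"})
    return buffer.getvalue().strip()


def dispatch(tool_call_json, graph):
    """Route one tool call, given in a hypothetical JSON format, to its handler."""
    call = json.loads(tool_call_json)
    name, args = call["name"], call["arguments"]
    if name == "neighbor_retrieval":
        return neighbor_retrieval(graph, args["node"])
    if name == "code_interpreter":
        return code_interpreter(args["code"])
    raise ValueError(f"unknown tool: {name}")


graph = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}
neighbors = dispatch('{"name": "neighbor_retrieval", "arguments": {"node": "a"}}', graph)
printed = dispatch('{"name": "code_interpreter", "arguments": {"code": "print(len([1, 2, 3]))"}}', graph)
print(neighbors, printed)
```

In a real agent loop, each tool result would be appended to the conversation as a tool message before the next generation step.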

Citation

If you find our work helpful, please consider citing it.

