Update pipeline tag to `graph-ml` and enhance model card introduction
#1
by nielsr (HF Staff) - opened

README.md CHANGED
@@ -1,56 +1,51 @@
 ---
+base_model:
+- Qwen/Qwen2.5-7B-Instruct
 datasets:
 - PKU-ML/Erdos
 - PKU-ML/Erdos-CoT
 language:
 - en
+library_name: transformers
+license: apache-2.0
 metrics:
 - accuracy
-base_model:
-- Qwen/Qwen2.5-7B-Instruct
-pipeline_tag: text-generation
+pipeline_tag: graph-ml
 tags:
 - graph
 - chat
-library_name: transformers
 ---
 
+# G1-7B: Teaching LLMs to Reason on Graphs with Reinforcement Learning
+
+This repository contains the G1-7B model, part of the G1 series of large language models presented in the paper [G1: Teaching LLMs to Reason on Graphs with Reinforcement Learning](https://huggingface.co/papers/2505.18499). G1 models are trained on the [Erdos](https://huggingface.co/datasets/PKU-ML/Erdos) benchmark for solving graph reasoning tasks, based on `Qwen2.5-Instruct`. The approach leverages Group Relative Policy Optimization (GRPO) for reinforcement learning, with supervised finetuning as a preliminary step.
+
+Code: [https://github.com/PKU-ML/G1](https://github.com/PKU-ML/G1)
+
+## Introduction
+
-We apply Group Relative Policy Optimization (GRPO) for reinforcement learning with supervised finetuning as a prelimary step.
-
-G1 brings the following improvements:
+G1 brings the following improvements to graph reasoning with Large Language Models:
 
+- **Significant improvement on graph reasoning**: G1 models achieve up to 46% improvement over baselines on Erdős, with the 7B variant matching OpenAI’s o3-mini and the 3B model surpassing Qwen2.5-72B-Instruct by notable margins.
 - **Strong Generalization to unseen graph tasks**: G1 exhibits zero-shot generalization on unseen graph tasks, improving performance on *other graph reasoning benchmarks* (GraphWiz, GraphArena) and *real-world graphs* (Cora, PubMed).
 - **NO Compromise on general reasoning**: Crucially, G1 preserves general reasoning ability (GSM8K, MATH, MMLU-Pro), proving its versatility.
 
 **This repo contains the G1-7B model**, which has the following features:
+- Type: Causal Language Models
+- Training Stage: SFT & RL
+- Architecture: the same with Qwen2.5-Instruct
+- Number of Parameters: 7.62B
+- Context Length: Full 32,768 tokens and generation 8192 tokens
 
-For more details, please refer to our [paper](https://arxiv.org/pdf/2505.18499) and [GitHub](https://github.com/PKU-ML/G1/tree/main).
-
 ## Requirements
 
-The model is trained based on Qwen/Qwen2.5-7B-Instruct. The code of Qwen2.5 has been in the latest Hugging
+The model is trained based on `Qwen/Qwen2.5-7B-Instruct`. The code of Qwen2.5 has been in the latest Hugging Face `transformers`, and we advise you to use the latest version of `transformers`.
 
 With `transformers<4.37.0`, you will encounter the following error:
 ```
 KeyError: 'qwen2'
 ```
 
 ## Quickstart
 
 Here provides a code snippet with `apply_chat_template` to show you how to load the tokenizer and model and how to generate contents.
@@ -73,10 +68,18 @@ model = AutoModelForCausalLM.from_pretrained(
 )
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 
-prompt = "The task is to determine the degree centrality of a node in the graph."
+prompt = "The task is to determine the degree centrality of a node in the graph.\n"\
+"Degree centrality for a node is the fraction of nodes it is connected to.\n"\
+"Here is an undirected graph containing nodes from 1 to 15. The edges are: (1, 15), (15, 11), (2, 3), (2, 6), (3, 6), (3, 7), (6, 7), (6, 8), (7, 8), (7, 14), (4, 10), (10, 5), (10, 12), (8, 14), (8, 9), (12, 11), (12, 13).\n"\
+"Question: What is the degree centrality of node 2 in the graph?\n"\
 "You need to format your answer as a float number."
 messages = [
     {"role": "user", "content": INSTRUCTION_TEMPLATE.format(instruction=prompt)}
@@ -103,17 +106,15 @@ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
 print(response)
 ```
 
 ## Evaluation & Performance
 
-Detailed evaluation results are reported in
-
+Detailed evaluation results are reported in the [paper](https://huggingface.co/papers/2505.18499).
 
 ## Citation
 
 If you find our work helpful, feel free to give us a cite.
 
-```
+```bibtex
 @article{guo2025g1,
 title={G1: Teaching LLMs to Reason on Graphs with Reinforcement Learning},
 author={Guo, Xiaojun and Li, Ang and Wang, Yifei and Jegelka, Stefanie and Wang, Yisen},
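The Quickstart prompt in the diff asks for the degree centrality of node 2. As an independent sanity check on that example (a plain-Python sketch, not part of the model card or the G1 codebase), the expected answer can be computed directly from the edge list given in the prompt, using the definition the prompt itself states (fraction of the other nodes a node is connected to):

```python
# Edge list copied from the Quickstart prompt (undirected graph, nodes 1..15).
edges = [(1, 15), (15, 11), (2, 3), (2, 6), (3, 6), (3, 7), (6, 7), (6, 8),
         (7, 8), (7, 14), (4, 10), (10, 5), (10, 12), (8, 14), (8, 9),
         (12, 11), (12, 13)]
n = 15  # number of nodes

def degree_centrality(node):
    # Degree = number of edges touching the node; centrality = degree / (n - 1).
    degree = sum(1 for u, v in edges if node in (u, v))
    return degree / (n - 1)

# Node 2 appears only in edges (2, 3) and (2, 6), so its degree is 2.
print(round(degree_centrality(2), 4))  # 2 / 14 = 0.1429
```

So a correctly formatted model answer to the example prompt should be a float close to 0.14.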