	---
	license: mit
	language:
	- en
	metrics:
	- accuracy
	- pass rate
	base_model:
	- meta-llama/Meta-Llama-3-8B-Instruct
	- deepseek-ai/deepseek-coder-7b-instruct-v1.5
	library_name: transformers, alignment-handbook
	pipeline_tag: question-answering
	---

	### 1. Introduction of this repository

	Official repository of "Can Large Language Models Analyze Graphs like Professionals? A Benchmark, Datasets and Models" (NeurIPS 2024).

	- Paper Link: (https://arxiv.org/abs/2409.19667/)
	- GitHub Repository: (https://github.com/BUPT-GAMMA/ProGraph)


	### 2. Pipelines and Experimental Results

	#### The pipeline of ProGraph benchmark construction

	<img width="1000px" alt="" src="figures/figure_1_the_pipeline_of_ProGraph_benchmark_construction.jpg">

	#### The pipeline of LLM4Graph dataset construction and corresponding model enhancement.

	<img width="1000px" alt="" src="figures/figure_2_the_pipeline_of_LLM4Graph_dataset_construction_and_corresponding_model_enhancement.jpg">

	#### The pass rate (left) and accuracy (right) of open-source models with instruction tuning.

	<img width="1000px" alt="" src="figures/figure_4_the_pass rate_and_accuracy_of_open-source_models_withe_instruction_tuning.jpg">

	#### Compilation error statistics for open-source models.

	<img width="1000px" alt="" src="figures/figure_6_compilation_error_statistics_for_open-source_models.jpg">

	#### Performance (%) of open-source models regarding different question types.

	| Model | Method | True/False Pass Rate | True/False Accuracy | Drawing Pass Rate | Drawing Accuracy | Calculation Pass Rate | Calculation Accuracy | Hybrid Pass Rate | Hybrid Accuracy |
	| --- | --- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
	| Llama 3 | No Fine-tune | 43.6 | 33.3 | 28.3 | 10.0 | 15.6 | 12.5 | 26.8 | 8.3 |
	| | Code Only | 82.1 | 71.8 | 59.2 | 42.0 | 34.4 | 31.3 | 60.7 | 43.6 |
	| | Code+RAG 3 | 84.6 | 44.0 | 56.9 | 29.0 | 50.0 | 37.5 | 66.1 | 37.2 |
	| | Code+RAG 5 | 66.7 | 36.8 | 53.5 | 25.4 | 37.5 | 28.1 | 60.7 | 36.3 |
	| | Code+RAG 7 | 66.7 | 37.2 | 50.9 | 24.4 | 50.0 | 35.9 | 64.3 | 39.3 |
	| | Doc+Code | 82.1 | 73.1 | 64.4 | 43.7 | 40.6 | 31.8 | 67.9 | 41.3 |
	| Deepseek Coder | No Fine-tune | 66.7 | 41.5 | 47.8 | 22.1 | 53.1 | 39.4 | 46.4 | 18.2 |
	| | Code Only | 71.8 | 61.5 | 60.0 | 41.1 | 50.0 | 45.3 | 62.5 | 42.1 |
	| | Code+RAG 3 | 71.8 | 48.3 | 57.7 | 32.2 | 53.1 | 45.3 | 44.6 | 22.8 |
	| | Code+RAG 5 | 71.8 | 53.9 | 50.7 | 29.3 | 40.6 | 34.4 | 39.3 | 28.6 |
	| | Code+RAG 7 | 74.4 | 54.7 | 50.4 | 28.7 | 37.5 | 34.4 | 48.2 | 31.4 |
	| | Doc+Code | 79.5 | 68.0 | 66.2 | 46.0 | 37.5 | 34.4 | 66.1 | 42.3 |
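
To make the two reported metrics concrete, here is a minimal sketch of how such numbers could be computed. It is not the benchmark's official scorer: it assumes that pass rate is the share of questions whose generated code runs without error, and that accuracy is the share of all questions answered correctly. The `Result` record and the `summarize` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Result:
    executed: bool  # assumption: the generated code ran without raising an error
    correct: bool   # assumption: the produced answer matched the reference answer


def summarize(results):
    """Return (pass_rate, accuracy) as percentages rounded to one decimal place."""
    total = len(results)
    if total == 0:
        return 0.0, 0.0
    passed = sum(r.executed for r in results)
    # An answer only counts as correct if its code executed in the first place.
    right = sum(r.executed and r.correct for r in results)
    return round(100 * passed / total, 1), round(100 * right / total, 1)


demo = [
    Result(executed=True, correct=True),
    Result(executed=True, correct=False),
    Result(executed=False, correct=False),
    Result(executed=True, correct=True),
]
print(summarize(demo))  # (75.0, 50.0)
```

Under these assumptions accuracy can never exceed pass rate, which matches every row of the table.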


	### 3. How to Use
	The example below shows how to run inference with our models.
	#### Chat Model Inference
	```python
	import torch
	from transformers import AutoTokenizer, AutoModelForCausalLM
	from peft import PeftModel

	device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

	# Base model: a local path or a Hub ID.
	model_name_or_path = '../models/deepseek-ai/deepseek-coder-7b-instruct-v1.5'
	# For Llama 3, use 'meta-llama/Meta-Llama-3-8B-Instruct' instead.

	# Adapter: PeftModel.from_pretrained expects a Hub repo ID or a local path, not a browser URL.
	# This assumes the adapter files sit directly in the 'deepseek-code-only' folder of
	# https://huggingface.co/lixin4sky/ProGraph; other folders in the repository work the same way.
	peft_repo_id = 'lixin4sky/ProGraph'
	peft_subfolder = 'deepseek-code-only'

	tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
	model = AutoModelForCausalLM.from_pretrained(model_name_or_path).to(device)
	peft_model = PeftModel.from_pretrained(model, peft_repo_id, subfolder=peft_subfolder).to(device)

	input_text = ''  # The question.

	message = [
	    {"role": "user", "content": input_text},
	]

	input_ids = tokenizer.apply_chat_template(
	    conversation=message,
	    tokenize=True,
	    add_generation_prompt=False,
	    return_tensors='pt',
	).to(device)

	with torch.inference_mode():
	    # Generate with the adapter-wrapped model; the last three prompt tokens are dropped.
	    output_ids = peft_model.generate(
	        input_ids=input_ids[:, :-3],
	        max_new_tokens=4096,
	        do_sample=False,
	        pad_token_id=2,
	    )
	    response = tokenizer.batch_decode(output_ids.detach().cpu().numpy(), skip_special_tokens=True)

	print(response)
	```
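
Since the pass rate reported above depends on whether the generated code runs, the decoded `response` usually needs a small post-processing step before it can be executed. The following is a minimal sketch, assuming the model wraps its code in Markdown fences; that output format is an assumption, not something this README specifies. `extract_code` is a hypothetical helper, not part of the ProGraph code base.

```python
import re

TICKS = "`" * 3  # a Markdown code fence, built this way to avoid nesting fences here

# Match an optional "python" language tag, then capture everything up to the closing fence.
FENCE = re.compile(TICKS + r"(?:python)?[ \t]*\n(.*?)" + TICKS, re.DOTALL)


def extract_code(response: str) -> str:
    """Return the first fenced code block in `response`, or the stripped text if there is none."""
    match = FENCE.search(response)
    return match.group(1).strip() if match else response.strip()


reply = "Sure.\n" + TICKS + "python\nprint(1)\n" + TICKS + "\nDone."
print(extract_code(reply))  # print(1)
```

`tokenizer.batch_decode` returns a list of strings, so with the example above the call would be `extract_code(response[0])`.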

	You can find more tutorials in our GitHub repository: (https://github.com/BUPT-GAMMA/ProGraph)

	### 4. Next Level
	- GraphTeam: (https://arxiv.org/abs/2410.18032)
	- GitHub Repository: (https://github.com/BUPT-GAMMA/GraphTeam)