---
base_model:
  - Qwen/Qwen2-7B-Instruct
datasets:
  - IDEA-FinAI/Golden-Touchstone
language:
  - en
  - zh
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
tags:
  - finance
  - text-generation-inference
  - retrieval-augmented-generation
  - rag
  - graph-neural-networks
  - llm-reasoning
---
RAG-Factory Logo

# ✨ TouchstoneGPT-7B-Instruct: A Model for Think-on-Graph 3.0 via RAG-Factory

This Hugging Face repository hosts the TouchstoneGPT-7B-Instruct model, a Large Language Model (LLM) based on Qwen/Qwen2-7B-Instruct. The model is suitable for integration within the Think-on-Graph 3.0 (ToG-3) framework, a novel approach to Retrieval-Augmented Generation (RAG) that enhances LLM reasoning on heterogeneous graphs. The ToG-3 framework is implemented and further detailed in the RAG-Factory GitHub repository.

## Paper Abstract: Think-on-Graph 3.0

Retrieval-Augmented Generation (RAG) and Graph-based RAG have become important paradigms for enhancing Large Language Models (LLMs) with external knowledge. However, existing approaches face a fundamental trade-off: graph-based methods are inherently dependent on high-quality graph structures, yet they face significant practical constraints. Manually constructed knowledge graphs are prohibitively expensive to scale, while graphs automatically extracted from corpora are limited by the performance of the underlying LLM extractors, especially when using smaller, locally deployed models. This paper presents Think-on-Graph 3.0 (ToG-3), a novel framework that introduces a Multi-Agent Context Evolution and Retrieval (MACER) mechanism to overcome these limitations. Our core innovation is the dynamic construction and refinement of a Chunk-Triplets-Community heterogeneous graph index, which incorporates a dual-evolution mechanism of Evolving Query and Evolving Sub-Graph for precise evidence retrieval. This approach addresses a critical limitation of prior Graph-based RAG methods, which typically construct a static graph index in a single pass without adapting to the actual query. A multi-agent system comprising Constructor, Retriever, Reflector, and Responser agents collaboratively engages in an iterative process of evidence retrieval, answer generation, sufficiency reflection, and, crucially, query and subgraph evolution. This dual-evolving multi-agent system allows ToG-3 to adaptively build a targeted graph index during reasoning, mitigating the inherent drawbacks of static, one-time graph construction and enabling deep, precise reasoning even with lightweight LLMs. Extensive experiments demonstrate that ToG-3 outperforms the compared baselines on both deep and broad reasoning benchmarks, and ablation studies confirm the efficacy of each component of the MACER framework.
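The iterative loop described above can be sketched in a few lines of Python. The agent names follow the abstract (Constructor, Retriever, Reflector, Responser), but every internal detail below is an illustrative stand-in, not the actual ToG-3 implementation:

```python
# Minimal sketch of a MACER-style dual-evolving loop. All data structures and
# heuristics here are hypothetical placeholders for illustration only.
from dataclasses import dataclass, field

@dataclass
class GraphIndex:
    """Stand-in for the Chunk-Triplets-Community heterogeneous graph index."""
    triplets: list = field(default_factory=list)

def constructor(index: GraphIndex, query: str) -> GraphIndex:
    # Evolving Sub-Graph: extend the index with evidence tied to the query.
    index.triplets.append(("evidence for", query))
    return index

def retriever(index: GraphIndex, query: str) -> list:
    # Retrieve triplets that match the current (possibly evolved) query.
    return [t for t in index.triplets if query in t[1]]

def reflector(evidence: list) -> bool:
    # Sufficiency reflection: is there enough evidence to answer?
    return len(evidence) >= 2

def evolve_query(query: str, step: int) -> str:
    # Evolving Query: refine the query for the next retrieval round.
    return f"{query} (refined {step})"

def responser(query: str, evidence: list) -> str:
    return f"answer to {query!r} from {len(evidence)} triplet(s)"

def macer_loop(query: str, max_rounds: int = 3) -> str:
    index, evidence = GraphIndex(), []
    for step in range(max_rounds):
        index = constructor(index, query)       # build/refine subgraph
        evidence += retriever(index, query)     # gather evidence
        if reflector(evidence):                 # sufficiency check
            break
        query = evolve_query(query, step + 1)   # evolve the query
    return responser(query, evidence)
```

The point of the sketch is the control flow: the graph index is grown *during* reasoning, and both the query and the subgraph evolve between rounds instead of being fixed up front.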

## ✨ Features of RAG-Factory (Think-on-Graph 3.0 Implementation)

The RAG-Factory framework, which implements the concepts of Think-on-Graph 3.0, provides a factory for building advanced RAG pipelines, including:

- Standard RAG implementations
- GraphRAG architectures
- Multi-modal RAG systems
*Example knowledge base screenshot from RAG-Factory*

Key features include:

- Modular design for easy customization
- Support for various knowledge graph backends
- Integration with multiple LLM providers
- Configurable pipeline components
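A "factory" of this kind is commonly implemented as a registry that maps pipeline names to classes. The sketch below is a hypothetical illustration of that pattern; the registry, class names, and `run` method are assumptions, not RAG-Factory's actual API:

```python
# Hypothetical registry-based factory pattern; names mirror the pipeline
# identifiers used by run.sh but are illustrative only.
PIPELINES = {}

def register(name):
    """Decorator that records a pipeline class under a string name."""
    def deco(cls):
        PIPELINES[name] = cls
        return cls
    return deco

@register("naive_rag")
class NaiveRAG:
    def run(self, query): return f"naive_rag: {query}"

@register("graph_rag")
class GraphRAG:
    def run(self, query): return f"graph_rag: {query}"

@register("mm_rag")
class MultiModalRAG:
    def run(self, query): return f"mm_rag: {query}"

def build_pipeline(name):
    # Look up and instantiate the requested pipeline.
    return PIPELINES[name]()
```

This design keeps pipelines modular: adding a new RAG variant means registering one new class, with no changes to the dispatch code.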

## Installation (for RAG-Factory)

To set up the RAG-Factory environment, clone the repository and install the dependencies:

```bash
pip install -e .
```

## Usage (for RAG-Factory)

You can run predefined RAG pipelines using the RAG-Factory framework:

```bash
bash run.sh naive_rag/graph_rag/mm_rag
```

or

```bash
python main.py --config examples/graphrag/config.yaml
```

For more examples and detailed configurations, please refer to the examples/ directory in the RAG-Factory GitHub repository.
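As an illustration, a graph-RAG config passed via `--config` might look like the fragment below. The keys shown are hypothetical and do not reflect RAG-Factory's actual schema; consult the files under `examples/graphrag/` for the real format:

```yaml
# Hypothetical config sketch — keys are illustrative assumptions.
pipeline: graph_rag
llm:
  model: IDEA-FinAI/TouchstoneGPT-7B-Instruct
retriever:
  top_k: 5
```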

## Usage of TouchstoneGPT-7B-Instruct

This TouchstoneGPT-7B-Instruct model is a Qwen2-7B-Instruct-based LLM that can be used for text generation tasks, either standalone or as a component within RAG frameworks like Think-on-Graph 3.0. Below is a code snippet using the transformers library to load the tokenizer and model and generate content.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    "IDEA-FinAI/TouchstoneGPT-7B-Instruct",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("IDEA-FinAI/TouchstoneGPT-7B-Instruct")

prompt = (
    "What is the sentiment of the following financial post: Positive, Negative, or Neutral?\n"
    "sees #Apple at $150/share in a year (+36% from today) on growing services business."
)
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated text remains.
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

## Citation

If you find our work on Think-on-Graph 3.0 useful for your research and applications, please consider citing the paper:

```bibtex
@misc{wu2025ToG-3,
      title={Think-on-Graph 3.0: Efficient and Adaptive LLM Reasoning on Heterogeneous Graphs via Multi-Agent Dual-Evolving Context Retrieval},
      author={Xiaojun Wu and Cehao Yang and Xueyuan Lin and Chengjin Xu and Xuhui Jiang and Yuanliang Sun and Hui Xiong and Jia Li and Jian Guo},
      year={2025},
      eprint={2509.21710},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.21710},
}
```