
Improve model card: Add pipeline tag, library name, paper, and code links

#1 opened by nielsr (HF Staff)

Files changed (1): README.md (+46 −16)

README.md CHANGED
@@ -1,19 +1,28 @@
  ---
- license: mit
  datasets:
  - AceSearcher/Search-SFT
  - AceSearcher/Search-RFT-Prompts
  language:
  - en
- base_model:
- - meta-llama/Llama-3.1-8B-Instruct
  ---
- ## Introduction
- Here is the checkpoint used in the paper **AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play**. It uses `Llama-3.1-8B` as the backbone.

  ## Model Usage
  For question decomposition on QA tasks:
- ```
  from vllm import LLM, SamplingParams
  model_path = "AceSearcher/AceSearcher-8B"

@@ -41,7 +50,7 @@ generated_text = outputs[0].outputs[0].text
  ```

  For question decomposition on fact verification tasks:
- ```
  prompt_plan_claim = """Please break down the claim "{claim}" into multiple smaller sub-claims that each focus on a specific component of the original statement, making it easier for a model to verify.
  Begin each sub-claim with ###. If needed, refer to answers from earlier sub-claims using #1, #2, etc.
  Decomposed claim:"""
@@ -64,7 +73,7 @@ generated_text = outputs[0].outputs[0].text
  ```

  For question answering on subquestions:
- ```
  prompt = f"""You have the following context passages:
  {context_text}

@@ -73,7 +82,7 @@ If no answer is found in the context, use your own knowledge. Your answer needs
  ```

  For fact verification on subquestions:
- ```
  prompt = f"""You have the following context passages:
  {context_text}

@@ -83,7 +92,7 @@ Please only output Yes or No and do not give any explanation."""
  ```

  For question answering to generate the final answer:
- ```
  prompt = f"""You have the following passages:
  {passages}

@@ -96,7 +105,7 @@ Wrap your answer with <answer> and </answer> tags."""
  ```

  For fact verification tasks to generate the final answer:
- ```
  prompt = f"""You have the following passages:
  {passages}

@@ -108,15 +117,36 @@ Wrap your answer with <answer> and </answer> tags."""
  ```

  For decomposition on document-level financial reasoning tasks:
- ```
- decompose_prompt = """You have the following passages and table:\nPassages:\n{passage}\nPlease break down the question '{question}' into multiple specific sub-questions that address individual components of the original question, with the table and passages as the reference. Use ### to mark the start of each sub-question."""

- qa_prompt = """You have the following passages and table:\nPassages:\n{passage}\nFor the question '{question}', here is a referenced breakdown:\n{decompose}.\n\nWrite a Python program to solve the question. Store the final result in the variable ans."""

  question = "What would the change in furniture and fixtures between 2018 and 2019 be if furniture and fixtures were $5,000 thousand in 2018 instead? (in thousand)"

- context_text = "\n|||December 31,||\n||Useful Life|2019|2018|\n|Computer equipment and software|3 \u2013 5 years|$57,474|$52,055|\n|Furniture and fixtures|7 years|6,096|4,367|\n|Leasehold improvements|2 \u2013 6 years|22,800|9,987|\n|Renovation in progress|n/a|8|1,984|\n|Build-to-suit property|25 years|\u2014|51,058|\n|Total property and equipment, gross||86,378|119,451|\n|Less: accumulated depreciation and amortization||(49,852)|(42,197)|\n|Total property and equipment, net||$36,526|$77,254|\n 7. OTHER BALANCE SHEET AMOUNTS The components of property and equipment, net is as follows (in thousands): Depreciation expense for the years ended December 31, 2019, 2018, and 2017 was $11.8 million, $10.2 million, and $10.3 million, respectively.\n"

  decompose_prompt = decompose_prompt.replace("{passage}" , context_text)
  decompose_prompt = decompose_prompt.replace("{question}", question)
@@ -135,7 +165,7 @@ output = llm.generate(prompt, sampling_params)[0].outputs[0].text
  ## Citation
  If you find our paper or models helpful, please consider citing it as follows. Thank you!

- ```
  @inproceedings{
  xu2025acesearcher,
  title={AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play},
 
  ---
+ base_model:
+ - meta-llama/Llama-3.1-8B-Instruct
  datasets:
  - AceSearcher/Search-SFT
  - AceSearcher/Search-RFT-Prompts
  language:
  - en
+ license: mit
+ pipeline_tag: text-generation
+ library_name: transformers
  ---
+
+ # AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play
+
+ This repository contains the checkpoint for **AceSearcher-8B**, a large language model developed in the paper [AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play](https://huggingface.co/papers/2509.24193). It uses `Llama-3.1-8B-Instruct` as its backbone.
+
+ For the official code and more details, visit the [GitHub repository](https://github.com/ritaranx/AceSearcher/).
+
+ ## Paper Abstract
+ Search-augmented LLMs often struggle with complex reasoning tasks due to ineffective multi-hop retrieval and limited reasoning ability. We propose AceSearcher, a cooperative self-play framework that trains a single large language model (LLM) to alternate between two roles: a decomposer that breaks down complex queries and a solver that integrates retrieved contexts for answer generation. AceSearcher couples supervised fine-tuning on a diverse mixture of search, reasoning, and decomposition tasks with reinforcement fine-tuning optimized for final answer accuracy, eliminating the need for intermediate annotations. Extensive experiments on three reasoning-intensive tasks across 10 datasets show that AceSearcher outperforms state-of-the-art baselines, achieving an average exact match improvement of 7.6%. Remarkably, on document-level finance reasoning tasks, AceSearcher-32B matches the performance of the DeepSeek-V3 model using less than 5% of its parameters. Even at smaller scales (1.5B and 8B), AceSearcher often surpasses existing search-augmented LLMs with up to 9x more parameters, highlighting its exceptional efficiency and effectiveness in tackling complex reasoning tasks. Our code will be published at this https URL and this https URL.
 
  ## Model Usage
  For question decomposition on QA tasks:
+ ```python
  from vllm import LLM, SamplingParams
  model_path = "AceSearcher/AceSearcher-8B"
 
 
  ```

  For question decomposition on fact verification tasks:
+ ```python
  prompt_plan_claim = """Please break down the claim "{claim}" into multiple smaller sub-claims that each focus on a specific component of the original statement, making it easier for a model to verify.
  Begin each sub-claim with ###. If needed, refer to answers from earlier sub-claims using #1, #2, etc.
  Decomposed claim:"""
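The prompt above lets later sub-claims refer back to earlier answers via `#1`, `#2`, etc. A minimal sketch of resolving those references before verification; `resolve_refs` and the example strings are illustrative, not from the AceSearcher codebase:

```python
import re

def resolve_refs(subclaim: str, answers: list[str]) -> str:
    # Replace each "#k" reference with the answer to the k-th earlier sub-claim.
    return re.sub(r"#(\d+)", lambda m: answers[int(m.group(1)) - 1], subclaim)

resolved = resolve_refs("Was #1 released in 2020?", ["GPT-3"])
```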
 
  ```

  For question answering on subquestions:
+ ```python
  prompt = f"""You have the following context passages:
  {context_text}

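The subquestion-answering prompt above splices a block of retrieved passages in as `{context_text}`. One way to assemble that block from a retriever's results is sketched here; `build_qa_prompt` and its final `Question:` line are my own guesses, since the card truncates the template at this point:

```python
def build_qa_prompt(passages: list[str], question: str) -> str:
    # Number the retrieved passages and splice them into the card's template.
    context_text = "\n".join(f"Passage {i + 1}: {p}" for i, p in enumerate(passages))
    return f"""You have the following context passages:
{context_text}

Question: {question}"""  # the "Question:" line is a guess; the card truncates here

p = build_qa_prompt(["Paris is the capital of France."], "What is the capital of France?")
```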
 
 
  ```

  For fact verification on subquestions:
+ ```python
  prompt = f"""You have the following context passages:
  {context_text}

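Since this verification prompt asks the model to output exactly Yes or No, downstream code needs to normalize the raw generation. A hedged sketch of such a check; `parse_verdict` is my own helper, not part of the release:

```python
def parse_verdict(output: str) -> bool:
    # The prompt requests exactly "Yes" or "No"; tolerate case, whitespace, and
    # trailing punctuation, and fail loudly on anything else.
    token = output.strip().split()[0].strip(".,").lower()
    if token not in {"yes", "no"}:
        raise ValueError(f"unexpected verdict: {output!r}")
    return token == "yes"
```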
 
 
  ```

  For question answering to generate the final answer:
+ ```python
  prompt = f"""You have the following passages:
  {passages}

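The final-answer prompt instructs the model to wrap its answer in `<answer>` and `</answer>` tags, so extraction reduces to a small regex. A sketch, assuming the tag pair appears at most once; `extract_answer` is illustrative, not from the official code:

```python
import re

def extract_answer(output: str):
    # Grab the text between <answer> and </answer>; None if the model omitted the tags.
    m = re.search(r"<answer>(.*?)</answer>", output, flags=re.DOTALL)
    return m.group(1).strip() if m else None
```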
 
 
  ```

  For fact verification tasks to generate the final answer:
+ ```python
  prompt = f"""You have the following passages:
  {passages}

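For the verification variant, the tagged answer is again Yes or No. A sketch of mapping it onto the SUPPORTED/REFUTED labels common in fact-verification benchmarks; the label names and `final_verdict` helper are my assumption, not from the paper:

```python
import re

def final_verdict(output: str) -> str:
    # Read the Yes/No between <answer> tags and map it to a benchmark-style label.
    m = re.search(r"<answer>\s*(Yes|No)\s*</answer>", output, flags=re.IGNORECASE)
    if m is None:
        raise ValueError("no tagged verdict found")
    return "SUPPORTED" if m.group(1).lower() == "yes" else "REFUTED"
```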
 
 
  ```

  For decomposition on document-level financial reasoning tasks:
+ ```python
+ decompose_prompt = """You have the following passages and table:\nPassages:\n{passage}\nPlease break down the question '{question}' into multiple specific sub-questions that address individual components of the original question, with the table and passages as the reference. Use ### to mark the start of each sub-question."""

+ qa_prompt = """You have the following passages and table:\nPassages:\n{passage}\nFor the question '{question}', here is a referenced breakdown:\n{decompose}.\n\nWrite a Python program to solve the question. Store the final result in the variable ans."""

  question = "What would the change in furniture and fixtures between 2018 and 2019 be if furniture and fixtures were $5,000 thousand in 2018 instead? (in thousand)"

+ context_text = "\n|||December 31,||\n||Useful Life|2019|2018|\n|Computer equipment and software|3 \u2013 5 years|$57,474|$52,055|\n|Furniture and fixtures|7 years|6,096|4,367|\n|Leasehold improvements|2 \u2013 6 years|22,800|9,987|\n|Renovation in progress|n/a|8|1,984|\n|Build-to-suit property|25 years|\u2014|51,058|\n|Total property and equipment, gross||86,378|119,451|\n|Less: accumulated depreciation and amortization||(49,852)|(42,197)|\n|Total property and equipment, net||$36,526|$77,254|\n 7. OTHER BALANCE SHEET AMOUNTS The components of property and equipment, net is as follows (in thousands): Depreciation expense for the years ended December 31, 2019, 2018, and 2017 was $11.8 million, $10.2 million, and $10.3 million, respectively.\n"

  decompose_prompt = decompose_prompt.replace("{passage}" , context_text)
  decompose_prompt = decompose_prompt.replace("{question}", question)
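The `qa_prompt` above asks the model to emit a Python program that stores its result in `ans`. A sketch of executing such a program and reading the result back out; `generated_program` is illustrative model output for the furniture question (2019 value $6,096 thousand from the table, hypothetical 2018 value $5,000 thousand), and `exec` on untrusted model output should only ever run inside a sandbox:

```python
# Illustrative model output for the furniture-and-fixtures question above.
generated_program = """
furniture_2019 = 6096
furniture_2018_hypothetical = 5000
ans = furniture_2019 - furniture_2018_hypothetical
"""

namespace = {}
exec(generated_program, namespace)  # unsafe outside a sandbox
ans = namespace["ans"]  # -> 1096 (thousand)
```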
 
  ## Citation
  If you find our paper or models helpful, please consider citing it as follows. Thank you!

+ ```bibtex
  @inproceedings{
  xu2025acesearcher,
  title={AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play},