AutoBool-Qwen4b-Reasoning

This model is part of the AutoBool framework, a reinforcement learning approach for training large language models to generate high-quality Boolean queries for systematic literature reviews.

Model Description

This variant uses explicit chain-of-thought reasoning. The model is instructed to provide detailed reasoning about the query construction process inside <think></think> tags before generating the final Boolean query.

Base Model: Qwen/Qwen3-4B
Training Method: GRPO (Group Relative Policy Optimization) with LoRA fine-tuning
Prompt Strategy: Chain-of-thought reasoning
- System instruction: "Your reasoning process should be enclosed within <think></think>, and the final Boolean query must be enclosed within <answer></answer> tags"
- Output format: <think>[Detailed step-by-step reasoning explaining the query construction process]</think><answer>[Boolean query]</answer>
- Provides full explanation of term selection, MeSH terms, field tags, wildcards, and Boolean logic
Domain: Biomedical literature search (PubMed)
Task: Boolean query generation for high-recall retrieval

🚀 Interactive Demo

Try out our query generation models directly in your browser! The demo lets you compare our different reasoning strategies (Standard, Conceptual, Objective, and No-Reasoning) in real time.

Live Demo: AutoBool on Hugging Face Spaces

Training Details

The model was trained using:

Optimization: GRPO (Group Relative Policy Optimization)
Fine-tuning: LoRA (Low-Rank Adaptation)
Dataset: wshuai190/pubmed-pmc-sr-filtered
Reward Function: Combines syntactic validity, format correctness, and retrieval effectiveness
Reasoning Approach: Explicit thinking process with structured tags
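
The exact reward weights and retrieval metric are not given in this card, so the following is only a minimal sketch of how the three reward components and GRPO's group-relative normalization could fit together. The names `toy_reward` and `group_relative_advantages`, the all-or-nothing gating on format and syntax, and the use of recall as the effectiveness signal are assumptions for illustration, not the AutoBool implementation.

```python
import re
from statistics import mean, pstdev

# Assumed output shape: reasoning in <think>, query in <answer>, nothing else.
OUTPUT_RE = re.compile(
    r"\s*<think>(.*?)</think>\s*<answer>(.*?)</answer>\s*", re.DOTALL
)


def format_ok(output: str) -> bool:
    """Format correctness: the whole output matches the tag structure."""
    return OUTPUT_RE.fullmatch(output) is not None


def extract_query(output: str):
    """Return the text inside <answer>, or None if the format is wrong."""
    m = OUTPUT_RE.fullmatch(output)
    return m.group(2).strip() if m else None


def parens_balanced(query: str) -> bool:
    """A crude stand-in for syntactic validity: parentheses must balance."""
    depth = 0
    for ch in query:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def recall(retrieved: set, relevant: set) -> float:
    """Fraction of relevant documents that the query retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


def toy_reward(output: str, retrieved: set, relevant: set) -> float:
    """Hypothetical reward: 0 for bad format or syntax, otherwise recall."""
    query = extract_query(output)
    if not query or not parens_balanced(query):
        return 0.0
    return recall(retrieved, relevant)


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantage: standardize rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

In this sketch, `retrieved` would come from running each sampled query against PubMed, and the advantages would weight the policy update for each sample in the group.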

Intended Use

This model is designed for:

Generating Boolean queries for systematic literature reviews
High-recall biomedical information retrieval
Supporting evidence synthesis in healthcare and biomedical research
Applications where reasoning transparency is valuable

How to Use

from transformers import AutoTokenizer, AutoModelForCausalLM
import re

model_name = "ielabgroup/Autobool-Qwen4b-Reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Define your systematic review topic
topic = "Diagnostic accuracy of endoscopic ultrasonography (EUS) for the preoperative locoregional staging of primary gastric cancer"

# Construct the prompt with system and user messages
messages = [
    {"role": "system", "content": "You are an expert systematic review information specialist.\nYou are tasked to formulate a systematic review Boolean query in response to a research topic.\nYour reasoning process should be enclosed within <think></think>, and the final Boolean query must be enclosed within <answer></answer> tags. Do not include anything outside of these tags."},
    {"role": "user", "content": f'You are given a systematic review research topic, with the topic title "{topic}".\nYour task is to generate a highly effective Boolean query in MEDLINE format for PubMed.\nThe query should balance **high recall** (capturing all relevant studies) with **reasonable precision** (avoiding irrelevant results):\n- Use both free-text terms and MeSH terms (e.g., chronic pain[tiab], Pain[mh]).\n- **Do not wrap terms or phrases in double quotes**, as this disables automatic term mapping (ATM).\n- Combine synonyms or related terms within a concept using OR.\n- Combine different concepts using AND.\n- Use wildcards (*) to capture word variants (e.g., vaccin* → vaccine, vaccination):\n  - Terms must have ≥4 characters before the * (e.g., colo*)\n  - Wildcards work with field tags (e.g., breastfeed*[tiab]).\n- Field tags limit the search to specific fields and disable ATM.\n- Do not include date limits.\n- Tag terms using appropriate fields (e.g., covid-19[ti] vaccine[ti] children[ti]) when needed.\n**Only use the following allowed field tags:**\nTitle: [ti], Abstract: [ab], Title/Abstract: [tiab]\nMeSH: [mh], Major MeSH: [majr], Supplementary Concept: [nm]\nText Words: [tw], All Fields: [all]\nPublication Type: [pt], Language: [la]\n\nOutput your full reasoning inside <think></think>.\nOutput the final Boolean query inside <answer></answer>.\nDo not include any content outside these tags.'}
]

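# Generate the query
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# max_new_tokens bounds only the generated text (max_length would also count the prompt)
outputs = model.generate(**inputs, max_new_tokens=4096)
# Decode only the newly generated tokens: the prompt itself contains literal
# <think></think> and <answer></answer> tags that would otherwise match first
generated = outputs[0][inputs["input_ids"].shape[1]:]
response = tokenizer.decode(generated, skip_special_tokens=True)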

# Extract reasoning and query
reasoning_match = re.search(r'<think>(.*?)</think>', response, re.DOTALL)
query_match = re.search(r'<answer>(.*?)</answer>', response, re.DOTALL)

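if reasoning_match and query_match:
    reasoning = reasoning_match.group(1).strip()
    query = query_match.group(1).strip()
    print("Reasoning:", reasoning)
    print("\nQuery:", query)
else:
    # The output did not follow the expected tag format; show it for inspection
    print("Could not parse output:", response)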

The model produces output in the following format:

<think>
[Detailed step-by-step reasoning explaining the query construction process,
including term selection, MeSH terms, field tags, wildcards, and Boolean logic]
</think>
<answer>
[Final Boolean query]
</answer>
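
Before running a generated query, it can be worth checking it against the constraints stated in the prompt above. The sketch below is one possible way to do that; `ALLOWED_TAGS` and `check_query` are hypothetical helper names, not part of AutoBool, and the checks are deliberately shallow (they do not parse Boolean operators).

```python
import re

# Field tags permitted by the prompt shown in "How to Use".
ALLOWED_TAGS = {"ti", "ab", "tiab", "mh", "majr", "nm", "tw", "all", "pt", "la"}


def check_query(query: str) -> list[str]:
    """Return a list of rule violations; an empty list means none were found."""
    problems = []

    # The prompt asks for no double quotes, since they disable term mapping.
    if '"' in query:
        problems.append("contains double quotes")

    # Parentheses must balance and never close before they open.
    depth = 0
    for ch in query:
        depth += (ch == "(") - (ch == ")")
        if depth < 0:
            break
    if depth != 0:
        problems.append("unbalanced parentheses")

    # Every [tag] must come from the allowed list.
    for tag in re.findall(r"\[([^\]]*)\]", query):
        if tag.strip().lower() not in ALLOWED_TAGS:
            problems.append(f"field tag not in allowed list: [{tag}]")

    # Wildcards need at least 4 characters before the asterisk.
    for stem in re.findall(r"([A-Za-z0-9-]*)\*", query):
        if len(stem) < 4:
            problems.append(f"wildcard with fewer than 4 leading characters: {stem}*")

    return problems
```

Calling `check_query(query)` on the extracted `<answer>` content returns human-readable messages that can be logged or used to trigger a retry.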

Advantages

Provides interpretable reasoning process
Can help understand query construction decisions
May improve query quality through structured thinking

Limitations

Optimized specifically for PubMed Boolean query syntax
Performance may vary on non-biomedical domains
Requires domain knowledge for effective prompt engineering
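
Because the queries target PubMed syntax, one way to sanity-check a generated query is to submit it to PubMed and look at the hit count. The sketch below uses the NCBI E-utilities ESearch endpoint with only the standard library; `build_esearch_url` and `count_hits` are illustrative names, and the example assumes anonymous access, which NCBI rate-limits.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(query: str, retmax: int = 0) -> str:
    """Build an ESearch URL for PubMed; retmax=0 asks for the count only."""
    params = {"db": "pubmed", "term": query, "retmode": "json", "retmax": retmax}
    return f"{ESEARCH}?{urlencode(params)}"


def count_hits(query: str) -> int:
    """Return the number of PubMed records matching the query (network call)."""
    with urlopen(build_esearch_url(query), timeout=30) as resp:
        data = json.load(resp)
    return int(data["esearchresult"]["count"])
```

A count of zero or an implausibly large count is a quick signal that the query needs revision before it is used in a review.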

Citation

If you use this model, please cite:

@inproceedings{autobool2026,
  title={AutoBool: Reinforcement Learning for Boolean Query Generation in Systematic Reviews},
  author={Shuai Wang and Harrisen Scells and Bevan Koopman and Guido Zuccon},
  booktitle={Proceedings of the 2026 Conference of the European Chapter of the Association for Computational Linguistics (EACL)},
  year={2026}
}