# Qwen2.5-1.5B Query Optimizer
A fine-tuned version of Qwen/Qwen2.5-1.5B-Instruct trained to rewrite loose, conversational user queries into clear, retrieval-focused enterprise document search queries.
## Model Details
| Property | Value |
|---|---|
| Base model | Qwen/Qwen2.5-1.5B-Instruct |
| Fine-tuning method | QLoRA (4-bit NF4 + LoRA) |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training examples | 481 (90% of 535 total) |
| Eval examples | 54 (10% of 535 total) |
| Training epochs | 3 |
| Effective batch size | 16 (4 × 4 gradient accumulation) |
| Learning rate | 2e-4 (cosine schedule) |
| Max sequence length | 256 |
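The hyperparameters in the table can be expressed with PEFT and bitsandbytes roughly as follows. This is a hedged sketch, not the original training script; the compute dtype and `task_type` are assumptions not stated in the table.

```python
# Sketch of a QLoRA setup matching the table above (assumed, not the original script).
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit base weights
    bnb_4bit_quant_type="nf4",             # NF4 quantization, as in the table
    bnb_4bit_compute_dtype=torch.float16,  # assumption: fp16 compute
)

lora_config = LoraConfig(
    r=16,           # LoRA rank from the table
    lora_alpha=32,  # LoRA alpha from the table
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",  # assumption: causal-LM fine-tuning
)
```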
## Intended Use
This model is designed for enterprise AI search pipelines where raw user queries need to be normalized before being passed to a retrieval system (e.g., vector search, BM25, or hybrid search).
- **Input:** a natural, conversational user query
- **Output:** a concise, retrieval-optimized search query
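In a pipeline, the rewriter sits between the user and the retriever: the raw query is rewritten first, and the rewritten query is what gets embedded or matched. A minimal sketch, with `optimize_query` stubbed out (the real implementation is in the Example section below) and a hypothetical keyword retriever standing in for a production search backend:

```python
# Sketch of where the query rewriter sits in a retrieval pipeline.
# optimize_query is stubbed here; in practice it calls the fine-tuned model.
def optimize_query(user_query: str) -> str:
    # Stand-in for the model call.
    canned = {
        "how do i request time off?":
            "employee leave request procedure and time-off policy",
    }
    return canned.get(user_query, user_query)

def keyword_search(query: str, docs: list[str]) -> list[str]:
    # Hypothetical retriever: rank docs by terms shared with the query.
    terms = set(query.lower().split())
    scored = [(len(terms & set(d.lower().split())), d) for d in docs]
    return [d for score, d in sorted(scored, reverse=True) if score > 0]

docs = [
    "Employee leave request procedure: submit a time-off form to HR.",
    "Office parking policy and badge access.",
]
raw = "how do i request time off?"
hits = keyword_search(optimize_query(raw), docs)
print(hits[0])  # the leave-procedure document ranks first
```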
## Example
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "abi-commits/qwen-query-optimizer"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

SYSTEM_PROMPT = (
    "You are a query optimization agent. Rewrite user queries into clear, "
    "retrieval-focused enterprise document search queries. "
    "Do not add new information. Do not hallucinate."
)

def optimize_query(user_query: str) -> str:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_query},
    ]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=80,
            do_sample=False,
            repetition_penalty=1.1,
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.pad_token_id,
        )
    # Decode only the newly generated tokens, not the prompt.
    generated = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(generated, skip_special_tokens=True).strip()

# Examples
print(optimize_query("how do i request time off?"))
# → "employee leave request procedure and time-off policy"
print(optimize_query("what's the refund policy?"))
# → "refund policy terms and conditions for customer returns"
```
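Even with greedy decoding, a small model can occasionally return an empty or degenerate rewrite. A defensive wrapper that falls back to the raw query is cheap insurance in a production pipeline; this sketch is an assumption, not part of the model card, and takes the optimizer as a parameter so it can be tested with a stub:

```python
def safe_optimize(user_query: str, optimizer, max_len: int = 200) -> str:
    # Fall back to the raw query if the rewrite is empty or suspiciously long.
    rewritten = optimizer(user_query)
    if not rewritten or len(rewritten) > max_len:
        return user_query
    return rewritten

# In practice: safe_optimize(query, optimize_query)
print(safe_optimize("what's the refund policy?", lambda q: ""))
# falls back to the raw query when the model returns nothing
```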