# Qwen2.5-1.5B Query Optimizer

A fine-tuned version of Qwen/Qwen2.5-1.5B-Instruct trained to rewrite loose, conversational user queries into clear, retrieval-focused enterprise document search queries.

## Model Details

| Property | Value |
|---|---|
| Base model | Qwen/Qwen2.5-1.5B-Instruct |
| Fine-tuning method | QLoRA (4-bit NF4 + LoRA) |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training examples | 481 (90% of 535 total) |
| Eval examples | 54 (10% of 535 total) |
| Training epochs | 3 |
| Effective batch size | 16 (4 × 4 gradient accumulation) |
| Learning rate | 2e-4 (cosine schedule) |
| Max sequence length | 256 |
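
For reference, the hyperparameters above map roughly onto the following `peft`/`bitsandbytes` configuration. This is a minimal sketch, not the exact training script; anything not listed in the table (compute dtype, dropout, warmup, optimizer) is an assumption.

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: compute dtype not stated in the card
)

# LoRA adapters on all attention and MLP projections
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.05,  # assumption: dropout not stated in the card
    task_type="CAUSAL_LM",
)

# Effective batch size 16 = 4 per device x 4 gradient accumulation steps;
# sequences were truncated/padded to 256 tokens during training.
training_args = TrainingArguments(
    output_dir="qwen-query-optimizer",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
)
```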

## Intended Use

This model is designed for enterprise AI search pipelines where raw user queries need to be normalized before being passed to a retrieval system (e.g., vector search, BM25, or hybrid search).

- **Input:** A natural, conversational user query
- **Output:** A concise, retrieval-optimized search query

## Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "abi-commits/qwen-query-optimizer"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

SYSTEM_PROMPT = (
    "You are a query optimization agent. Rewrite user queries into clear, "
    "retrieval-focused enterprise document search queries. "
    "Do not add new information. Do not hallucinate."
)

def optimize_query(user_query: str) -> str:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_query},
    ]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=80,
            do_sample=False,  # greedy decoding for deterministic rewrites
            repetition_penalty=1.1,
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.pad_token_id,
        )
    # Decode only the newly generated tokens, not the prompt
    generated = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(generated, skip_special_tokens=True).strip()

# Examples
print(optimize_query("how do i request time off?"))
# → "employee leave request procedure and time-off policy"

print(optimize_query("what's the refund policy?"))
# → "refund policy terms and conditions for customer returns"
```