abi-commits committed on
Commit 409c41d · verified · 1 parent: ecf2892

Updated readme

Files changed (1): README.md (+90 −3)
README.md CHANGED
@@ -1,3 +1,90 @@
- ---
- license: mit
- ---
---
license: apache-2.0
base_model: Qwen/Qwen2.5-1.5B-Instruct
tags:
- qwen2.5
- fine-tuned
- qlora
- query-optimization
- enterprise-search
- text2text-generation
language:
- en
pipeline_tag: text-generation
---

# Qwen2.5-1.5B Query Optimizer

A fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) trained to rewrite loose, conversational user queries into clear, retrieval-focused enterprise document search queries.

## Model Details
21
+
22
+ | Property | Value |
23
+ |---|---|
24
+ | **Base model** | Qwen/Qwen2.5-1.5B-Instruct |
25
+ | **Fine-tuning method** | QLoRA (4-bit NF4 + LoRA) |
26
+ | **LoRA rank** | 16 |
27
+ | **LoRA alpha** | 32 |
28
+ | **Target modules** | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
29
+ | **Training examples** | 481 (90% of 535 total) |
30
+ | **Eval examples** | 54 (10% of 535 total) |
31
+ | **Training epochs** | 3 |
32
+ | **Effective batch size** | 16 (4 × 4 gradient accumulation) |
33
+ | **Learning rate** | 2e-4 (cosine schedule) |
34
+ | **Max sequence length** | 256 |
35
+
## Intended Use

This model is designed for **enterprise AI search pipelines** where raw user queries need to be normalized before being passed to a retrieval system (e.g., vector search, BM25, or hybrid search).

**Input**: A natural, conversational user query
**Output**: A concise, retrieval-optimized search query

### Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "abi-commits/qwen-query-optimizer"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

SYSTEM_PROMPT = (
    "You are a query optimization agent. Rewrite user queries into clear, "
    "retrieval-focused enterprise document search queries. "
    "Do not add new information. Do not hallucinate."
)

def optimize_query(user_query: str) -> str:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_query},
    ]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=80,
            do_sample=False,
            repetition_penalty=1.1,
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.pad_token_id,
        )
    generated = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(generated, skip_special_tokens=True).strip()

# Examples
print(optimize_query("how do i request time off?"))
# → "employee leave request procedure and time-off policy"

print(optimize_query("what's the refund policy?"))
# → "refund policy terms and conditions for customer returns"
```
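
Downstream, the rewritten query simply replaces the raw one at retrieval time. A minimal sketch of that hand-off, using a toy keyword-overlap ranker as a stand-in for a real BM25 or vector index (the documents and helper names here are invented for illustration):

```python
def keyword_score(query: str, doc: str) -> int:
    # Toy relevance: count of shared whitespace-delimited terms.
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms)

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by overlap with the (optimized) query.
    return sorted(docs, key=lambda d: keyword_score(query, d), reverse=True)[:k]

docs = [
    "employee leave request procedure and time-off policy",
    "refund policy terms and conditions for customer returns",
    "office network VPN setup guide",
]

# In a real pipeline this string would come from optimize_query(raw_query).
rewritten = "employee leave request procedure and time-off policy"
print(retrieve(rewritten, docs, k=1)[0])
# → "employee leave request procedure and time-off policy"
```

The rewritten query matches the target document's vocabulary exactly, which is the point of normalizing conversational queries before retrieval; a raw query like "how do i request time off?" shares almost no surface terms with it.
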