aravula7 committed (verified) · Commit 1c1f833 · 1 parent: 4adf762

Create README.md
---
language: en
pipeline_tag: text-generation
library_name: transformers
tags:
- text-to-sql
- sql
- postgresql
- qwen2.5
- qlora
- peft
- quantization
base_model: Qwen/Qwen2.5-3B-Instruct
license: other
---

# Qwen2.5-3B Text-to-SQL (PostgreSQL) — Fine-Tuned

## Overview

This repository contains a fine-tuned **Qwen/Qwen2.5-3B-Instruct** model specialized for **Text-to-SQL** generation in **PostgreSQL**, targeting a realistic e-commerce and subscriptions analytics schema.

Artifacts are organized under a single Hub repo using subfolders:

- `fp16/` — merged FP16 model (recommended)
- `int8/` — quantized INT8 checkpoint (smaller footprint)
- `lora_adapter/` — LoRA adapter only (for further tuning / research)

## Intended use

**Use cases**

- Convert natural language questions into PostgreSQL queries.
- Analytical queries over common e-commerce tables (customers, orders, products, subscriptions) plus ML prediction tables (churn/forecast).

**Not for**

- Direct execution on sensitive or production databases without validation (schema checks, allow-lists, sandboxed execution).
- Security-critical contexts (SQL injection prevention and access control must be handled outside the model).

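The validation point above can be sketched with a minimal stdlib guard. `is_safe_select` and `ALLOWED_TABLES` are hypothetical names, and regex matching is a deliberate simplification; a real gate should parse the SQL properly (e.g. with `sqlglot`) and enforce permissions on the database side.

```python
import re

# Hypothetical allow-list for the schema described in this card.
ALLOWED_TABLES = {"customers", "orders", "products", "subscriptions"}

def is_safe_select(sql: str, allowed=ALLOWED_TABLES) -> bool:
    """Rough guard: a single read-only statement over allow-listed tables.

    Simplified sketch only; it is not a substitute for real access control
    or a proper SQL parser.
    """
    stripped = sql.strip().rstrip(";")
    # Reject multi-statement payloads and anything that is not a SELECT/WITH query.
    if ";" in stripped or not re.match(r"(?is)^\s*(select|with)\b", stripped):
        return False
    # Every table referenced after FROM/JOIN must be on the allow-list.
    tables = re.findall(r"(?is)\b(?:from|join)\s+([a-z_][a-z0-9_]*)", stripped)
    return all(t.lower() in allowed for t in tables)
```
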
## Training summary

| Item | Value |
|---|---|
| Base model | Qwen/Qwen2.5-3B-Instruct |
| Fine-tuning method | QLoRA (4-bit) |
| Optimizer | paged_adamw_8bit |
| Epochs | 4 |
| Decoding | Greedy |
| Tracking | MLflow (DagsHub) |

## Evaluation summary (100 test examples)

Primary metric: **parseable PostgreSQL SQL** (validated with `sqlglot`).
Secondary metric: **exact match** (strict string match against the reference SQL).

| Model | Parseable SQL | Exact match | Mean latency (s) | P50 (s) | P95 (s) |
|---|---:|---:|---:|---:|---:|
| qwen_baseline_fp16 | 1.00 | 0.09 | 0.405 | 0.422 | 0.624 |
| qwen_finetuned_fp16 | 0.93 | 0.13 | 0.527 | 0.711 | 0.739 |
| qwen_finetuned_int8 | 0.93 | 0.13 | 2.672 | 3.454 | 3.623 |
| qwen_finetuned_fp16_strict | 1.00 | 0.15 | 0.433 | 0.427 | 0.736 |
| qwen_finetuned_int8_strict | 0.99 | 0.20 | 2.152 | 2.541 | 3.610 |
| gpt-4o-mini | 1.00 | 0.04 | 1.616 | 1.551 | 2.820 |
| claude-3.5-haiku | 0.99 | 0.07 | 1.735 | 1.541 | 2.697 |

Notes:

- The “strict” variants used a stricter system instruction to return **SQL only** (no prose, no markdown), which improved parse reliability and exact match.
- INT8 reduced memory usage but was slower in this specific GPU evaluation setup.

## How to load

### Load the merged FP16 model (recommended)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "aravula7/qwen-sql-finetuning"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder="fp16")
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    subfolder="fp16",
    torch_dtype="auto",  # keep the checkpoint's native precision
    device_map="auto",   # place weights on available GPU(s) if present
)
```

### Load the INT8 model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "aravula7/qwen-sql-finetuning"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder="int8")
# Loading an 8-bit checkpoint typically requires `bitsandbytes` to be installed.
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    subfolder="int8",
    device_map="auto",
)
```

### Load base model + LoRA adapter

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen2.5-3B-Instruct"
repo_id = "aravula7/qwen-sql-finetuning"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")

# Attach the LoRA adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base, repo_id, subfolder="lora_adapter")
```

## Example inference

Below is a minimal example that encourages **SQL-only** output. It uses the tokenizer's chat template, which is the intended prompt format for the instruct-tuned base model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "aravula7/qwen-sql-finetuning"
tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder="fp16")
model = AutoModelForCausalLM.from_pretrained(
    repo_id, subfolder="fp16", torch_dtype="auto", device_map="auto"
)

system = "Return ONLY the PostgreSQL query. Do NOT include explanations, markdown, code fences, or commentary."
schema = "Table: customers (customer_id, email, state)\nTable: orders (order_id, customer_id, order_timestamp)"
request = "Show the number of orders per customer in 2025."

messages = [
    {"role": "system", "content": system},
    {"role": "user", "content": f"Schema:\n{schema}\n\nRequest:\n{request}"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    # Greedy decoding, matching the evaluation setup.
    out = model.generate(inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(out[0, inputs.shape[-1]:], skip_special_tokens=True))
```
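
Even with the strict SQL-only instruction, a model can occasionally wrap its answer in markdown fences, so serving or evaluation code may want a small normalization step. `extract_sql` is a hypothetical helper, not part of this repo's pipeline:

```python
import re

def extract_sql(text: str) -> str:
    """Strip markdown code fences from model output, if any are present."""
    # Prefer the contents of a fenced ```sql ... ``` block when one exists.
    match = re.search(r"```(?:sql)?\s*(.*?)```", text, flags=re.DOTALL | re.IGNORECASE)
    if match:
        return match.group(1).strip()
    # Otherwise assume the whole response is the query.
    return text.strip()
```
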

## License

This repository is a fine-tuned derivative of the base model listed in the metadata. Please follow the licensing terms of the base model and any dataset constraints used for training.

## Reproducibility

Training and evaluation were tracked with MLflow on DagsHub. The associated GitHub/DagsHub repository contains the notebook, data splits, and logged runs.