rajaykumar12959 committed on
Commit 331e226 · verified · 1 Parent(s): 6bcb1df

Update README.md

  1. README.md +276 -2
README.md CHANGED
@@ -5,17 +5,291 @@ tags:
5
  - transformers
6
  - unsloth
7
  - gemma2
 
 
 
8
  license: apache-2.0
9
  language:
10
  - en
 
 
 
11
  ---
12
 
13
- # Uploaded finetuned model
14
 
15
  - **Developed by:** rajaykumar12959
16
  - **License:** apache-2.0
17
- - **Finetuned from model :** unsloth/gemma-2-2b-it-bnb-4bit
 
 
 
18
 
19
  This gemma2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
20
 
21
  [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

5
  - transformers
6
  - unsloth
7
  - gemma2
8
+ - text-to-sql
9
+ - qlora
10
+ - sql-generation
11
  license: apache-2.0
12
  language:
13
  - en
14
+ datasets:
15
+ - gretelai/synthetic_text_to_sql
16
+ pipeline_tag: text-generation
17
  ---
18
 
19
+ # Gemma-2-2B Text-to-SQL QLoRA Fine-tuned Model
20
 
21
  - **Developed by:** rajaykumar12959
22
  - **License:** apache-2.0
23
+ - **Finetuned from model:** unsloth/gemma-2-2b-it-bnb-4bit
24
+ - **Dataset:** gretelai/synthetic_text_to_sql
25
+ - **Task:** Text-to-SQL Generation
26
+ - **Fine-tuning Method:** QLoRA (Quantized Low-Rank Adaptation)
27
 
28
  This gemma2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
29
 
30
  [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
31
+
32
+ ## Model Description
33
+
34
+ This model is fine-tuned to generate SQL queries from natural language questions and database schemas. It targets complex multi-table queries that require JOINs, aggregations, filtering, and other advanced SQL operations.
35
+
36
+ ### Key Features
37
+
38
+ - ✅ **Multi-table JOINs** (INNER, LEFT, RIGHT)
39
+ - ✅ **Aggregation functions** (SUM, COUNT, AVG, MIN, MAX)
40
+ - ✅ **GROUP BY and HAVING clauses**
41
+ - ✅ **Complex WHERE conditions**
42
+ - ✅ **Subqueries and CTEs**
43
+ - ✅ **Date/time operations**
44
+ - ✅ **String functions and pattern matching**
45
+
46
+ ## Training Configuration
47
+
48
+ The model was fine-tuned using QLoRA with the following configuration:
49
+
50
+ ```python
51
+ # LoRA Configuration
52
+ r = 16 # Rank: 16 is a good balance for 2B models
53
+ target_modules = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
54
+ lora_alpha = 16
55
+ lora_dropout = 0
56
+ bias = "none"
57
+ use_gradient_checkpointing = "unsloth"
58
+
59
+ # Training Parameters
60
+ max_seq_length = 2048
61
+ per_device_train_batch_size = 2
62
+ gradient_accumulation_steps = 4 # Effective batch size = 8
63
+ warmup_steps = 5
64
+ max_steps = 100 # Demo configuration - increase to 300+ for production
65
+ learning_rate = 2e-4
66
+ optim = "adamw_8bit" # 8-bit optimizer for memory efficiency
67
+ weight_decay = 0.01
68
+ lr_scheduler_type = "linear"
69
+ ```
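
The batch-size comment in the configuration above can be checked with a little arithmetic. The following is a minimal sketch, assuming a single GPU (as listed under Hardware Requirements) and the 100,000-example dataset size stated in this README; it estimates how many examples a 100-step run actually sees.

```python
# Values copied from the training configuration above.
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
max_steps = 100

# Assumption: one GPU, and the dataset size quoted in this README.
num_gpus = 1
dataset_size = 100_000

# One optimizer step consumes this many examples.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_gpus

# Total examples consumed over the whole run, and the share of the dataset that represents.
examples_seen = max_steps * effective_batch_size
fraction_of_dataset = examples_seen / dataset_size

print(effective_batch_size)          # 8
print(examples_seen)                 # 800
print(f"{fraction_of_dataset:.1%}")  # 0.8%
```

Under these assumptions the demo run covers well under one epoch, which is why the configuration suggests raising `max_steps` for production use.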
70
+
71
+ ## Installation
72
+
73
+ ```bash
74
+ pip install unsloth transformers torch trl datasets
75
+ ```
76
+
77
+ ## Usage
78
+
79
+ ### Loading the Model
80
+
81
+ ```python
82
+ from unsloth import FastLanguageModel
83
+ import torch
84
+
85
+ max_seq_length = 2048
86
+ dtype = None
87
+ load_in_4bit = True
88
+
89
+ model, tokenizer = FastLanguageModel.from_pretrained(
90
+ model_name = "rajaykumar12959/gemma-2-2b-text-to-sql-qlora",
91
+ max_seq_length = max_seq_length,
92
+ dtype = dtype,
93
+ load_in_4bit = load_in_4bit,
94
+ )
95
+
96
+ FastLanguageModel.for_inference(model) # Enable faster inference
97
+ ```
98
+
99
+ ### Generating SQL Queries
100
+
101
+ ```python
102
+ def generate_sql(schema, question):
103
+     gemma_prompt = """<start_of_turn>user
104
+ You are a powerful text-to-SQL model. Your job is to answer questions about a database. You are given a question and context regarding one or more tables.
105
+
106
+ ### Schema:
107
+ {}
108
+
109
+ ### Question:
110
+ {}<end_of_turn>
111
+ <start_of_turn>model
112
+ """
113
+
114
+     input_prompt = gemma_prompt.format(schema, question)
115
+     inputs = tokenizer([input_prompt], return_tensors="pt").to("cuda")
116
+
117
+     outputs = model.generate(**inputs, max_new_tokens=300, use_cache=True)
118
+     result = tokenizer.batch_decode(outputs)[0]
119
+
120
+     # Extract the generated SQL
121
+     sql_result = result.split("<start_of_turn>model")[-1].replace("<end_of_turn>", "").strip()
122
+     return sql_result
123
+ ```
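
The prompt-building and answer-extraction steps of `generate_sql` can be exercised without a GPU or the model. The sketch below is illustrative only: `build_prompt` and `extract_sql` are hypothetical helper names, the decoded string is simulated by hand, and stripping `<eos>` is an added assumption that is not in the function above.

```python
# Same prompt template as in generate_sql above.
GEMMA_PROMPT = """<start_of_turn>user
You are a powerful text-to-SQL model. Your job is to answer questions about a database. You are given a question and context regarding one or more tables.

### Schema:
{}

### Question:
{}<end_of_turn>
<start_of_turn>model
"""


def build_prompt(schema: str, question: str) -> str:
    """Fill the template; whitespace around the inputs is trimmed (an added convenience)."""
    return GEMMA_PROMPT.format(schema.strip(), question.strip())


def extract_sql(decoded: str) -> str:
    """Keep only the text after the last model turn marker and drop special tokens."""
    answer = decoded.split("<start_of_turn>model")[-1]
    # Assumption: "<eos>" may also appear in the decoded text and should be removed.
    for token in ("<end_of_turn>", "<eos>"):
        answer = answer.replace(token, "")
    return answer.strip()


prompt = build_prompt("CREATE TABLE t (id INT);", "How many rows are in t?")

# Simulated decoder output: the prompt followed by a hand-written model answer.
decoded = prompt + "SELECT COUNT(*) FROM t;<end_of_turn><eos>"
sql = extract_sql(decoded)
print(sql)  # SELECT COUNT(*) FROM t;
```

Splitting on the last `<start_of_turn>model` marker works because the decoded output contains the prompt followed by the generated tokens.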
124
+
125
+ ### Example: Complex Multi-Table Query
126
+
127
+ ```python
128
+ # E-commerce Database Schema
129
+ test_sql_context = """
130
+ CREATE TABLE users (
131
+ user_id INT PRIMARY KEY,
132
+ username TEXT,
133
+ email TEXT
134
+ );
135
+
136
+ CREATE TABLE orders (
137
+ order_id INT PRIMARY KEY,
138
+ user_id INT,
139
+ order_date DATE,
140
+ FOREIGN KEY (user_id) REFERENCES users(user_id)
141
+ );
142
+
143
+ CREATE TABLE products (
144
+ product_id INT PRIMARY KEY,
145
+ product_name TEXT,
146
+ category TEXT,
147
+ price DECIMAL
148
+ );
149
+
150
+ CREATE TABLE order_items (
151
+ item_id INT PRIMARY KEY,
152
+ order_id INT,
153
+ product_id INT,
154
+ quantity INT,
155
+ FOREIGN KEY (order_id) REFERENCES orders(order_id),
156
+ FOREIGN KEY (product_id) REFERENCES products(product_id)
157
+ );
158
+ """
159
+
160
+ # Complex Question
161
+ test_question = """
162
+ List the usernames and emails of users who have spent more than $500 in total on products
163
+ in the 'Electronics' category.
164
+ """
165
+
166
+ # Generate SQL
167
+ sql_query = generate_sql(test_sql_context, test_question)
168
+ print(sql_query)
169
+ ```
170
+
171
+ **Expected Output:**
172
+ ```sql
173
+ SELECT u.username, u.email
174
+ FROM users u
175
+ JOIN orders o ON u.user_id = o.user_id
176
+ JOIN order_items oi ON o.order_id = oi.order_id
177
+ JOIN products p ON oi.product_id = p.product_id
178
+ WHERE p.category = 'Electronics'
179
+ GROUP BY u.user_id, u.username, u.email
180
+ HAVING SUM(oi.quantity * p.price) > 500;
181
+ ```
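
One way to sanity-check a generated query is to run it against the schema with a few rows. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the sample rows are invented for illustration, and SQLite is an assumption here since the README does not name a target database.

```python
import sqlite3

# Schema from the example above (SQLite accepts these type names).
schema = """
CREATE TABLE users (user_id INT PRIMARY KEY, username TEXT, email TEXT);
CREATE TABLE orders (order_id INT PRIMARY KEY, user_id INT, order_date DATE,
    FOREIGN KEY (user_id) REFERENCES users(user_id));
CREATE TABLE products (product_id INT PRIMARY KEY, product_name TEXT, category TEXT, price DECIMAL);
CREATE TABLE order_items (item_id INT PRIMARY KEY, order_id INT, product_id INT, quantity INT,
    FOREIGN KEY (order_id) REFERENCES orders(order_id),
    FOREIGN KEY (product_id) REFERENCES products(product_id));
"""

# The expected output shown above.
expected_sql = """
SELECT u.username, u.email
FROM users u
JOIN orders o ON u.user_id = o.user_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
WHERE p.category = 'Electronics'
GROUP BY u.user_id, u.username, u.email
HAVING SUM(oi.quantity * p.price) > 500;
"""

conn = sqlite3.connect(":memory:")
conn.executescript(schema)

# Hypothetical sample data: alice spends 900 on Electronics, bob spends 450 (plus 600 on Books).
conn.executemany("INSERT INTO users VALUES (?, ?, ?)",
                 [(1, "alice", "alice@example.com"), (2, "bob", "bob@example.com")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(10, 1, "2024-01-05"), (11, 2, "2024-01-06")])
conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?)",
                 [(100, "Laptop", "Electronics", 450.0), (101, "Novel", "Books", 600.0)])
conn.executemany("INSERT INTO order_items VALUES (?, ?, ?, ?)",
                 [(1000, 10, 100, 2), (1001, 11, 100, 1), (1002, 11, 101, 1)])

rows = conn.execute(expected_sql).fetchall()
print(rows)  # [('alice', 'alice@example.com')]
conn.close()
```

Only alice passes the `HAVING` filter, because bob's Books purchase is excluded by the `WHERE` clause before aggregation.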
182
+
183
+ ## Training Details
184
+
185
+ ### Dataset
186
+ - **Source:** gretelai/synthetic_text_to_sql
187
+ - **Size:** 100,000 synthetic text-to-SQL examples
188
+ - **Columns used:**
189
+ - `sql_context`: Database schema
190
+ - `sql_prompt`: Natural language question
191
+ - `sql`: Target SQL query
192
+
193
+ ### Training Process
194
+ The model uses a custom formatting function to structure the training data:
195
+
196
+ ```python
197
+ def formatting_prompts_func(examples):
198
+     schemas = examples["sql_context"]
199
+     questions = examples["sql_prompt"]
200
+     outputs = examples["sql"]
201
+
202
+     texts = []
203
+     for schema, question, output in zip(schemas, questions, outputs):
204
+         text = gemma_prompt.format(schema, question, output) + EOS_TOKEN  # gemma_prompt needs a third {} for the target SQL; EOS_TOKEN is e.g. tokenizer.eos_token
205
+         texts.append(text)
206
+     return { "text" : texts, }
207
+ ```
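
The formatting function passes three values to `gemma_prompt.format`, so the training template needs a third `{}` for the target SQL; the two-placeholder inference prompt shown under Usage would silently drop the answer. The sketch below shows one plausible shape. `TRAIN_PROMPT`, the closing `<end_of_turn>` after the answer, and the literal `"<eos>"` are assumptions for illustration, not the author's confirmed template; in a real run `EOS_TOKEN` would usually be taken from the tokenizer.

```python
# Hypothetical training template: the inference prompt plus a slot for the target SQL.
TRAIN_PROMPT = """<start_of_turn>user
You are a powerful text-to-SQL model. Your job is to answer questions about a database. You are given a question and context regarding one or more tables.

### Schema:
{}

### Question:
{}<end_of_turn>
<start_of_turn>model
{}<end_of_turn>"""

# Assumption: stands in for tokenizer.eos_token so the sketch runs without a tokenizer.
EOS_TOKEN = "<eos>"


def formatting_prompts_func(examples):
    """Turn a batch of dataset columns into a list of full training strings."""
    texts = [
        TRAIN_PROMPT.format(schema, question, output) + EOS_TOKEN
        for schema, question, output in zip(
            examples["sql_context"], examples["sql_prompt"], examples["sql"]
        )
    ]
    return {"text": texts}


# A one-row batch in the column layout described above.
batch = {
    "sql_context": ["CREATE TABLE t (id INT);"],
    "sql_prompt": ["How many rows are in t?"],
    "sql": ["SELECT COUNT(*) FROM t;"],
}
formatted = formatting_prompts_func(batch)
print(formatted["text"][0])
```

Appending the end-of-sequence token teaches the model to stop after the query instead of generating until `max_new_tokens` is reached.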
208
+
209
+ ### Hardware Requirements
210
+ - **GPU:** Single GPU with 8GB+ VRAM
211
+ - **Training Time:** ~30 minutes for 100 steps
212
+ - **Memory Optimization:** 4-bit quantization + 8-bit optimizer
213
+
214
+ ## Performance Characteristics
215
+
216
+ ### Strengths
217
+ - Excellent performance on multi-table JOINs
218
+ - Accurate aggregation and GROUP BY operations
219
+ - Proper handling of foreign key relationships
220
+ - Good understanding of filtering logic (WHERE/HAVING)
221
+
222
+ ### Model Capabilities Test
223
+ The model was tested on a complex 4-table JOIN query requiring:
224
+ 1. **Multi-table JOINs** (users → orders → order_items → products)
225
+ 2. **Category filtering** (WHERE p.category = 'Electronics')
226
+ 3. **User grouping** (GROUP BY user fields)
227
+ 4. **Aggregation** (SUM of price × quantity)
228
+ 5. **Aggregate filtering** (HAVING total > 500)
229
+
230
+ ## Limitations
231
+
232
+ - **Training Scale:** Trained for only 100 steps as a demonstration (about 800 examples at the effective batch size of 8). For production use, increase `max_steps` to 300 or more
233
+ - **Context Length:** Limited to 2048 tokens maximum sequence length
234
+ - **SQL Dialects:** Primarily trained on standard SQL syntax
235
+ - **Complex Subqueries:** May require additional fine-tuning for highly complex nested queries
236
+
237
+ ## Reproduction
238
+
239
+ To reproduce this training:
240
+
241
+ 1. **Clone the notebook:** Use the provided `Fine_tune_qlora.ipynb`
242
+ 2. **Install dependencies:**
243
+ ```bash
244
+ pip install unsloth transformers torch trl datasets
245
+ ```
246
+ 3. **Configure training:** Adjust `max_steps` in TrainingArguments for longer training
247
+ 4. **Run training:** Execute all cells in the notebook
248
+
249
+ ### Production Training Recommendations
250
+ ```python
251
+ # For production use, update these parameters:
252
+ max_steps = 300 # Increase from 100
253
+ warmup_steps = 10 # Increase warmup
254
+ per_device_train_batch_size = 4 # If you have more GPU memory
255
+ ```
256
+
257
+ ## Model Card
258
+
259
+ | Parameter | Value |
260
+ |-----------|--------|
261
+ | Base Model | Gemma-2-2B-it (4-bit quantized, `unsloth/gemma-2-2b-it-bnb-4bit`) |
262
+ | Fine-tuning Method | QLoRA |
263
+ | LoRA Rank | 16 |
264
+ | Training Steps | 100 (demo) |
265
+ | Learning Rate | 2e-4 |
266
+ | Batch Size | 8 (effective) |
267
+ | Max Sequence Length | 2048 |
268
+ | Dataset Size | 100k examples |
269
+
270
+ ## Citation
271
+
272
+ ```bibtex
273
+ @misc{gemma-2-2b-text-to-sql-qlora,
274
+ author = {rajaykumar12959},
275
+ title = {Gemma-2-2B Text-to-SQL QLoRA Fine-tuned Model},
276
+ year = {2024},
277
+ publisher = {Hugging Face},
278
+ howpublished = {\url{https://huggingface.co/rajaykumar12959/gemma-2-2b-text-to-sql-qlora}},
279
+ }
280
+ ```
281
+
282
+ ## Acknowledgments
283
+
284
+ - **Base Model:** Google's Gemma-2-2B via Unsloth optimization
285
+ - **Dataset:** Gretel AI's synthetic text-to-SQL dataset
286
+ - **Framework:** Unsloth for efficient fine-tuning and TRL for training
287
+ - **Method:** QLoRA for parameter-efficient training
288
+
289
+ ## License
290
+
291
+ This model is licensed under Apache 2.0. See the LICENSE file for details.
292
+
293
+ ---
294
+
295
+ *This model is intended for research and educational purposes. Please ensure compliance with your organization's data and AI usage policies when using in production environments.*