---
tags:
- transformers
- unsloth
- gemma2
- text-to-sql
- qlora
- sql-generation
license: apache-2.0
language:
- en
datasets:
- gretelai/synthetic_text_to_sql
pipeline_tag: text-generation
---

# Gemma-2-2B Text-to-SQL QLoRA Fine-tuned Model

- **Developed by:** rajaykumar12959
- **License:** apache-2.0
- **Finetuned from model:** unsloth/gemma-2-2b-it-bnb-4bit
- **Dataset:** gretelai/synthetic_text_to_sql
- **Task:** Text-to-SQL Generation
- **Fine-tuning Method:** QLoRA (Quantized Low-Rank Adaptation)

This gemma2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

## Model Description

This model is fine-tuned to generate SQL queries from natural language questions and database schemas. It handles complex multi-table queries requiring JOINs, aggregations, filtering, and other advanced SQL operations.

### Key Features

- ✅ **Multi-table JOINs** (INNER, LEFT, RIGHT)
- ✅ **Aggregation functions** (SUM, COUNT, AVG, MIN, MAX)
- ✅ **GROUP BY and HAVING clauses**
- ✅ **Complex WHERE conditions**
- ✅ **Subqueries and CTEs**
- ✅ **Date/time operations**
- ✅ **String functions and pattern matching**

## Training Configuration

The model was fine-tuned using QLoRA with the following configuration:

```python
# LoRA Configuration
r = 16                            # Rank: 16 is a good balance for 2B models
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
lora_alpha = 16
lora_dropout = 0
bias = "none"
use_gradient_checkpointing = "unsloth"

# Training Parameters
max_seq_length = 2048
per_device_train_batch_size = 2
gradient_accumulation_steps = 4   # Effective batch size = 8
warmup_steps = 5
max_steps = 100                   # Demo configuration - increase to 300+ for production
learning_rate = 2e-4
optim = "adamw_8bit"              # 8-bit optimizer for memory efficiency
weight_decay = 0.01
lr_scheduler_type = "linear"
```
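
For context, the sketch below shows how these values would typically be wired into Unsloth and TRL. It is a minimal reconstruction under the standard Unsloth + TRL pattern, not the exact training script; `dataset` is assumed to be the formatted dataset described under Training Process below.

```python
# Sketch: wiring the configuration above into Unsloth + TRL.
# Assumes `dataset` is the formatted dataset from "Training Process" below.
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the 4-bit base model.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/gemma-2-2b-it-bnb-4bit",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Attach the LoRA adapters with the configuration listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
)

# Train with TRL's SFTTrainer on the formatted "text" field.
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 100,
        learning_rate = 2e-4,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        output_dir = "outputs",
    ),
)
trainer.train()
```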

## Installation

```bash
pip install unsloth transformers torch trl datasets
```

## Usage

### Loading the Model

```python
from unsloth import FastLanguageModel
import torch

max_seq_length = 2048
dtype = None
load_in_4bit = True

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "rajaykumar12959/gemma-2-2b-text-to-sql-qlora",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

FastLanguageModel.for_inference(model)  # Enable faster inference
```

### Generating SQL Queries

```python
def generate_sql(schema, question):
    gemma_prompt = """<start_of_turn>user
You are a powerful text-to-SQL model. Your job is to answer questions about a database. You are given a question and context regarding one or more tables.

### Schema:
{}

### Question:
{}<end_of_turn>
<start_of_turn>model
"""

    input_prompt = gemma_prompt.format(schema, question)
    inputs = tokenizer([input_prompt], return_tensors="pt").to("cuda")

    outputs = model.generate(**inputs, max_new_tokens=300, use_cache=True)
    result = tokenizer.batch_decode(outputs)[0]

    # Extract the generated SQL from the model turn
    sql_result = result.split("<start_of_turn>model")[-1].replace("<end_of_turn>", "").strip()
    return sql_result
```

### Example: Complex Multi-Table Query

```python
# E-commerce Database Schema
test_sql_context = """
CREATE TABLE users (
    user_id INT PRIMARY KEY,
    username TEXT,
    email TEXT
);

CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    user_id INT,
    order_date DATE,
    FOREIGN KEY (user_id) REFERENCES users(user_id)
);

CREATE TABLE products (
    product_id INT PRIMARY KEY,
    product_name TEXT,
    category TEXT,
    price DECIMAL
);

CREATE TABLE order_items (
    item_id INT PRIMARY KEY,
    order_id INT,
    product_id INT,
    quantity INT,
    FOREIGN KEY (order_id) REFERENCES orders(order_id),
    FOREIGN KEY (product_id) REFERENCES products(product_id)
);
"""

# Complex Question
test_question = """
List the usernames and emails of users who have spent more than $500 in total
on products in the 'Electronics' category.
"""

# Generate SQL
sql_query = generate_sql(test_sql_context, test_question)
print(sql_query)
```

**Expected Output:**
```sql
SELECT u.username, u.email
FROM users u
JOIN orders o ON u.user_id = o.user_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
WHERE p.category = 'Electronics'
GROUP BY u.user_id, u.username, u.email
HAVING SUM(oi.quantity * p.price) > 500;
```
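
One way to sanity-check a generated query is to execute it against an in-memory SQLite database built from the same schema. A minimal sketch, using hypothetical sample rows (a single user with $600 of Electronics purchases):

```python
import sqlite3

# Build the example schema in an in-memory database.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript(test_sql_context)   # the schema string defined above

# Hypothetical sample data: one user who spent $600 on Electronics.
cur.execute("INSERT INTO users VALUES (1, 'alice', 'alice@example.com')")
cur.execute("INSERT INTO products VALUES (10, 'Laptop', 'Electronics', 600.0)")
cur.execute("INSERT INTO orders VALUES (100, 1, '2024-01-15')")
cur.execute("INSERT INTO order_items VALUES (1000, 100, 10, 1)")

# Run the model-generated query and inspect the result.
cur.execute(sql_query)
print(cur.fetchall())   # expected: [('alice', 'alice@example.com')]
```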

## Training Details

### Dataset
- **Source:** gretelai/synthetic_text_to_sql
- **Size:** 100,000 synthetic text-to-SQL examples
- **Columns used:**
  - `sql_context`: Database schema
  - `sql_prompt`: Natural language question
  - `sql`: Target SQL query

### Training Process
The model uses a custom formatting function to structure the training data:

```python
# `gemma_prompt` here is the training-time template: the prompt shown above
# with a third `{}` placeholder after <start_of_turn>model for the target SQL;
# EOS_TOKEN = tokenizer.eos_token.
def formatting_prompts_func(examples):
    schemas = examples["sql_context"]
    questions = examples["sql_prompt"]
    outputs = examples["sql"]

    texts = []
    for schema, question, output in zip(schemas, questions, outputs):
        text = gemma_prompt.format(schema, question, output) + EOS_TOKEN
        texts.append(text)
    return {"text": texts}
```
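
The function takes batched columns, so it is applied with the 🤗 Datasets `map` API. A minimal sketch of applying it to the training split:

```python
from datasets import load_dataset

# Load the Gretel text-to-SQL dataset and apply the formatting function.
dataset = load_dataset("gretelai/synthetic_text_to_sql", split="train")
dataset = dataset.map(formatting_prompts_func, batched=True)

print(dataset[0]["text"][:200])   # inspect one formatted training example
```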

### Hardware Requirements
- **GPU:** Single GPU with 8GB+ VRAM
- **Training Time:** ~30 minutes for 100 steps
- **Memory Optimization:** 4-bit quantization + 8-bit optimizer

## Performance Characteristics

### Strengths
- Excellent performance on multi-table JOINs
- Accurate aggregation and GROUP BY operations
- Proper handling of foreign key relationships
- Good understanding of filtering logic (WHERE/HAVING)

### Model Capabilities Test
The model was tested on a complex 4-table JOIN query (the example above) requiring:
1. **Multi-table JOINs** (users → orders → order_items → products)
2. **Category filtering** (WHERE p.category = 'Electronics')
3. **User grouping** (GROUP BY user fields)
4. **Aggregation** (SUM of price × quantity)
5. **Aggregate filtering** (HAVING total > 500)

## Limitations

- **Training Scale:** Trained for only 100 steps as a demonstration. For production use, increase `max_steps` to 300+.
- **Context Length:** Limited to a 2048-token maximum sequence length.
- **SQL Dialects:** Primarily trained on standard SQL syntax.
- **Complex Subqueries:** May require additional fine-tuning for highly complex nested queries.

## Reproduction

To reproduce this training:

1. **Clone the notebook:** Use the provided `Fine_tune_qlora.ipynb`
2. **Install dependencies:**
   ```bash
   pip install unsloth transformers torch trl datasets
   ```
3. **Configure training:** Adjust `max_steps` in `TrainingArguments` for longer training
4. **Run training:** Execute all cells in the notebook

### Production Training Recommendations
```python
# For production use, update these TrainingArguments
# (keep the remaining arguments from the demo configuration):
max_steps = 300                   # Increase from 100
warmup_steps = 10                 # Increase warmup
per_device_train_batch_size = 4   # If you have more GPU memory
```

## Model Card

| Parameter | Value |
|-----------|-------|
| Base Model | Gemma-2-2B (4-bit quantized) |
| Fine-tuning Method | QLoRA |
| LoRA Rank | 16 |
| Training Steps | 100 (demo) |
| Learning Rate | 2e-4 |
| Batch Size | 8 (effective) |
| Max Sequence Length | 2048 |
| Dataset Size | 100k examples |

## Citation

```bibtex
@misc{gemma-2-2b-text-to-sql-qlora,
  author = {rajaykumar12959},
  title = {Gemma-2-2B Text-to-SQL QLoRA Fine-tuned Model},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/rajaykumar12959/gemma-2-2b-text-to-sql-qlora}},
}
```

## Acknowledgments

- **Base Model:** Google's Gemma-2-2B via Unsloth optimization
- **Dataset:** Gretel AI's synthetic text-to-SQL dataset
- **Framework:** Unsloth for efficient fine-tuning and TRL for training
- **Method:** QLoRA for parameter-efficient training

## License

This model is licensed under Apache 2.0. See the LICENSE file for details.

---

*This model is intended for research and educational purposes. Please ensure compliance with your organization's data and AI usage policies when using it in production environments.*