---
base_model: unsloth/meta-llama-3.1-8b-bnb-4bit
pipeline_tag: text-generation
tags:
- text-generation
- sql-generation
- llama
- lora
- peft
- unsloth
- transformers
license: apache-2.0
language:
- en
---
# SQL-Genie (LLaMA-3.1-8B Fine-Tuned)
## 🧠 Model Overview
**SQL-Genie** is a fine-tuned version of **LLaMA-3.1-8B**, specialized for converting **natural language questions into SQL queries**.
The model was trained with **parameter-efficient fine-tuning (LoRA)** on a structured SQL instruction dataset, yielding strong SQL generation performance while keeping training cheap enough to run on limited compute (e.g., a free Google Colab GPU).
- **Developed by:** dhashu
- **Base model:** `unsloth/meta-llama-3.1-8b-bnb-4bit`
- **License:** Apache-2.0
- **Training stack:** Unsloth + Hugging Face TRL
---
## ⚙️ Training Methodology
This model was trained using **LoRA (Low-Rank Adaptation)** via the **PEFT** framework.
### Key Details
- Base model loaded in **4-bit quantization** for memory efficiency
- **Base weights frozen**
- **LoRA adapters** applied to:
- Attention layers (`q_proj`, `k_proj`, `v_proj`, `o_proj`)
- Feed-forward layers (`gate_proj`, `up_proj`, `down_proj`)
- Fine-tuned using **Supervised Fine-Tuning (SFT)**
This approach allows efficient specialization without full model retraining.
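As an illustrative sketch (not the exact configuration used for this model, which is not documented in this card), the adapter setup described above maps directly onto PEFT's `LoraConfig`; the rank, alpha, and dropout values below are placeholders:

```python
from peft import LoraConfig

# Illustrative hyperparameters only -- the actual r / lora_alpha used
# for SQL-Genie are not stated in this model card.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
    # Adapters on the attention and feed-forward projections listed above;
    # all other (base) weights remain frozen.
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
# With Unsloth, an equivalent config is typically passed via
# FastLanguageModel.get_peft_model(...) instead of plain PEFT.
```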
---
## 📊 Dataset
The model was trained on a subset of the **`b-mc2/sql-create-context`** dataset, which includes:
- Natural language questions
- Database schema / context
- Corresponding SQL queries
Each sample was formatted as an **instruction-style prompt** to improve reasoning and structured output.
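A minimal formatting helper of this kind might look like the following sketch. The field names (`question`, `context`, `sql`) follow the dataset's columns, and the template mirrors the prompt shown in the inference section; the exact training template is an assumption:

```python
def format_example(question: str, context: str, sql: str) -> str:
    """Format one dataset row as an instruction-style training prompt.

    Template mirrors the prompt used in the inference example; the exact
    template used during training is assumed, not documented here.
    """
    return (
        "Below is an input question, context is given to help. "
        "Generate a SQL response.\n"
        f"### Input: {question}\n"
        f"### Context: {context}\n"
        f"### SQL Response:\n{sql}"
    )

sample = format_example(
    "List all employees hired after 2020",
    "CREATE TABLE employees(id, name, hire_date)",
    "SELECT name FROM employees WHERE hire_date > '2020-12-31'",
)
print(sample)
```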
---
## 🚀 Performance & Efficiency
- 🚀 **Up to 2× faster fine-tuning** using Unsloth (per Unsloth's published benchmarks)
- 💾 **Low VRAM usage** via 4-bit quantization
- 🧠 Improved SQL syntax and schema understanding
- ⚡ Suitable for real-time inference and lightweight deployments
---
## 🧩 Model Variants
This repository contains a **merged model**:
### 🔹 Merged 4-bit Model
- LoRA adapters merged into base weights
- No PEFT required at inference time
- Ready-to-use single checkpoint
- Optimized for easy deployment
---
## ▶️ How to Use (Inference)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_id = "dhashu/sql-genie-full"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    # 4-bit loading requires the bitsandbytes package
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)

prompt = """Below is an input question, context is given to help. Generate a SQL response.
### Input: List all employees hired after 2020
### Context: CREATE TABLE employees(id, name, hire_date)
### SQL Response:
"""

# Send inputs to wherever device_map placed the model
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,   # temperature only applies when sampling is enabled
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```