Instructions to use shreyash-pandey-katni/SQLForge-Mistral-7B-QLoRA with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use shreyash-pandey-katni/SQLForge-Mistral-7B-QLoRA with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.3") model = PeftModel.from_pretrained(base_model, "shreyash-pandey-katni/SQLForge-Mistral-7B-QLoRA") - Notebooks
- Google Colab
- Kaggle
SQLForge-Mistral-7B-QLoRA
SQLForge is a QLoRA adapter for mistralai/Mistral-7B-v0.3
that forges natural-language questions into executable SQL. Trained on a mixture of
Spider and b-mc2/sql-create-context (~90k examples), it brings exact-match
accuracy on an internal text-to-SQL test split from 9.2% → 87.0% over the base model,
while keeping the adapter footprint under 340 MB.
Results
Evaluated on a held-out internal text-to-SQL test set (500 examples, schema-aware prompt):
| Model | Exact Set Match | Valid SQL |
|---|---|---|
mistralai/Mistral-7B-v0.3 (base) |
9.2% | 86.2% |
| SQLForge-Mistral-7B-QLoRA | 87.0% | 97.4% |
A 9.5× improvement in exact-match accuracy and a +11.2 pp lift in syntactic validity, delivered by a 336 MB LoRA adapter.
Usage
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
BASE = "mistralai/Mistral-7B-v0.3"
ADAPT = "shreyash-pandey-katni/SQLForge-Mistral-7B-QLoRA"
tok = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(
BASE,
torch_dtype=torch.bfloat16,
device_map="auto",
)
model = PeftModel.from_pretrained(model, ADAPT)
model.eval()
schema = "CREATE TABLE employees (id INT, name TEXT, dept TEXT, salary REAL);"
question = "List the names of employees in the Engineering department earning over 100000."
prompt = f"""[INST] You are an expert SQL assistant. Given the schema and question, write a single SQL query.
Schema:
{schema}
Question: {question} [/INST]"""
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
4-bit inference (single 12 GB GPU)
from transformers import BitsAndBytesConfig
bnb = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto")
model = PeftModel.from_pretrained(model, ADAPT)
Training Details
- Base model:
mistralai/Mistral-7B-v0.3 - Method: QLoRA (4-bit NF4 base, BF16 LoRA adapters) via
trl.SFTTrainer - Datasets:
spider— cross-domain text-to-SQLb-mc2/sql-create-context— schema-grounded SQL- Combined: 90,719 train / 3,929 val rows
- Prompt format:
[INST] … [/INST](Mistral instruction template). Loss is masked to apply only on tokens after[/INST](the SQL response). - Max sequence length: 512 tokens
LoRA configuration
| Setting | Value |
|---|---|
Rank r |
64 |
| Alpha | 16 (scale = α/r = 0.25, conservative — preserves reasoning) |
| Dropout | 0.05 |
| RSLoRA | enabled (1/√r scaling for r=64 stability) |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Bias | none |
Training hyperparameters
| Setting | Value |
|---|---|
| Epochs | 3 |
| Effective batch size | 8 (1 × 8 grad-accum) |
| Learning rate | 2e-4 |
| LR scheduler | cosine |
| Warmup ratio | 0.03 |
| Precision | BF16 |
| Gradient checkpointing | enabled |
| Optimizer | AdamW (transformers default) |
| Checkpoint | checkpoint-34020 (final) |
Hardware
- GPU: 1× NVIDIA RTX 3080 Ti (12 GB VRAM)
- Peak VRAM: ~8.25 GB (4-bit NF4 base + BF16 adapters + grad-ckpt)
- Training time: ~4–8 hours
Limitations
- Schema sensitivity: trained with explicit
CREATE TABLEschemas in the prompt; performance degrades without schema context. - Single-query bias: optimized for single SELECT statements; multi-statement scripts, DDL, and procedural SQL are out of scope.
- Dialect: training data is primarily SQLite/ANSI; vendor-specific syntax (T-SQL, PL/pgSQL, BigQuery functions) may not generalize.
- English-only: prompts and questions are English; non-English NL queries untested.
- No execution safety: generated SQL should be sandboxed/parameterized before running against production databases.
Evaluation
The metrics above use exact_set_match (gold and predicted normalized SQL match exactly
as token sets) and valid_sql_pct (predicted SQL parses with sqlparse/sqlite3).
See evaluate_sql.py in the training repository
for the full eval harness.
Files
| File | Purpose |
|---|---|
adapter_model.safetensors |
LoRA weights (~336 MB) |
adapter_config.json |
PEFT/LoRA configuration |
tokenizer.json / tokenizer_config.json |
Mistral tokenizer |
optimizer.pt, scheduler.pt, rng_state.pth, training_args.bin, trainer_state.json |
Training-state snapshot (resume-capable) |
License
Released under Apache 2.0, the same license as the base model
mistralai/Mistral-7B-v0.3.
Citation
@misc{sqlforge_mistral7b_qlora_2026,
title = {SQLForge-Mistral-7B-QLoRA: A QLoRA Text-to-SQL Adapter for Mistral-7B-v0.3},
author = {Shreyash Pandey Katni},
year = {2026},
url = {https://huggingface.co/shreyash-pandey-katni/SQLForge-Mistral-7B-QLoRA}
}
- Downloads last month
- -
Model tree for shreyash-pandey-katni/SQLForge-Mistral-7B-QLoRA
Base model
mistralai/Mistral-7B-v0.3