---
library_name: transformers
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
tags:
- text-to-sql
- sql-generation
- lora
- qlora
- wikisql
- tinyllama
- transformers
license: apache-2.0
---

# SQL Assistant – TinyLlama Fine-Tuned on WikiSQL (QLoRA)

## Model Overview

This model is a schema-aware text-to-SQL generator built by fine-tuning TinyLlama-1.1B with QLoRA on the WikiSQL dataset. Given a database schema and a natural language question, it produces a structured SQL query.

The model has been adapted specifically for:

- SQL generation
- Schema-conditioned reasoning
- Structured query formatting
- Reduced hallucination compared to the base model

## Model Details

- **Developed by:** Ruben S
- **Model type:** Causal language model (LoRA fine-tuned adapter)
- **Base model:** TinyLlama-1.1B-Chat-v1.0
- **Task:** Text-to-SQL generation
- **Language:** English
- **Training dataset:** WikiSQL (10,000-sample subset)
- **Fine-tuning method:** QLoRA (4-bit quantization + LoRA adapters)
- **Epochs:** 3
- **Final training loss:** 0.52
- **Hardware:** Google Colab T4 GPU
- **License:** Apache 2.0 (inherited from the base model)

This repository contains only the LoRA adapter weights. The base model must be loaded separately.

## Intended Use

### Direct Use

This model is intended for:

- Converting natural language questions into SQL
- Educational and research use
- SQL assistant systems
- Demonstrations of parameter-efficient fine-tuning

Example use cases:

- "Find employees with salary greater than 50000"
- "What is the average price of products in the Electronics category?"

### Downstream Use

The model can be integrated into:

- Database query assistants
- Data analytics dashboards
- Backend services that translate user questions into SQL
- Chat-based data exploration tools

### Out-of-Scope Use

This model is not suitable for:

- Production-grade database security systems
- Financial or safety-critical systems
- Complex multi-table join reasoning (it was not trained on Spider)
- SQL injection protection

It was trained only on single-table, WikiSQL-style queries.

## Training Details

### Training Data

- Dataset: WikiSQL (publicly available)
- 10,000 training samples
- 1,000 validation samples
- Single-table SQL queries
- Aggregations: MAX, MIN, COUNT, SUM, AVG
- WHERE clause conditions

SQL queries were reconstructed from WikiSQL's parsed format into full SQL strings before training (a minimal sketch of this conversion is shown after the hyperparameters below).

### Training Procedure

- Base model: TinyLlama-1.1B-Chat-v1.0
- Quantization: 4-bit (bitsandbytes)
- Fine-tuning: LoRA (parameter-efficient fine-tuning); see the configuration sketch below
- Trainable parameters: ~1% of total model parameters
- Objective: causal language modeling (next-token prediction)
- Labels set equal to `input_ids`

### Training Hyperparameters

- Epochs: 3
- Batch size: 4
- Gradient accumulation steps: 4
- Learning rate: 1e-4
- Precision: FP16
- Optimizer: AdamW (default Trainer optimizer)
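The exact preprocessing script is not included in this repository. The snippet below is a minimal sketch of how WikiSQL's parsed query representation (column index, aggregation index, and condition triples) is commonly rebuilt into a full SQL string; the function name `wikisql_to_sql` is illustrative, and the field layout follows the original WikiSQL `sql` dictionary (`sel`, `agg`, `conds`), which some dataset loaders expose in a slightly different shape.

```python
# Sketch only: rebuild a full SQL string from WikiSQL's parsed representation.
# Field names follow the original WikiSQL "sql" dictionary (sel / agg / conds);
# some loaders expose "conds" as parallel lists instead of [col, op, value] triples.

AGG_OPS = ["", "MAX", "MIN", "COUNT", "SUM", "AVG"]
COND_OPS = ["=", ">", "<", "OP"]

def wikisql_to_sql(example):
    headers = example["table"]["header"]  # column names of the single table
    parsed = example["sql"]

    # SELECT clause: optional aggregation over the selected column
    column = headers[parsed["sel"]]
    agg = AGG_OPS[parsed["agg"]]
    select = f"{agg}({column})" if agg else column

    # WHERE clause: one condition per (column, operator, value) triple
    conditions = []
    for col_idx, op_idx, value in parsed["conds"]:
        if isinstance(value, str):
            value = f"'{value}'"
        conditions.append(f"{headers[col_idx]} {COND_OPS[op_idx]} {value}")

    query = f"SELECT {select} FROM table"
    if conditions:
        query += " WHERE " + " AND ".join(conditions)
    return query
```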
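The training script itself is not part of this repository. The following is a minimal sketch of a QLoRA setup matching the reported configuration (4-bit bitsandbytes quantization, LoRA adapters via `peft`, and the Hugging Face `Trainer`). The LoRA rank, alpha, target modules, dropout, and the `nf4` quantization type are assumptions for illustration, not values taken from the original run, and `train_dataset` is a placeholder.

```python
# Minimal QLoRA sketch matching the reported hyperparameters.
# LoRA rank/alpha/target_modules and the quant type are assumptions, not the original values.
import torch
from transformers import (AutoModelForCausalLM, BitsAndBytesConfig,
                          Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",          # assumed
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,                                # assumed rank
    lora_alpha=32,                       # assumed scaling
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    lora_dropout=0.05,                   # assumed
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # roughly ~1% of parameters trainable

args = TrainingArguments(
    output_dir="sql-assistant-tinyllama",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=1e-4,
    fp16=True,
    logging_steps=50,
)

# train_dataset must yield tokenized prompts with labels equal to input_ids:
# trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
# trainer.train()
```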
## Evaluation

### Qualitative Evaluation

The fine-tuned model was compared against the base TinyLlama model. Improvements observed:

- Removal of chat-style explanations
- No markdown formatting in the output
- Fewer hallucinated table names
- Improved aggregation selection (AVG, COUNT, etc.)
- Better multi-condition WHERE clauses

Example:

Input: Find the average price of products in the Electronics category.

Output: `SELECT AVG(price) FROM table WHERE category = 'Electronics'`

## Limitations

- Trained only on WikiSQL (single-table queries)
- Limited support for JOIN operations
- Numeric formatting inconsistencies may occur (e.g., quoting numbers)
- Sensitive to the formatting of the schema block in the prompt

## How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model in half precision
base_model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the LoRA adapter from this repository
model = PeftModel.from_pretrained(
    base_model,
    "YOUR_USERNAME/sql-assistant-tinyllama-wikisql-qlora",
)

tokenizer = AutoTokenizer.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
)

prompt = """### Instruction:
Convert natural language to SQL using the given schema.

### Schema:
Table columns: product, price, category, rating

### Question:
Find the average price of products in Electronics category.

### SQL:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=80,
    temperature=0.1,
    do_sample=True,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## Environmental Impact

- Hardware: NVIDIA T4 GPU
- Training time: ~2 hours total
- Cloud provider: Google Colab
- Precision: FP16 + 4-bit quantization

Parameter-efficient fine-tuning significantly reduces compute and memory usage compared to full fine-tuning.

## Future Work

- Extend training to the Spider dataset (multi-table joins)
- Add execution-based evaluation
- Improve numeric formatting consistency
- Add schema-aware table naming

## Citation

If you use this model, please cite:

- TinyLlama-1.1B-Chat-v1.0
- WikiSQL dataset

Contact: rubansendhur78409@cit.edu.in