---
library_name: transformers
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
tags:
- text-to-sql
- sql-generation
- lora
- qlora
- wikisql
- tinyllama
- transformers
license: apache-2.0
---

# SQL Assistant – TinyLlama Fine-Tuned on WikiSQL (QLoRA)

## Model Overview

This model is a schema-aware text-to-SQL generator built by fine-tuning TinyLlama-1.1B with QLoRA on the WikiSQL dataset. Given a database schema and a natural language question, it produces a structured SQL query.

The model has been adapted specifically for:

- SQL generation
- Schema-conditioned reasoning
- Structured query formatting
- Reduced hallucination compared to the base model

## Model Details

- **Developed by:** Ruben S
- **Model type:** Causal language model (LoRA fine-tuned adapter)
- **Base model:** TinyLlama-1.1B-Chat-v1.0
- **Task:** Text-to-SQL generation
- **Language:** English
- **Training dataset:** WikiSQL (10,000-sample subset)
- **Fine-tuning method:** QLoRA (4-bit quantization + LoRA adapters)
- **Epochs:** 3
- **Final training loss:** 0.52
- **Hardware:** Google Colab T4 GPU
- **License:** Apache 2.0 (inherited from the base model)

This repository contains only the LoRA adapter weights. The base model must be loaded separately.

## Intended Use

### Direct Use

This model is intended for:

- Converting natural language questions into SQL
- Educational and research use
- SQL assistant systems
- Demonstrations of parameter-efficient fine-tuning

Example use cases:

- "Find employees with salary greater than 50000"
- "What is the average price of products in the Electronics category?"

### Downstream Use

The model can be integrated into:

- Database query assistants
- Data analytics dashboards
- Backend services that translate user questions into SQL
- Chat-based data exploration tools

### Out-of-Scope Use

This model is not suitable for:

- Production-grade database security systems
- Financial or safety-critical systems
- Complex multi-table join reasoning (it was not trained on Spider)
- SQL injection protection

It was trained only on single-table, WikiSQL-style queries.

## Training Details

### Training Data

- Dataset: WikiSQL (publicly available)
- 10,000 training samples
- 1,000 validation samples
- Single-table SQL queries
- Aggregations: MAX, MIN, COUNT, SUM, AVG
- WHERE clause conditions

SQL queries were reconstructed from WikiSQL's parsed format into full SQL strings before training (a minimal sketch of this conversion is shown after the hyperparameters below).

### Training Procedure

- Base model: TinyLlama-1.1B-Chat-v1.0
- Quantization: 4-bit (bitsandbytes)
- Fine-tuning: LoRA (parameter-efficient fine-tuning); see the configuration sketch below
- Trainable parameters: ~1% of total model parameters
- Objective: causal language modeling (next-token prediction)
- Labels set equal to `input_ids`

### Training Hyperparameters

- Epochs: 3
- Batch size: 4
- Gradient accumulation steps: 4
- Learning rate: 1e-4
- Precision: FP16
- Optimizer: AdamW (default Trainer optimizer)
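The exact preprocessing script is not included in this repository. The snippet below is a minimal sketch of how WikiSQL's parsed query representation (column index, aggregation index, and condition triples) is commonly rebuilt into a full SQL string; the function name `wikisql_to_sql` is illustrative, and the field layout follows the original WikiSQL `sql` dictionary (`sel`, `agg`, `conds`), which some dataset loaders expose in a slightly different shape.

```python
# Sketch only: rebuild a full SQL string from WikiSQL's parsed representation.
# Field names follow the original WikiSQL "sql" dictionary (sel / agg / conds);
# some loaders expose "conds" as parallel lists instead of [col, op, value] triples.

AGG_OPS = ["", "MAX", "MIN", "COUNT", "SUM", "AVG"]
COND_OPS = ["=", ">", "<", "OP"]

def wikisql_to_sql(example):
    headers = example["table"]["header"]  # column names of the single table
    parsed = example["sql"]

    # SELECT clause: optional aggregation over the selected column
    column = headers[parsed["sel"]]
    agg = AGG_OPS[parsed["agg"]]
    select = f"{agg}({column})" if agg else column

    # WHERE clause: one condition per (column, operator, value) triple
    conditions = []
    for col_idx, op_idx, value in parsed["conds"]:
        if isinstance(value, str):
            value = f"'{value}'"
        conditions.append(f"{headers[col_idx]} {COND_OPS[op_idx]} {value}")

    query = f"SELECT {select} FROM table"
    if conditions:
        query += " WHERE " + " AND ".join(conditions)
    return query
```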
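The training script itself is not part of this repository. The following is a minimal sketch of a QLoRA setup matching the reported configuration (4-bit bitsandbytes quantization, LoRA adapters via `peft`, and the Hugging Face `Trainer`). The LoRA rank, alpha, target modules, dropout, and the `nf4` quantization type are assumptions for illustration, not values taken from the original run, and `train_dataset` is a placeholder.

```python
# Minimal QLoRA sketch matching the reported hyperparameters.
# LoRA rank/alpha/target_modules and the quant type are assumptions, not the original values.
import torch
from transformers import (AutoModelForCausalLM, BitsAndBytesConfig,
                          Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",          # assumed
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,                                # assumed rank
    lora_alpha=32,                       # assumed scaling
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    lora_dropout=0.05,                   # assumed
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # roughly ~1% of parameters trainable

args = TrainingArguments(
    output_dir="sql-assistant-tinyllama",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=1e-4,
    fp16=True,
    logging_steps=50,
)

# train_dataset must yield tokenized prompts with labels equal to input_ids:
# trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
# trainer.train()
```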
## Evaluation

### Qualitative Evaluation

The fine-tuned model was compared against the base TinyLlama model. Improvements observed:

- Removal of chat-style explanations
- No markdown formatting in the output
- Fewer hallucinated table names
- Improved aggregation selection (AVG, COUNT, etc.)
- Better multi-condition WHERE clauses

Example:

Input: Find the average price of products in the Electronics category.

Output: `SELECT AVG(price) FROM table WHERE category = 'Electronics'`

## Limitations

- Trained only on WikiSQL (single-table queries)
- Limited support for JOIN operations
- Numeric formatting inconsistencies may occur (e.g., quoting numbers)
- Sensitive to the formatting of the schema block in the prompt

## How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model in half precision
base_model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the LoRA adapter from this repository
model = PeftModel.from_pretrained(
    base_model,
    "YOUR_USERNAME/sql-assistant-tinyllama-wikisql-qlora",
)

tokenizer = AutoTokenizer.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
)

prompt = """### Instruction:
Convert natural language to SQL using the given schema.

### Schema:
Table columns: product, price, category, rating

### Question:
Find the average price of products in Electronics category.

### SQL:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=80,
    temperature=0.1,
    do_sample=True,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## Environmental Impact

- Hardware: NVIDIA T4 GPU
- Training time: ~2 hours total
- Cloud provider: Google Colab
- Precision: FP16 + 4-bit quantization

Parameter-efficient fine-tuning significantly reduces compute and memory usage compared to full fine-tuning.

## Future Work

- Extend training to the Spider dataset (multi-table joins)
- Add execution-based evaluation
- Improve numeric formatting consistency
- Add schema-aware table naming

## Citation

If you use this model, please cite:

- TinyLlama-1.1B-Chat-v1.0
- WikiSQL dataset

Contact: rubansendhur78409@cit.edu.in