---
base_model: microsoft/Phi-3-mini-4k-instruct
library_name: transformers
license: mit
language:
- en
datasets:
- b-mc2/sql-create-context
tags:
- sql
- text-to-sql
- code-generation
- phi-3
- fine-tuned
- text-generation
- phi3
pipeline_tag: text-generation
---

# Phi-3 Mini SQL Generator: Merged Model

Merged standalone version of [Shizu0n/phi3-mini-sql-generator](https://huggingface.co/Shizu0n/phi3-mini-sql-generator): the LoRA adapter weights are fused into [Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct), so inference does not require PEFT.

## Evaluation: Base vs Fine-tuned

Evaluated on 200 held-out examples from [b-mc2/sql-create-context](https://huggingface.co/datasets/b-mc2/sql-create-context).

| Model | Exact Match |
|---|---|
| Phi-3-mini-4k-instruct (base) | 2.0% |
| **This model (fine-tuned)** | **73.5%** |

> Exact match: normalized SQL comparison (lowercase, strip whitespace/semicolons).

## Why two versions?

| Repo | Purpose |
|---|---|
| [`Shizu0n/phi3-mini-sql-generator`](https://huggingface.co/Shizu0n/phi3-mini-sql-generator) | QLoRA adapter; documents the training pipeline |
| `Shizu0n/phi3-mini-sql-generator-merged` | Merged standalone model; used for deployment and inference |

## Training Details

- **Dataset:** b-mc2/sql-create-context (1,000 train / 200 validation examples)
- **Method:** QLoRA (4-bit NF4, LoRA rank 16, alpha 32, target modules: `qkv_proj`, `o_proj`, `gate_up_proj`, `down_proj`)
- **Hardware:** NVIDIA T4 (Google Colab free tier)
- **Training time:** ~21 min
- **Final train loss:** 0.6526
- **Best checkpoint:** step 250 (selected by eval loss)

## Inference Example

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Shizu0n/phi3-mini-sql-generator-merged"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=False,
    attn_implementation="eager",
)
model.eval()

# Prompt layout: instruction, table schema, question, then the "SQL:" cue.
prompt = (
    "Given the following SQL table, write a SQL query.\n\n"
    "Table: employees (id, name, department, salary)\n\n"
    "Question: What is the average salary per department?\n\nSQL:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.inference_mode():
    outputs = model.generate(
        **inputs,
        max_new_tokens=80,
        do_sample=False,  # greedy decoding
        use_cache=False,
        repetition_penalty=1.1,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, skipping the echoed prompt.
prompt_len = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))
```

Expected output:

```sql
SELECT AVG(salary), department FROM employees GROUP BY department
```

## Validation

The merge was accepted after smoke tests passed in three configurations:

1. PEFT adapter loaded on the base model
2. Local merged directory, after `merge_and_unload()` + `save_pretrained()`
3. Model downloaded from this repo with `force_download=True`

## Limitations

- Fine-tuned on 1,000 examples, so it is best suited to simple and medium-complexity SELECT queries
- Not tested on dialect-specific SQL (PostgreSQL/MySQL-specific functions)
- May struggle with multi-table JOINs and nested subqueries
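
## Exact Match Sketch

The Evaluation section describes exact match only as a normalized SQL comparison (lowercase, strip whitespace/semicolons). The snippet below is a minimal sketch of one way such a metric could be computed, not the evaluation script used for the reported numbers. The function names `normalize_sql` and `exact_match` are hypothetical, and collapsing internal whitespace runs to a single space is an assumption about what "strip whitespace" means here.

```python
import re


def normalize_sql(sql: str) -> str:
    """Lowercase, collapse whitespace runs, and drop trailing semicolons.

    Assumption: internal whitespace is collapsed to one space; the model
    card only says "strip whitespace/semicolons".
    """
    sql = sql.strip().lower()
    sql = re.sub(r"\s+", " ", sql)
    # Remove any trailing semicolons together with the spaces around them.
    return sql.rstrip("; ")


def exact_match(predictions: list[str], references: list[str]) -> float:
    """Return the fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = sum(
        normalize_sql(pred) == normalize_sql(ref)
        for pred, ref in zip(predictions, references)
    )
    return hits / len(predictions)


if __name__ == "__main__":
    preds = ["SELECT AVG(salary), department FROM employees GROUP BY department;"]
    refs = ["select avg(salary), department\nfrom employees group by department"]
    print(f"exact match: {exact_match(preds, refs):.1%}")
```

Lowercasing the whole string also lowercases string literals, so under this sketch `'Sales'` and `'sales'` compare as equal; a stricter metric would need to leave quoted values untouched.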