| base_model: unsloth/qwen2.5-coder-14b-instruct-bnb-4bit | |
| library_name: peft | |
| pipeline_tag: text-generation | |
| tags: | |
| - base_model:adapter:unsloth/qwen2.5-coder-14b-instruct-bnb-4bit | |
| - lora | |
| - sft | |
| - transformers | |
| - trl | |
| - unsloth | |
| - text-to-sql | |
| - education | |
| - local-first | |
| # LFED SQL Assistant — Qwen2.5-Coder-14B-LoRA | |
| A LoRA adapter that turns plain-English school-data questions into read-only DuckDB SQL queries. Built for the **Local First Education Data Framework (LFED)**, a local-first analytics assistant for school administrators. | |
| - **Live demo:** https://huggingface.co/spaces/build-small-hackathon/Kasualdad_LFED | |
| - **GGUF (local/llama.cpp):** https://huggingface.co/build-small-hackathon/lfed-qwen2.5-coder-14b-sql-gguf | |
| - **Project:** https://huggingface.co/spaces/build-small-hackathon/Kasualdad_LFED | |
| --- | |
| ## Model Details | |
| - **Developer:** build-small-hackathon (HF Build Small Hackathon, Chapter One: Backyard AI) | |
| - **Base model:** [`unsloth/qwen2.5-coder-14b-instruct-bnb-4bit`](https://huggingface.co/unsloth/qwen2.5-coder-14b-instruct-bnb-4bit) | |
| - **Model type:** Causal LM, text-to-SQL, instruction-tuned | |
| - **Language:** English (questions and SQL) | |
| - **License:** Apache 2.0 | |
| - **Fine-tuning method:** Unsloth QLoRA (r=32, α=32, target modules `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`) | |
| - **Training data:** 27,859 synthetic NL→SQL pairs over a 5-table education schema | |
| - **Adapter size:** ~551 MB | |
| - **Framework versions:** PEFT 0.19.1, transformers, Unsloth, TRL | |
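As a rough sanity check on the reported adapter size, the LoRA parameter count can be estimated from the rank and the target modules listed above. The layer dimensions below are assumptions taken from the public Qwen2.5-14B configuration, not from this card, and the 4-bytes-per-weight storage is also an assumption.

```python
# Assumed Qwen2.5-14B dimensions (from the base model's public config, not this card).
HIDDEN = 5120
INTERMEDIATE = 13824
NUM_LAYERS = 48
KV_DIM = 8 * 128  # assumed 8 key/value heads x head_dim 128 (grouped-query attention)

RANK = 32  # LoRA r, as stated in this card

# (in_features, out_features) for each targeted linear layer
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int = RANK) -> int:
    """LoRA adds A (rank x in) and B (out x rank) per adapted layer."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o) for i, o in TARGETS.values())
total_params = per_layer * NUM_LAYERS
size_mb_fp32 = total_params * 4 / 1e6  # assumes 32-bit storage

print(f"{total_params:,} LoRA parameters, about {size_mb_fp32:.1f} MB in fp32")
```

Under these assumptions the estimate is about 137.6 million trainable parameters, which lines up with the ~551 MB adapter size if the weights are stored as 32-bit floats.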
| --- | |
| ## Intended Use | |
| This adapter is designed for a single downstream task: **converting natural-language questions about school data into safe, read-only SQL**. | |
| ### Direct use | |
| - Drop the adapter onto the base model for inference in the LFED Gradio app. | |
| - Run locally with the matching GGUF or with `transformers + PEFT`. | |
| ### Suitable questions | |
| - "How many students were chronically absent at Lincoln Elementary in 2023-2024?" | |
| - "What is the suspension rate by race/ethnicity at Washington Middle?" | |
| - "Show the average GPA for English learners vs non-English learners." | |
| - "Which school has the highest enrollment growth since 2021?" | |
| ### Out-of-scope use | |
| - Not a general chatbot or coding assistant. | |
| - Not trained on real student PII; the demo uses synthetic seed data only. | |
| - Not suitable for arbitrary SQL dialects beyond DuckDB-compatible queries. | |
| - Should not be used for write operations; the execution guard allows only `SELECT` statements. | |
| --- | |
| ## Training Details | |
| ### Training data | |
| - **Source:** synthetic data generated from hand-written templates, augmented with Gretel, and rephrased for natural-language variety. | |
| - **Schema:** 5 tables — `students`, `enrollment`, `attendance`, `discipline`, `grades`. | |
| - **Coverage:** single-table aggregations, joins, filtering by school/year/grade, subgroup comparisons, ranking, and simple rates/percentages. | |
| - **Size:** 27,859 question→SQL pairs. | |
| - **Format:** each example contains a `question` and a `sql` field. | |
| - **Data generation scripts:** `modal_train/generate_synthetic_v2.py`, `modal_train/augment_gretel.py`, `modal_train/rephrase_pairs.py` in the project repo. | |
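The card does not state the exact prompt template used during training. As a minimal sketch, assuming it mirrors the inference prompt shown under How to Use, a `question`/`sql` record could be turned into a prompt/completion pair like this. The function name, the dictionary keys of the result, and the sample record are hypothetical.

```python
PROMPT_HEADER = (
    "You are an assistant that converts school-data questions into DuckDB SQL."
)


def format_example(record: dict, schema_lines: list) -> dict:
    """Build a prompt/completion pair from one {question, sql} record.

    Hypothetical helper: the real training template is not given in the card.
    """
    schema = "\n".join(f"- {line}" for line in schema_lines)
    prompt = (
        f"{PROMPT_HEADER}\n"
        f"Schema:\n{schema}\n"
        f"Question: {record['question']}\n"
        "SQL:"
    )
    # Leading space separates the completion from the "SQL:" marker.
    return {"prompt": prompt, "completion": " " + record["sql"].strip()}


# Illustrative record, not taken from the actual dataset.
example = {
    "question": "How many chronically absent students at Lincoln Elementary in 2023-2024?",
    "sql": (
        "SELECT COUNT(*) FROM attendance "
        "WHERE school_name = 'Lincoln Elementary' "
        "AND school_year = '2023-2024' AND is_chronically_absent"
    ),
}
formatted = format_example(
    example,
    ["attendance(student_id, school_name, school_year, absence_count, is_chronically_absent)"],
)
print(formatted["prompt"] + formatted["completion"])
```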
| ### Training procedure | |
| | Setting | Value | | |
| |---|---| | |
| | Optimizer | AdamW (Unsloth default) | | |
| | Learning rate | 1e-4 | | |
| | LR scheduler | cosine | | |
| | Warmup steps | 10 | | |
| | Batch size | 4 | | |
| | Gradient accumulation | 4 | | |
| | Epochs | 2 | | |
| | LoRA r | 32 | | |
| | LoRA α | 32 | | |
| | LoRA dropout | 0 | | |
| | Target modules | all linear layers | | |
| | Quantization | 4-bit (bnb NF4) | | |
| | Max sequence length | 2048 | | |
| | Trainer | SFTTrainer (TRL) | | |
| | Packing | False | | |
| | Hardware | Modal A10G | | |
| Training completed on 2026-06-10. | |
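The settings above imply an effective batch size and an approximate step count. The arithmetic below is a sketch that assumes a single GPU and that all 27,859 pairs were used for training (no held-out split); the trainer's own step count can differ slightly depending on how it rounds the last partial batch.

```python
import math

NUM_EXAMPLES = 27_859   # from the card
PER_DEVICE_BATCH = 4    # from the card
GRAD_ACCUM = 4          # from the card
EPOCHS = 2              # from the card
NUM_GPUS = 1            # assumption: one A10G

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_GPUS
steps_per_epoch = math.ceil(NUM_EXAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS
warmup_share = 10 / total_steps  # 10 warmup steps, from the card

print(effective_batch, steps_per_epoch, total_steps, f"{warmup_share:.2%}")
```

With these numbers the effective batch is 16 examples, giving roughly 1,742 optimizer steps per epoch and about 3,484 in total, so the 10 warmup steps cover well under 1% of training.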
| ### Outputs | |
| | Artifact | Location | | |
| |---|---| | |
| | This LoRA adapter | `build-small-hackathon/lfed-qwen2.5-coder-14b-sql-lora` | | |
| | Merged GGUF Q4_K_M | `build-small-hackathon/lfed-qwen2.5-coder-14b-sql-gguf` | | |
| | Training code | `modal_train/` in the LFED project repo | | |
| --- | |
| ## Evaluation | |
| ### Approach | |
| Evaluation is currently manual: a bank of 15 real-world-style queries spanning attendance, discipline, grades, enrollment, and equity comparisons is run through the LFED demo UI. Each query is scored on: | |
| 1. Correctness — does the answer match the expected aggregation/join? | |
| 2. SQL quality — is the generated SQL valid, safe, and readable? | |
| 3. UX — is the summary + table useful? | |
| 4. Latency — does the query complete within a reasonable time? | |
| ### Known limitations | |
| - The model is fine-tuned on synthetic data; real-world schema variations require additional prompting or fine-tuning. | |
| - Queries are most reliable when the question names the school and the school year explicitly; without them the model occasionally produces a less precise query. | |
| - Complex multi-step reasoning (e.g., "students who improved GPA across consecutive years") can be brittle. | |
| - Percentage/rate formatting is handled by the downstream app, not the model; the model may return either 0–1 proportions or already-scaled percentages. | |
| - No formal academic benchmark evaluation has been run. | |
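Because the model may return either 0–1 proportions or already-scaled percentages, the application layer has to pick a convention. The helper below is a hypothetical heuristic, not the LFED app's actual logic: it treats values up to 1 as proportions, which misreads genuine percentages below 1% (for example 0.8%), so a real deployment may prefer to fix the convention in the prompt or check the generated SQL instead.

```python
def to_percent(value: float) -> float:
    """Return a value on a 0-100 scale.

    Heuristic (assumption): values in [0, 1] are proportions and get scaled;
    anything larger is taken to be a percentage already.
    """
    if value < 0:
        raise ValueError("rates are expected to be non-negative")
    if value <= 1.0:
        return value * 100.0
    return value


print(to_percent(0.25))  # proportion scaled to a percentage
print(to_percent(25.0))  # already a percentage, returned unchanged
```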
| --- | |
| ## How to Use | |
| ### With transformers + PEFT (HF Space path) | |
| ```python | |
| import torch | |
| from transformers import AutoModelForCausalLM, AutoTokenizer | |
| from peft import PeftModel | |
| base_id = "unsloth/qwen2.5-coder-14b-instruct-bnb-4bit" | |
| adapter_id = "build-small-hackathon/lfed-qwen2.5-coder-14b-sql-lora" | |
| tokenizer = AutoTokenizer.from_pretrained(base_id) | |
| model = AutoModelForCausalLM.from_pretrained( | |
| base_id, | |
| torch_dtype=torch.float16, | |
| device_map="auto", | |
| ) | |
| model = PeftModel.from_pretrained(model, adapter_id, torch_device="cpu") | |
| prompt = """You are an assistant that converts school-data questions into DuckDB SQL. | |
| Schema: | |
| - students(student_id, school_name, grade_level, gender, race_ethnicity, english_learner, special_education, economically_disadvantaged) | |
| - attendance(student_id, school_name, school_year, absence_count, is_chronically_absent) | |
| Question: How many chronically absent students at Lincoln Elementary in 2023-2024? | |
| SQL:""" | |
| inputs = tokenizer(prompt, return_tensors="pt").to(model.device) | |
| outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)  # greedy decoding; temperature=0.0 is not a valid sampling temperature | |
| print(tokenizer.decode(outputs[0], skip_special_tokens=True)) | |
| ``` | |
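The decoded output above includes the prompt as well as the completion. A minimal post-processing sketch, assuming the prompt ends with `SQL:` as in the example, strips the prompt and keeps the first statement. `extract_sql` is a hypothetical helper, and cutting at the first semicolon would truncate a query whose string literal contains one.

```python
def extract_sql(decoded: str, prompt: str) -> str:
    """Return the generated SQL without the echoed prompt (hypothetical helper)."""
    if decoded.startswith(prompt):
        completion = decoded[len(prompt):]
    else:
        # Fallback if decoding did not reproduce the prompt byte-for-byte.
        completion = decoded.rsplit("SQL:", 1)[-1]
    sql = completion.strip()
    end = sql.find(";")
    if end != -1:
        sql = sql[: end + 1]  # keep only the first statement
    return sql


demo_prompt = "Question: How many students are enrolled?\nSQL:"
demo_decoded = demo_prompt + " SELECT COUNT(*) FROM students;\nQuestion: next"
print(extract_sql(demo_decoded, demo_prompt))
```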
| ### With llama.cpp (local-first path) | |
| Use the matching GGUF: | |
| ```bash | |
| llama-cli \ | |
| -m lfed-qwen2.5-coder-14b-sql-gguf/ggml-model-q4_k_m.gguf \ | |
| -p "Question: How many chronically absent students at Lincoln Elementary in 2023-2024?\nSQL:" \ | |
| -n 128 --temp 0.0 | |
| ``` | |
| Or run the full LFED app locally: | |
| ```bash | |
| git checkout -b product local-llamacpp-v1 | |
| python3.12 -m venv .venv && source .venv/bin/activate | |
| pip install -r requirements.txt | |
| python app.py | |
| ``` | |
| --- | |
| ## Bias, Risks, and Limitations | |
| - **Synthetic data:** the training data is generated from templates and rephrased. Demographic patterns in the seed data do not represent any real population; they exist to exercise joins and filters. | |
| - **No PII handling training:** the model has no special safeguards around personally identifiable information because the schema uses anonymized `student_id`s only. | |
| - **Read-only enforcement is app-level:** the adapter itself will emit any SQL-like text; the downstream `data_engine.py` validator enforces `SELECT`-only and forbidden-token rules. | |
| - **Hallucinated columns/tables:** the model may occasionally reference a plausible-sounding column that does not exist. The execution guard catches these via schema-aware `EXPLAIN` validation. | |
| - **Numeric accuracy:** the model writes the SQL; percentage interpretation depends on the application layer. Users should verify rates and percentages against their own conventions. | |
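Since read-only enforcement lives in the application, any integration needs its own guard before executing model output. The sketch below illustrates the `SELECT`-only and forbidden-token idea in plain Python; it is not the project's `data_engine.py`, the keyword list is an illustrative assumption, and it does not perform the schema-aware `EXPLAIN` check. It is deliberately conservative: it also rejects `WITH ... SELECT` queries and any query that mentions a forbidden word, even as a column name or inside a string literal.

```python
import re

# Illustrative keyword list (assumption), not the LFED validator's actual rules.
FORBIDDEN = {
    "insert", "update", "delete", "drop", "alter", "create",
    "attach", "detach", "copy", "pragma", "install", "load",
}


def is_read_only_select(sql: str) -> bool:
    """Accept a single SELECT statement that contains no forbidden keywords."""
    statement = sql.strip().rstrip(";").strip()
    if not statement or ";" in statement:
        return False  # empty input or more than one statement
    words = re.findall(r"[a-z_]+", statement.lower())
    if not words or words[0] != "select":
        return False
    return not any(word in FORBIDDEN for word in words)


print(is_read_only_select("SELECT COUNT(*) FROM attendance WHERE is_chronically_absent;"))
print(is_read_only_select("SELECT 1; DELETE FROM students"))
```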
| --- | |
| ## Environmental Impact | |
| Estimated training footprint for the 2-epoch run on a Modal A10G: | |
| - **Hardware type:** NVIDIA A10G | |
| - **Training time:** approximately 1–2 hours | |
| - **Cloud provider:** Modal | |
| - **Region:** likely US-east (Modal default) | |
| - **Carbon emitted:** not precisely measured; rough estimate using [ML CO2 Impact calculator](https://mlco2.github.io/impact#compute) is on the order of tens to low-hundreds of grams of CO2eq. | |
| --- | |
| ## Citation | |
| If you use this model, please cite the base model and the LFED project: | |
| **BibTeX:** | |
| ```bibtex | |
| @misc{lfed_sql_adapter, | |
| title={Local First Education Data Framework: A Qwen2.5-Coder-14B LoRA Adapter for School-Data Text-to-SQL}, | |
| author={build-small-hackathon}, | |
| year={2026}, | |
| howpublished={\url{https://huggingface.co/build-small-hackathon/lfed-qwen2.5-coder-14b-sql-lora}} | |
| } | |
| ``` | |
| **APA:** | |
| build-small-hackathon. (2026). *Local First Education Data Framework: A Qwen2.5-Coder-14B LoRA adapter for school-data text-to-SQL*. Hugging Face. https://huggingface.co/build-small-hackathon/lfed-qwen2.5-coder-14b-sql-lora | |