Spaces:

KDDSTLC
/

LFED

Running

App Files Files Community

LFED / docs /HANDOFF.md

Kasualdad

fix: remove remaining fine-tuned model repo ID from HANDOFF.md

9bae1e9 27 days ago

preview code

Raw

History Blame Contribute Delete

8.05 kB

A newer version of the Gradio SDK is available: 6.19.0

Upgrade

Session Handoff — Kasualdad LFED Build

Last updated: 2026-06-06 (Session 2 — Phases 1-8 complete) Deadline: 2026-06-15 (HF Build Small Hackathon — "Backyard AI" chapter) Project root: /Users/flucido/projects/build-small-hackathon/Kasualdad_LFED Original reference project: /Users/flucido/projects/Kasualdad_LFED (untouched snapshot) Reference project: /Users/flucido/projects/local-data-stack (education data stack patterns)

TL;DR — How to Resume in 30 Seconds

cd /Users/flucido/projects/build-small-hackathon/Kasualdad_LFED
source .venv/bin/activate
python -c "import gradio, duckdb, llama_cpp, huggingface_hub; print('env OK')"

# Run the app
python app.py

# Run tests
pytest tests/ -v

# Kick off Modal fine-tuning (after setting HF_TOKEN secret)
modal run modal_train/modal_app.py

Full plan is in docs/PLAN.md.

Project Summary

Kasualdad LFED = a Hugging Face Space that turns natural-language questions from school district admins (principals, superintendents, department heads) into safe DuckDB SQL via a small local LLM (llama.cpp, GGUF format). Targets HF "Build Small" Hackathon badges:

Off the Grid — all inference local, no API calls
Well-Tuned — Modal fine-tune of a small text-to-SQL model pushed to Hub
Llama Champion — llama.cpp as inference backend
Off-Brand — custom CSS targeting cognitive-load reduction (added in second iteration)

User's success criteria: win the hackathon + have a demoable slice for clients/employers.

Decisions Locked (do not re-litigate)

Decision	Value	Source
Project location	`~/projects/build-small-hackathon/Kasualdad_LFED`	User: "this is where the hackathon is happening"
Source of truth	Copy of `~/projects/Kasualdad_LFED`	User choice (Recommended)
Python version	3.12 (local + Space)	User: "ok to recreate"
Base model	`Qwen2.5-Coder-7B-Instruct` Q4_K_M	Plan, 7/7 sanity check passed
Alt model (fallback)	`Llama-3.1-8B-Instruct` Q4_K_M	Plan (only if Qwen fails sanity)
Quantization	Q4_K_M (~4.4 GB)	Plan: speed/accuracy tradeoff for free Space
Code layout	Modular: `app.py` / `data_engine.py` / `model_inference.py` / `prompts.py`	User choice (Recommended)
Data source	Seed tables only for hackathon	User choice (Recommended)
Aesthetic	Linear / Vercel style, minimal monochrome + teal accent	User choice (Recommended)
Accent color	`#14b8a6` (teal)	User choice (Recommended)
Modal timing	Start fine-tune ASAP, end-to-end	User choice (Recommended)
Modal account	`flucido`	User
Modal credits	Hackathon-provided, confirmed	User
Custom CSS	Yes (reinstated after user feedback)	User: "i wnat the design to llook good"

Current State (as of session 2 — Phases 1-8 complete)

✅ Completed

Phase 0: Bootstrap — venv, pinned deps, packages.txt
Phase 1: Model sanity check — Qwen2.5-Coder-7B Q4_K_M: 7/7 prompts passed, model locked
Phase 2: Refactor — modular codebase (app.py, data_engine.py, model_inference.py, prompts.py)
Phase 3: Robustness — JSON envelope parsing, schema-aware column validation (EXPLAIN), threading timeout via conn.interrupt(), per-request DB connections, streaming SQL generation
Phase 4: Seed data — data/generate_seed.py: 5 schools × 4 years × 13 grades, 2,900 students, 15% chronic rate
Phase 5: UI polish — Off-Brand CSS (Inter + JetBrains Mono, teal accent, WCAG AA, prefers-reduced-motion, focus rings), README design docs
Phase 6: Tests — 81 tests, 0 failures (execution guard, data engine, model inference with mocked LLM)
Phase 7: Modal pipeline — modal_train/: synthetic data generator (1,289 pairs, 32 templates), Unsloth QLoRA train.py, GGUF export, Modal orchestration
Phase 8: README — expanded with badges, Mermaid architecture diagram, schema docs, run guide, project structure

🟡 In Progress

Nothing in progress

⏳ Pending

Phase 7b: Actually run Modal training (user needs to create HF_TOKEN secret and modal run)
Phase 7c: Swap REPO_ID in model_inference.py after GGUF is pushed
Phase 9: Deploy to HF Space + smoke test
Phase 10–11: Buffer + polish + submit

File Tree (Current — post Phase 8)

~/projects/build-small-hackathon/Kasualdad_LFED/
├── .venv/                              # Python 3.12.8
├── app.py                              # Gradio UI + Off-Brand CSS (354 lines)
├── data_engine.py                      # DuckDB lifecycle, execution guard, timeout (310 lines)
├── model_inference.py                  # llama.cpp wrapper, streaming (211 lines)
├── prompts.py                          # System prompt, schema docs, few-shot (131 lines)
├── data/
│   └── generate_seed.py                # 5 schools × 4 years seed generator
├── tests/
│   ├── conftest.py                     # Shared fixtures
│   ├── test_execution_guard.py         # 24 tests — SQL injection, validation
│   ├── test_data_engine.py             # 23 tests — schema, isolation, integrity
│   └── test_model_inference.py         # 24 tests — prompt assembly, mock LLM
├── modal_train/
│   ├── generate_synthetic.py           # 1,289 NL→SQL training pairs
│   ├── train.py                        # Unsloth QLoRA recipe
│   ├── export_gguf.py                  # Merge → GGUF → HF Hub
│   ├── modal_app.py                    # Modal orchestration
│   └── train.jsonl                     # Training data (1,289 pairs)
├── docs/
│   ├── HANDOFF.md                      # this file
│   └── PLAN.md                         # full plan
├── requirements.txt                    # PINNED (4 deps)
├── packages.txt                        # System deps for HF Space
└── README.md                           # Expanded (badges, mermaid, schema, run guide)

# External artifacts:
/tmp/lfed-models/qwen/Qwen2.5-Coder-7B-Instruct.Q4_K_M.gguf   # 4.4 GB, ready

Quick-Reference: Commands to Know

Activate environment

cd /Users/flucido/projects/build-small-hackathon/Kasualdad_LFED
source .venv/bin/activate

Test all imports

python -c "import gradio, duckdb, llama_cpp, huggingface_hub; print('env OK')"

Run the app locally

python app.py
# → http://localhost:7860

Run pytest (Phase 6+)

pytest tests/ -v

Modal CLI (Phase 7+)

modal --version                           # verify install
modal secret list                         # list secrets
modal run modal_train/modal_app.py        # kick off training

Outstanding Dependencies

When	Dependency	Status
Now	`modal secret create huggingface HF_TOKEN=<token>`	⏳ User action
After training	Swap `REPO_ID` in `model_inference.py` to the fine-tuned GGUF repo	⏳ After Phase 7c

9-Day Schedule (Updated)

Day	Date	Phases	Status
1	Jun 6	0 · 1 · 2 · 3 · 4 · 5 · 6 · 7 · 8	✅ Done (all 8 phases in one session)
3	Jun 8	7b: Modal training (3-6 hrs overnight on A10G)	⏳ User action needed
4	Jun 9	7c: merge → GGUF → push → swap REPO_ID	⏳ After training
5–6	Jun 10–11	9: Deploy to HF Space + smoke test	⏳
7–8	Jun 12–13	Buffer: polish, re-iterate	⏳
9	Jun 14	Final verify + submit	Submit

Next action: User creates Modal HF_TOKEN secret, then modal run modal_train/modal_app.py.

Reference

Full plan: docs/PLAN.md
HF Build Small Hackathon (track: Backyard AI): submission deadline 2026-06-15
Modal: https://modal.com/apps/flucido/main (credits: hackathon-provided)
HF Hub: fine-tuned GGUF repo (TBD — see Modal export script)