Spaces:
Running on Zero
Running on Zero
| # Quest-classification LoRA | |
| The dashboard refresh asks MiniCPM5-1B to classify every crawled hackathon project | |
| against the Build Small Hackathon judging dimensions and to quote short evidence for | |
| each match. A pure prompt drifts (truncated JSON, renamed quests, runaway evidence), | |
| so we fine-tune a small LoRA that fixes the task to a strict JSON contract. The | |
| backend still validates every refresh and refuses to swap the dashboard on a schema | |
| failure; the checked-in adapter is the product default, and validation is the | |
| correctness gate. | |
| ## Label space (`hackathon_advisor/quest_taxonomy.py`) | |
| 13 dimensions, each detectable from a README and an app file: | |
| - Six merit badges: Off the Grid, Well-Tuned, Off-Brand, Llama Champion, Sharing is | |
| Caring, Field Notes. | |
| - Two main tracks: Backyard AI, Thousand Token Wood. | |
| - Sponsor / special awards: OpenBMB, Nemotron, Modal, Tiny Titan, Best Agent. | |
| Output schema (one JSON object, nothing else): | |
| ```json | |
| {"matches": [{"quest": "...", "confidence": 0.0, "evidence": "...", "source": "readme|app_file"}]} | |
| ``` | |
| `render_quest_prompt` is the single prompt renderer shared by the dataset and the | |
| live analyzer, so the model sees the same two-segment shape (README + APP_FILE) at | |
| train and inference time, with the same `QUEST_SYSTEM_PROMPT`. | |
| ## Dataset pipeline | |
| 1. `scripts/build_quest_corpus.py` — download the real README.md and main app-file | |
| source for all 125 crawled projects into `data/quest_corpus.json`. | |
| 2. Selection (`data/quest_selection.json`) — drop near-identical template clones | |
| (embedding cosine + identical app hash) and the shortest, signal-free tail; | |
| 108 content-rich projects survive (app-only / readme-only / both profiles kept). | |
| 3. Teacher labelling — a multi-agent workflow labels each project (one agent) then | |
| adversarially verifies and corrects it (a second agent): drops matches whose | |
| evidence is not in the cited segment, fixes `source`, kills Off-the-Grid on a | |
| cloud-API app, kills Tiny Titan on >4B models. Output: `data/quest_labels/labeled.json`. | |
| 4. `scripts/build_quest_sft.py` — one natural example per project plus targeted | |
| augmentations so every case is represented: app-only, readme-only / missing app | |
| file, README↔app contradictions, empty matches, noisy metadata, app-only variants | |
| of the real remote-inference projects, and hand-authored contrastive **hard | |
| negatives** (a remote inference call — `InferenceClient`, HF Inference Endpoints, | |
| replicate, `*.modal.run` — must not earn Off the Grid; OpenBMB belongs only to | |
| `openbmb`/MiniCPM models; Tiny Titan only to ≤4B). `_check_invariants` fails the | |
| build on either crisp violation. Writes `data/quest_sft.jsonl`. | |
| 185 chat-JSONL examples (108 natural + 77 augmented), 27 with empty matches, all 13 | |
| quests covered. The contrastive negatives are up-weighted in training so they outweigh | |
| the strong Off-the-Grid prior that, untreated, mislabels remote-API chatbots as local. | |
| Published as a Hub dataset: | |
| [`build-small-hackathon/hackathon-advisor-quest-dataset`](https://huggingface.co/datasets/build-small-hackathon/hackathon-advisor-quest-dataset) | |
| (`scripts/publish_quest_dataset.py`). The trained adapter lives at | |
| [`build-small-hackathon/hackathon-advisor-quest-minicpm5-lora`](https://huggingface.co/build-small-hackathon/hackathon-advisor-quest-minicpm5-lora). | |
| ## Training (`scripts/modal_train_quest_lora.py`) | |
| ```bash | |
| modal run scripts/modal_train_quest_lora.py::smoke # check the GPU | |
| modal run scripts/modal_train_quest_lora.py --dataset data/quest_sft.jsonl --epochs 16 | |
| ``` | |
| LoRA SFT on an **L40S**: rank 64, alpha 128, dropout 0, completion-only loss (the | |
| prompt is masked to -100 so only the strict JSON is supervised), `max_seq_length=3072`, | |
| chat template with `enable_thinking=False` to match inference. The dataset is the spec, | |
| so the container **evaluates on the whole dataset** — quest-set exact match, micro | |
| P/R/F1, and a mismatch list — and returns the adapter as a zip unpacked under | |
| `artifacts/quest-lora/`. The shipped adapter scores quest-set exact match 185/185 | |
| (F1 1.0): every dataset project, including the remote-inference ones, is judged correctly. | |
| ## Serving | |
| `MiniCPMQuestAnalyzer` loads the checked-in `artifacts/quest-lora` adapter by | |
| default. `ADVISOR_QUEST_ADAPTER_ID` and `ADVISOR_QUEST_ADAPTER_REVISION` may point | |
| to a replacement adapter, while an explicit empty adapter id runs the base model | |
| for controlled experiments. `validate_matches_by_project` enforces the schema | |
| before the dashboard is swapped. | |