Research

Experimental code for fine-tuning and agentic benchmarks. Nothing here is wired into the Gradio Lesson Agent by default; use it to train models and score checkpoints against public benchmarks.

| Path | Purpose |
| --- | --- |
| finetune.py | LoRA / QLoRA / full fine-tune on chat or instruction data |
| evals/ | SLM agentic benchmark suite: BFCL, τ-bench, GAIA, SWE-bench (uv package slm-evals) |
| data/ | Shared JSONL datasets for finetune and evals |
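Because both finetune and evals read from the shared JSONL files, a quick sanity check before training can catch malformed rows early. The sketch below only assumes the JSONL convention itself (one JSON object per line); the record schema is not documented here, so the code inspects top-level keys rather than assuming any. The names `iter_jsonl` and `key_counts` are hypothetical helpers, not part of this repo.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Iterator


def iter_jsonl(path: Path) -> Iterator[dict]:
    """Yield one JSON object per non-blank line of a JSONL file."""
    with path.open(encoding="utf-8") as handle:
        for line_no, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines between records
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                # Re-raise with the file position so bad rows are easy to find.
                raise ValueError(f"{path}:{line_no}: invalid JSON ({exc.msg})") from exc
            if not isinstance(record, dict):
                raise ValueError(f"{path}:{line_no}: expected a JSON object")
            yield record


def key_counts(path: Path) -> dict:
    """Count how often each top-level key appears across all records."""
    counts: Counter = Counter()
    for record in iter_jsonl(path):
        counts.update(record.keys())
    return dict(counts)
```

Running `key_counts(Path("research/data/education-lesson-chat.jsonl"))` from the repo root would show which fields every record carries and which appear only in some rows.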

Quick links

Install (from repo root)

# All research tooling
uv sync --group finetune --group evals --group lm-eval

Individual groups:

| Group | Command | Enables |
| --- | --- | --- |
| finetune | uv sync --group finetune | research/finetune.py (LoRA, QLoRA, merge) |
| evals | uv sync --group evals | research/evals/ package (slm-benchmark) |
| lm-eval | uv sync --group lm-eval | slm-lm-eval CLI (GSM8K, ARC, HellaSwag, …) |
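The groups can be combined by repeating the `--group` flag, as the all-in-one command does. A minimal sketch of composing that command for a chosen subset, for example inside a setup script; `uv_sync_command` is a hypothetical helper, and the group names are the three listed in this section.

```python
import shlex
from typing import Iterable, List

# Dependency groups named in this README.
KNOWN_GROUPS = ("finetune", "evals", "lm-eval")


def uv_sync_command(groups: Iterable[str]) -> List[str]:
    """Build the `uv sync` argument list for the requested groups."""
    groups = list(groups)
    unknown = [g for g in groups if g not in KNOWN_GROUPS]
    if unknown:
        raise ValueError(f"unknown group(s): {', '.join(unknown)}")
    command = ["uv", "sync"]
    for group in groups:
        command += ["--group", group]
    return command


if __name__ == "__main__":
    # Prints: uv sync --group finetune --group evals
    print(shlex.join(uv_sync_command(["finetune", "evals"])))
```

The list form can be passed straight to `subprocess.run(..., check=True)` from the repo root.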

Typical workflow

research/data/education-lesson-chat.jsonl
        │
        ▼
  research/finetune.py  ──►  models/finetuned/<preset>-lora/
        │
        └──► research/evals/  (BFCL, τ-bench, GAIA, SWE-bench, lm-eval)
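The diagram implies a naming convention for fine-tune outputs: each preset writes its adapter to `models/finetuned/<preset>-lora/`. A small sketch of that convention, useful when a script needs to locate a checkpoint before scoring it. `adapter_dir` is a hypothetical helper and `my-preset` is a placeholder, since the actual preset names are not listed here.

```python
from pathlib import Path

# Paths as drawn in the workflow diagram, relative to the repo root.
DATASET = Path("research/data/education-lesson-chat.jsonl")
FINETUNED_ROOT = Path("models/finetuned")


def adapter_dir(preset: str) -> Path:
    """Return the output directory the diagram shows for a given preset."""
    if not preset or "/" in preset or "\\" in preset:
        raise ValueError("preset must be a non-empty name without path separators")
    return FINETUNED_ROOT / f"{preset}-lora"


if __name__ == "__main__":
    # "my-preset" is a placeholder, not a preset defined by this repo.
    print(adapter_dir("my-preset").as_posix())
```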

See USAGE.md for copy-paste commands.