⚠️ Conference talk demo: not production weights.

This model accompanies a conference keynote on local on-device AI. Published as a reference for the fine-tuning patterns shown on stage, not a deployable artefact. No security audit, no SLA, pinned to the talk's state.


Gemma3-1B FT (f16): Direct Answer + Tool-Result Synthesis (intent fallback)

Base model: google/gemma-3-1b-it (1.0B params)
License: Gemma Terms of Use (see MODEL_LICENSES.md)
Training script: finetune/train_gemma3.py
Method: LoRA r=8, α=16, 7 epochs, lr=1e-4
Training data: data/training-data/gemma3_intent_{scenario}.jsonl + gemma3_synthesis_{scenario}.jsonl
Hardware tested: RTX PRO 6000 (CUDA), Apple M-series (Metal/MPS fallback)
Intended use: (1) Direct-answer responses (no-retrieval-needed questions, chitchat); (2) tool-result synthesis (turning JSON tool outputs into natural-language answers); (3) generative intent-classification fallback when the LogReg classifier isn't loaded. Primary intent path is the LogReg classifier; it handles ~93% of traffic deterministically in <25 ms.
Out of scope: RAG synthesis from multi-document context (cross-contaminates facts; use Gemma3-4B FT for that). General-purpose chat. Multilingual beyond the scenario's training language.
Reference eval (Nextera): Intent (fallback path): 96.7% / 174 of 180 (post-2026-05-15 retrain on corrected training data; pre-retrain was 93.3%). Direct-answer: 95%+. See docs/benchmarks/FINE_TUNING_INSIGHTS.md §10.
Known failure modes: On the fallback intent path: confuses tool_use vs rag_query when the question contains a number AND a topical noun (e.g. "What's the cost of GDPR compliance?"). LogReg primary handles these cleanly.
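
The intent routing described under intended use (LogReg classifier first, this model only as a generative fallback when the classifier isn't loaded) could be sketched roughly as below. This is a minimal illustration, not the talk's actual code: the function names, the `direct_answer` label, and the callable signatures are assumptions. Only `tool_use` and `rag_query` are label names taken from this card.

```python
from typing import Callable, Optional, Tuple

# tool_use and rag_query appear in this card; direct_answer is an assumed label name.
INTENT_LABELS = ("tool_use", "rag_query", "direct_answer")


def parse_intent(generated: str, default: str = "direct_answer") -> str:
    """Map free-form model output to the earliest known label it mentions."""
    text = generated.strip().lower()
    hits = [(text.find(label), label) for label in INTENT_LABELS if label in text]
    return min(hits)[1] if hits else default


def classify_intent(
    question: str,
    logreg: Optional[Callable[[str], str]],
    generate: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (intent, path): use the classifier when loaded, else the model."""
    if logreg is not None:
        # Primary path: deterministic classifier (hypothetical callable).
        return logreg(question), "logreg"
    # Fallback path: ask the fine-tuned model and parse its text output.
    return parse_intent(generate(question)), "fallback"
```

Passing `logreg=None` simulates the classifier not being loaded. `generate` stands in for whatever runtime serves the GGUF file. Returning the path alongside the intent makes it easy to log how often the fallback is actually hit.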