JaneGPT v2 β€” Intent Classification Model

A lightweight, fast, and accurate intent classification model built from scratch for virtual assistant command understanding.

7.8M parameters | 22 intent classes | 88.6% validation accuracy | ~50ms inference on GPU

Loss Curves

(figure: training and validation loss curves)
Why I Built This

I'm building JANE β€” a fully offline, privacy-first AI voice assistant. Llama 3 8B was causing 10–22 second delays for simple commands like "turn up the volume."

That's not a voice assistant. That's a waiting game.

So I designed JaneGPT v2 from scratch β€” a model that does exactly one job, does it fast, and runs on consumer hardware without any cloud dependency.


Model Details

| Property | Value |
|---|---|
| Architecture | Decoder-only Transformer + Classification Head |
| Parameters | ~7.8M |
| Embedding dim | 256 |
| Attention heads | 8 |
| KV heads (GQA) | 4 |
| Layers | 8 |
| FF hidden dim | 672 |
| Max sequence length | 256 |
| Vocab size | 8,192 |
| Tokenizer | Custom BPE |
| Training accuracy | ~96.7% |
| Validation accuracy | 88.6% |
| Checkpoint size | ~30MB |
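As a sanity check, the ~30MB checkpoint is consistent with storing ~7.8M parameters in fp32 (4 bytes each). A quick back-of-envelope calculation:

```python
# Back-of-envelope check: 7.8M fp32 parameters at 4 bytes each.
params = 7_800_000
bytes_fp32 = params * 4
print(f"{bytes_fp32 / 1024**2:.1f} MiB")  # 29.8 MiB, matching the ~30MB checkpoint
```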

Architecture Decisions & Why

| Choice | Reason |
|---|---|
| GQA (4 KV heads, 8 attention heads) | Reduces KV-cache memory without losing expressiveness |
| RoPE positional encoding | Better length generalization than learned embeddings |
| SwiGLU activation | Smoother gradients than ReLU at this model size |
| RMSNorm | Simpler and faster than LayerNorm |
| Custom BPE tokenizer | Trained specifically on command-style text |
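To make the GQA row concrete, here is a minimal illustrative sketch (not the actual model code) of how 8 query heads can share 4 KV heads: each KV head serves a contiguous group of query heads, which is what halves the KV-cache footprint.

```python
# Illustrative GQA head-sharing map: 8 query heads grouped onto 4 KV heads.
N_Q_HEADS = 8
N_KV_HEADS = 4
GROUP_SIZE = N_Q_HEADS // N_KV_HEADS  # 2 query heads per KV head

def kv_head_for(query_head: int) -> int:
    """Map a query-head index to the KV head it attends with."""
    return query_head // GROUP_SIZE

mapping = {q: kv_head_for(q) for q in range(N_Q_HEADS)}
print(mapping)  # {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}
```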

Supported Intents (22 classes)

| Category | Intents |
|---|---|
| Volume | volume_up, volume_down, volume_set, volume_mute |
| Brightness | brightness_up, brightness_down, brightness_set |
| Media | media_play, media_pause, media_next, media_previous |
| Apps | app_launch, app_close, app_switch |
| Browser | browser_search |
| Productivity | set_reminder, screenshot |
| Screen | read_screen, explain_screen |
| Control | undo, quit_jane |
| Conversation | chat |
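One way a caller might route these labels to actions is a simple dispatch table. This is a hedged sketch, not part of the model: the handler functions below are hypothetical, and unknown labels fall through to the catch-all chat handler.

```python
# Illustrative intent routing table (handler functions are hypothetical).
def volume_up():  print("raising volume")
def media_play(): print("starting playback")
def chat():       print("falling through to conversation")

HANDLERS = {
    "volume_up": volume_up,
    "media_play": media_play,
    "chat": chat,
    # ... one entry per supported intent
}

def dispatch(intent: str):
    # Labels not wired up above fall back to the chat handler.
    HANDLERS.get(intent, chat)()

dispatch("volume_up")   # raising volume
dispatch("volume_set")  # not in the table above -> chat fallback
```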

Performance

| Input | Predicted Intent | Confidence |
|---|---|---|
| "increase the volume" | volume_up | 86% |
| "make it louder" | volume_up | 90% |
| "turn down the brightness" | brightness_down | 80% |
| "open chrome" | app_launch | 98% |
| "play some music" | media_play | 96% |
| "search for cats on youtube" | browser_search | 94% |
| "set a reminder for 5 minutes" | set_reminder | 96% |
| "take a screenshot" | screenshot | 88% |
| "undo that" | undo | 92% |
| "hello" | chat | 97% |
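Since the confidences in the table above range from 80% to 98%, a caller may want a rejection threshold before acting on a prediction. A minimal sketch; the threshold value is an assumption for illustration, not something shipped with the model:

```python
# Illustrative confidence gating (threshold chosen arbitrarily for the sketch).
CONFIDENCE_THRESHOLD = 0.75

def gate(intent: str, confidence: float) -> str:
    """Act on the prediction only if it clears the threshold;
    otherwise fall back to the catch-all chat intent."""
    return intent if confidence >= CONFIDENCE_THRESHOLD else "chat"

print(gate("brightness_down", 0.80))  # brightness_down
print(gate("volume_up", 0.42))        # chat
```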

Quick Start

Installation

```shell
git clone https://huggingface.co/RavinduSen/JaneGPT-v2
cd JaneGPT-v2
pip install -r requirements.txt
```

Basic Usage

```python
from classifier import JaneGPTClassifier

classifier = JaneGPTClassifier()

intent, confidence = classifier.predict("turn up the volume")
print(f"Intent: {intent}, Confidence: {confidence:.2%}")
# Output: Intent: volume_up, Confidence: 86.10%

intent, confidence = classifier.predict("open chrome")
print(f"Intent: {intent}, Confidence: {confidence:.2%}")
# Output: Intent: app_launch, Confidence: 98.10%
```

With Conversation Context

```python
intent, confidence = classifier.predict(
    "not enough",
    context={"last_intent": "volume_up"}
)
# Output: Intent: volume_up, Confidence: 79.00%
```
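The context mechanism suggests a simple conversational loop where the caller threads the previous turn's intent into the next call. A hedged sketch of that loop; JaneGPTClassifier is replaced with a stub here so the snippet runs standalone, and the stub's behavior is invented for illustration:

```python
# Sketch of threading conversation context between turns.
# The real classifier comes from `classifier.JaneGPTClassifier`; this stub
# only mimics the predict(text, context=...) -> (intent, confidence) shape.
class StubClassifier:
    def predict(self, text, context=None):
        if context and text in {"not enough", "more", "again"}:
            return context["last_intent"], 0.79  # follow-up resolved via context
        return ("volume_up", 0.86) if "volume" in text else ("chat", 0.97)

classifier = StubClassifier()
last_intent = None
for utterance in ["turn up the volume", "not enough"]:
    context = {"last_intent": last_intent} if last_intent else None
    intent, conf = classifier.predict(utterance, context=context)
    print(f"{utterance!r} -> {intent} ({conf:.0%})")
    last_intent = intent  # carry this turn's result into the next turn
```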

Training Setup

| Component | Details |
|---|---|
| Hardware | NVIDIA RTX 3050 Ti (4GB VRAM) |
| CPU | AMD Ryzen 9 5900HX |
| RAM | 16GB |
| Additional | Google Colab (extended training runs) |
| Framework | PyTorch 2.0+ |
| Training data | Custom command dataset (Claude-assisted generation under author supervision) |

Limitations

  • Intent classification only β€” does not generate text
  • 22 classes — commands outside the supported set are classified as chat
  • English only
  • Optimized for short inputs (1–15 words)
  • No entity extraction β€” returns intent label only

Use Cases

  • Virtual assistant command routing
  • Smart home intent classification
  • Voice command understanding
  • Chatbot intent detection
  • Edge device deployment (small enough for embedded systems)

Part of the JANE Project

This model is the intelligence core of JANE β€” a fully offline, privacy-first AI voice assistant.

πŸ”— JANE AI Assistant on GitHub πŸ”— JaneGPT-v2 on GitHub


Created By

Ravindu Senanayake β€” Computer Science Undergraduate, Sri Lanka

Built from scratch β€” architecture, tokenizer, and training pipeline designed and implemented by the author.

GitHub
