RSL-SETU-Classifier-15M — IKS Technique Classifier
A lightweight ~13M-parameter GPT-2 style classifier that identifies which of 43 Indian Knowledge System (IKS) pedagogical techniques to apply, given a student interaction context.
Designed for real-time technique selection and offline deployment on CPU — no GPU required.
Model Summary
| Property | Value |
|---|---|
| Architecture | GPT-2-style transformer encoder (bidirectional attention) + mean-pooled linear classifier head |
| Parameters | ~13.1M |
| Layers | 6 transformer blocks |
| Embedding dim | 256 |
| Attention heads | 8 |
| Context window | 512 tokens |
| Vocab | 32,000 (SentencePiece BPE) |
| Output classes | 43 |
| Model size | ~60 MB (best.pt) |
| SHA-256 | f66b19064f4704d02a57d6fda0a2f4f3010571bc0d89844358d34760e76f5471 |
| Inference device | CPU (no GPU needed) |
| License | CC BY-NC 4.0 |
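Before loading the checkpoint, you can verify it against the SHA-256 listed above. A minimal sketch using only the standard library (assumes `best.pt` is in the working directory):

```python
import hashlib
import os

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so a large checkpoint never fully loads into RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

EXPECTED = "f66b19064f4704d02a57d6fda0a2f4f3010571bc0d89844358d34760e76f5471"

if os.path.exists("best.pt"):
    assert sha256_of("best.pt") == EXPECTED, "best.pt checksum mismatch -- re-download the file"
```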
What This Model Does
Given a tokenized input of {student_message + cognitive_state + subject}, the classifier outputs a probability distribution over 43 IKS pedagogical techniques. Use it for:
- Real-time technique selection — sub-10ms inference on CPU
- Offline fallback — when a full 7B LoRA model is unavailable
- Pre-filtering — narrow technique candidates before passing to the Teaching LM
The 43 IKS Techniques
Organized into 5 categories:
| Category | Techniques | Count |
|---|---|---|
| Memory (Patha) | Samhita, Pada, Krama, Jataa, Mala, Shikha, Rekha, Dhwaja, Ghana, Katapayadi | 10 |
| Vedic Math (Sutra) | Ekadhikena, Nikhilam, Urdhva Tiryak, Paravartya, Anurupyena, Shunyam, Yavadunam, Sankalana, Puranapuranabhyam, Chalana, Ekanyunena, Sesanyankena, Sopantyadvayamantyam, Gunitasamuccayah, Gunakasamuccayah, Dwandwa Yoga | 16 |
| Attention | Dhyana-based Focus Protocol, Pratyahara Learning Pause | 2 |
| Reasoning (Nyaya) | Nyaya Pramana, Pancha Avayava, Tarka Vaada, Mimamsa Vakya Vichara, Jalpa and Vitanda, Anvaya-Vyatireka, Upamana, Pratyaksha | 8 |
| Pedagogy | Guru-Shishya Dialogue, Shruti-Smriti Integration, Spaced Repetition with Vedic Intervals, Panini Pratyahara, Utsarga-Apavada, Adhyaropa-Apavada, Viveka | 7 |
Architecture
```
Input Text → SentencePiece Tokenizer (32K vocab)
           → Token Embeddings (256-dim) + Position Embeddings
           → 6× Transformer Blocks (8-head bidirectional self-attention, MLP, LayerNorm)
           → Global Mean Pooling (over sequence dimension)
           → Linear Classifier Head (256 → 43)
           → Softmax → Technique Probabilities
```
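The pooling-and-head stage above can be sketched in PyTorch. This is a hypothetical minimal module for illustration; the actual `IKSTechniqueClassifier` in `model.py` may structure it differently:

```python
import torch
import torch.nn as nn

class MeanPoolClassifierHead(nn.Module):
    """Mean-pool transformer outputs over the sequence, then project to class logits."""
    def __init__(self, n_embd: int = 256, num_classes: int = 43):
        super().__init__()
        self.head = nn.Linear(n_embd, num_classes)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, n_embd) -- the output of the 6 transformer blocks
        pooled = hidden.mean(dim=1)   # (batch, n_embd): global mean over the sequence
        return self.head(pooled)      # (batch, num_classes) logits

# Example: a batch of 2 sequences, 128 tokens each
logits = MeanPoolClassifierHead()(torch.randn(2, 128, 256))
print(logits.shape)  # torch.Size([2, 43])
```

Because the pooled vector averages every position, the bidirectional attention lets each token contribute full-sequence context before pooling, unlike a causal GPT-2 where only the last token sees everything.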
Accuracy
| Metric | Value |
|---|---|
| Eval accuracy | 99.6% (v5 bidirectional, early-stopped at step ~2300/5000) |
| Classes | 43 (all trained) |
| Attention | Bidirectional (full sequence context) |
| Training data | 5,705 train / 2,096 eval (balanced across 43 techniques) |
Training
| Parameter | Value |
|---|---|
| Version | v5 (bidirectional) |
| Training data | Curated IKS classification data (43 techniques, balanced) |
| Train / Eval split | 5,705 / 2,096 (balanced across 43 techniques) |
| Max iterations | 5,000 (early-stopped at ~2,300; patience=10) |
| Best checkpoint | ~Step 1,900 (99.6% eval accuracy) |
| Batch size | 32 |
| Optimizer | AdamW (β1=0.9, β2=0.95, weight decay 0.1) |
| Learning rate | 3e-4 → min_lr (cosine decay) |
| Warmup | 300 iterations |
| Dropout | 0.3 (attention/residual), 0.4 (classifier head) |
| Label smoothing | 0.1 |
| Hardware | NVIDIA T4 GPU (Vertex AI) |
| Tokenizer | SentencePiece 32K |
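The warmup-then-cosine schedule in the table can be sketched as follows. The `min_lr` value is an assumption for illustration (the card only states "3e-4 → min_lr"):

```python
import math

def lr_at(step: int, max_lr: float = 3e-4, min_lr: float = 3e-5,
          warmup: int = 300, max_steps: int = 5000) -> float:
    """Linear warmup for `warmup` iterations, then cosine decay from max_lr to min_lr."""
    if step < warmup:
        return max_lr * (step + 1) / warmup
    progress = (step - warmup) / max(1, max_steps - warmup)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * min(progress, 1.0)))
```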
How to Use
```python
import json

import sentencepiece as spm
import torch

from model import IKSTechniqueClassifier

# Load the label map (class id -> technique name)
with open("technique_labels.json") as f:
    labels = json.load(f)
id_to_name = {int(k): v["name"] for k, v in labels["labels"].items()}

# Build the model with the published architecture and load the checkpoint
config = {
    "model": {
        "vocab_size": 32000, "block_size": 512, "n_layer": 6,
        "n_embd": 256, "n_head": 8, "dropout": 0.0,
        "bias": False, "num_classes": 43, "bidirectional": True,
    }
}
model = IKSTechniqueClassifier(config)
ckpt = torch.load("best.pt", map_location="cpu", weights_only=True)
model.load_state_dict(ckpt["model"])
model.eval()

# Tokenize the interaction context (truncate to the 512-token window)
sp = spm.SentencePieceProcessor(model_file="iks_sp_tokenizer.model")
tokens = sp.encode(
    "Class 9 student struggling with quadratic equations, "
    "cognitive_load=0.8, subject=Mathematics",
    out_type=int,
)[:512]
input_ids = torch.tensor([tokens])

# Classify
with torch.no_grad():
    logits, _ = model(input_ids)
predicted = logits.argmax(dim=-1).item()
print(f"Recommended technique: {id_to_name[predicted]}")
```
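For the pre-filtering use case, you can convert the logits to softmax probabilities and keep the top-k candidate techniques before handing off to the Teaching LM. A minimal sketch with synthetic logits (the helper name and placeholder label names are illustrative, not part of the released code):

```python
import torch

def top_k_techniques(logits: torch.Tensor, id_to_name: dict, k: int = 3):
    """Return the k most probable techniques with their softmax probabilities."""
    probs = torch.softmax(logits, dim=-1)
    top = torch.topk(probs, k, dim=-1)
    return [(id_to_name[i], p)
            for i, p in zip(top.indices[0].tolist(), top.values[0].tolist())]

# Illustration with synthetic logits over 43 classes and placeholder names
fake_logits = torch.zeros(1, 43)
fake_logits[0, 7] = 5.0
names = {i: f"technique_{i}" for i in range(43)}
print(top_k_techniques(fake_logits, names))  # technique_7 ranked first
```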
Intended Use
- Primary: Offline technique selector in the VIDYA adaptive tutoring system
- Secondary: Lightweight API endpoint for technique recommendation
- Research: Studying IKS technique-content alignment patterns
Limitations
- 512-token context window limits input to short student interactions
- Technique granularity at 43 classes — fine-grained sub-technique variants not distinguished
- Actual parameter count is ~13.1M (not 15M as the repo name suggests)
Citation
```bibtex
@misc{rsl_setu_classifier_15m,
  title       = {RSL-SETU-Classifier-15M: Lightweight IKS Technique Classifier},
  author      = {Sivasubramani, Santhosh},
  year        = {2026},
  institution = {INTRINSIC Lab, RSL Foundation, IIT Delhi},
  url         = {https://huggingface.co/RSL-INTRINSICLab-IIT/RSL-SETU-Classifier-15M}
}
```
Related Resources
- RSL-SETU-LoRA-v35 — Full 7B teaching model (best-performing LoRA adapter)
- RSL-PRAJNA-v2 — Evaluation benchmark
- RSL-BHARATI-v3 — Compatible tokenizer (same 32K SentencePiece vocab)
Patent Notice
The technique classification method implemented in this model is covered by Indian Complete Patent Application No. 1536IN241 — "System and Method for Selection of Pedagogical Techniques", filed March 2026 in the name of Indian Institute of Technology Delhi.
The cognitive state estimation method used as input to this classifier is covered by Indian Complete Patent Application No. 1536IN242 — "System and Method for Non-Invasive Cognitive State Estimation Using Behavioral Interaction Patterns in Educational Software", filed March 2026.
Inventor: Prof. Santhosh Sivasubramani. The model weights and code are released under CC BY-NC 4.0 for research and educational use. The patented methods may not be used in commercial products or services without a separate licence from IIT Delhi.
License
CC BY-NC 4.0 — Free for research and educational use. Commercial use requires a license from IIT Delhi.
Acknowledgment
Demonstrated at the Bharat Bodhan AI Conclave, New Delhi, anchored and driven by the Ministry of Education and IIT Madras.
Contact
Prof. Santhosh Sivasubramani
Director, INTRINSIC Laboratory, RSL Foundation, Centre for SeNSE, IIT Delhi
ssivasub@iitd.ac.in
https://intrinsic.iitd.ac.in