This repository is gated: you must accept the access conditions before downloading its files. By requesting access you agree to the CC BY-NC 4.0 license terms; commercial use requires a separate licence from IIT Delhi.

RSL-SETU-Classifier-15M — IKS Technique Classifier

A lightweight ~13M-parameter GPT-2-style classifier that identifies which of 43 Indian Knowledge System (IKS) pedagogical techniques to apply, given a student interaction context.

Designed for real-time technique selection and offline deployment on CPU — no GPU required.

Model Summary

| Property | Value |
|---|---|
| Architecture | GPT-2 encoder (bidirectional attention) + mean-pooled linear classifier head |
| Parameters | ~13.1M |
| Layers | 6 transformer blocks |
| Embedding dim | 256 |
| Attention heads | 8 |
| Context window | 512 tokens |
| Vocab | 32,000 (SentencePiece BPE) |
| Output classes | 43 |
| Model size | ~60 MB (best.pt) |
| SHA-256 | f66b19064f4704d02a57d6fda0a2f4f3010571bc0d89844358d34760e76f5471 |
| Inference device | CPU (no GPU needed) |
| License | CC BY-NC 4.0 |
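Before loading the checkpoint, it is worth verifying it against the published SHA-256. A minimal sketch — the helper name `file_sha256` is ours, not part of the repository:

```python
import hashlib

def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so a large checkpoint never sits fully in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

EXPECTED = "f66b19064f4704d02a57d6fda0a2f4f3010571bc0d89844358d34760e76f5471"
# After downloading best.pt, uncomment to verify:
# assert file_sha256("best.pt") == EXPECTED, "checkpoint is corrupted or incomplete"
```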

What This Model Does

Given a tokenized input of {student_message + cognitive_state + subject}, the classifier outputs a probability distribution over 43 IKS pedagogical techniques. Use it for:

  • Real-time technique selection — sub-10ms inference on CPU
  • Offline fallback — when a full 7B LoRA model is unavailable
  • Pre-filtering — narrow technique candidates before passing to the Teaching LM
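The pre-filtering use case amounts to a softmax-plus-top-k pass over the classifier logits. A minimal sketch with dummy logits and placeholder technique names (the real names come from technique_labels.json):

```python
import torch

def prefilter_techniques(logits: torch.Tensor, id_to_name: dict, k: int = 5):
    """Return the k most probable technique names with their probabilities,
    sorted from most to least likely."""
    probs = torch.softmax(logits, dim=-1)
    top = torch.topk(probs, k)
    return [(id_to_name[i.item()], p.item()) for p, i in zip(top.values, top.indices)]

# Demo with a dummy 43-way logit vector; real logits come from the classifier.
dummy_logits = torch.zeros(43)
dummy_logits[7] = 3.0  # pretend class 7 dominates
names = {i: f"technique_{i}" for i in range(43)}
print(prefilter_techniques(dummy_logits, names, k=3))
```

The shortlist can then be passed to the Teaching LM as candidate techniques instead of the full 43-way space.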

The 43 IKS Techniques

Organized into 5 categories:

| Category | Techniques | Count |
|---|---|---|
| Memory (Patha) | Samhita, Pada, Krama, Jataa, Mala, Shikha, Rekha, Dhwaja, Ghana, Katapayadi | 10 |
| Vedic Math (Sutra) | Ekadhikena, Nikhilam, Urdhva Tiryak, Paravartya, Anurupyena, Shunyam, Yavadunam, Sankalana, Puranapuranabhyam, Chalana, Ekanyunena, Sesanyankena, Sopantyadvayamantyam, Gunitasamuccayah, Gunakasamuccayah, Dwandwa Yoga | 16 |
| Attention | Dhyana-based Focus Protocol, Pratyahara Learning Pause | 2 |
| Reasoning (Nyaya) | Nyaya Pramana, Pancha Avayava, Tarka Vaada, Mimamsa Vakya Vichara, Jalpa and Vitanda, Anvaya-Vyatireka, Upamana, Pratyaksha | 8 |
| Pedagogy | Guru-Shishya Dialogue, Shruti-Smriti Integration, Spaced Repetition with Vedic Intervals, Panini Pratyahara, Utsarga-Apavada, Adhyaropa-Apavada, Viveka | 7 |

Architecture

Input Text → SentencePiece Tokenizer (32K vocab)
          → Token Embeddings (256-dim) + Position Embeddings
          → 6× Transformer Blocks (8-head bidirectional self-attention, MLP, LayerNorm)
          → Global Mean Pooling (over sequence dimension)
          → Linear Classifier Head (256 → 43)
          → Softmax → Technique Probabilities
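For intuition, the stack above can be approximated in a few lines of PyTorch. This is an illustrative stand-in, not the repository's model.py (for instance, nn.TransformerEncoderLayer defaults to ReLU where GPT-2 uses GELU), but its parameter count lands near the quoted ~13M:

```python
import torch
import torch.nn as nn

class MiniClassifierSketch(nn.Module):
    """Illustrative stand-in for the described stack: token + position embeddings,
    6 bidirectional transformer blocks, global mean pooling, 43-way linear head."""
    def __init__(self, vocab=32000, block=512, d=256, heads=8, layers=6, classes=43):
        super().__init__()
        self.tok = nn.Embedding(vocab, d)
        self.pos = nn.Embedding(block, d)
        layer = nn.TransformerEncoderLayer(
            d_model=d, nhead=heads, dim_feedforward=4 * d,
            batch_first=True, norm_first=True,
        )
        self.blocks = nn.TransformerEncoder(layer, num_layers=layers)
        self.head = nn.Linear(d, classes)

    def forward(self, ids):
        pos = torch.arange(ids.size(1), device=ids.device)
        x = self.tok(ids) + self.pos(pos)
        x = self.blocks(x)               # no causal mask: full bidirectional attention
        return self.head(x.mean(dim=1))  # mean pool over sequence, then 256 -> 43

logits = MiniClassifierSketch()(torch.randint(0, 32000, (1, 16)))
print(logits.shape)  # torch.Size([1, 43])
```

A rough count — 8.2M token embeddings, 0.13M position embeddings, ~0.79M per block × 6, plus the head — lands around 13.1M parameters, consistent with the summary table.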

Accuracy

| Metric | Value |
|---|---|
| Eval accuracy | 99.6% (v5 bidirectional, early-stopped at step ~2,300/5,000) |
| Classes | 43 (all trained) |
| Attention | Bidirectional (full sequence context) |
| Training data | 5,705 train / 2,096 eval (balanced across 43 techniques) |

Training

| Parameter | Value |
|---|---|
| Version | v5 (bidirectional) |
| Training data | Curated IKS classification data (43 techniques, balanced) |
| Train / Eval split | 5,705 / 2,096 (balanced across 43 techniques) |
| Max iterations | 5,000 (early-stopped at ~2,300; patience=10) |
| Best checkpoint | ~Step 1,900 (99.6% eval accuracy) |
| Batch size | 32 |
| Optimizer | AdamW (β1=0.9, β2=0.95, weight decay 0.1) |
| Learning rate | 3e-4 → min_lr (cosine decay) |
| Warmup | 300 iterations |
| Dropout | 0.3 (attention/residual), 0.4 (classifier head) |
| Label smoothing | 0.1 |
| Hardware | NVIDIA T4 GPU (Vertex AI) |
| Tokenizer | SentencePiece 32K |
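The schedule rows above (300 warmup iterations, cosine decay from 3e-4 over 5,000 max iterations) can be sketched as a single function. Note the card does not state the final min_lr; the 3e-5 below is an assumed placeholder:

```python
import math

def lr_at(step, max_lr=3e-4, min_lr=3e-5, warmup=300, max_steps=5000):
    """Linear warmup to max_lr, then cosine decay down to min_lr.
    min_lr=3e-5 is an assumption; the card only says '3e-4 -> min_lr'."""
    if step < warmup:
        return max_lr * (step + 1) / warmup          # linear warmup
    t = (step - warmup) / (max_steps - warmup)       # 0 -> 1 over the decay phase
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * t))

print(lr_at(0), lr_at(300), lr_at(4999))
```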

How to Use

import torch
import json
from model import IKSTechniqueClassifier
import sentencepiece as spm

# Load labels
with open("technique_labels.json") as f:
    labels = json.load(f)
id_to_name = {int(k): v["name"] for k, v in labels["labels"].items()}

# Load model
config = {
    "model": {
        "vocab_size": 32000, "block_size": 512, "n_layer": 6,
        "n_embd": 256, "n_head": 8, "dropout": 0.0,
        "bias": False, "num_classes": 43, "bidirectional": True
    }
}
model = IKSTechniqueClassifier(config)
ckpt = torch.load("best.pt", map_location="cpu", weights_only=True)
model.load_state_dict(ckpt["model"])
model.eval()

# Tokenize
sp = spm.SentencePieceProcessor(model_file="iks_sp_tokenizer.model")
tokens = sp.encode("Class 9 student struggling with quadratic equations, cognitive_load=0.8, subject=Mathematics", out_type=int)[:512]
input_ids = torch.tensor([tokens])

# Classify
with torch.no_grad():
    logits, _ = model(input_ids)
    predicted = logits.argmax(dim=-1).item()

print(f"Recommended technique: {id_to_name[predicted]}")
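The snippet above classifies one example at a time. For batched inference, sequences must be truncated to the 512-token window and padded to a common length. A minimal sketch, assuming pad id 0 (check the actual tokenizer config):

```python
import torch

PAD_ID = 0  # assumed pad token id; verify against the actual SentencePiece model

def batch_inputs(token_lists, block_size=512, pad_id=PAD_ID):
    """Truncate each sequence to the context window, then right-pad to the
    longest sequence in the batch so the result is one rectangular tensor."""
    clipped = [t[:block_size] for t in token_lists]
    max_len = max(len(t) for t in clipped)
    return torch.tensor([t + [pad_id] * (max_len - len(t)) for t in clipped])

batch = batch_inputs([[5, 9, 12], [7, 8]])
print(batch.shape)  # torch.Size([2, 3])
```

Because the head mean-pools over the full sequence, padded positions contribute to the pooled vector; for heavily padded batches, masking pads out of the mean (or classifying one example at a time, as above) is the safer choice.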

Intended Use

  • Primary: Offline technique selector in the VIDYA adaptive tutoring system
  • Secondary: Lightweight API endpoint for technique recommendation
  • Research: Studying IKS technique-content alignment patterns

Limitations

  • 512-token context window limits input to short student interactions
  • Technique granularity at 43 classes — fine-grained sub-technique variants not distinguished
  • Actual parameter count is ~13.1M (not 15M as the repo name suggests)

Citation

@misc{rsl_setu_classifier_15m,
  title={RSL-SETU-Classifier-15M: Lightweight IKS Technique Classifier},
  author={Sivasubramani, Santhosh},
  year={2026},
  institution={INTRINSIC Lab, RSL Foundation, IIT Delhi},
  url={https://huggingface.co/RSL-INTRINSICLab-IIT/RSL-SETU-Classifier-15M}
}

Related Resources

  • RSL-SETU-LoRA-v35 — Full 7B teaching model (best-performing LoRA adapter)
  • RSL-PRAJNA-v2 — Evaluation benchmark
  • RSL-BHARATI-v3 — Compatible tokenizer (same 32K SentencePiece vocab)

Patent Notice

The technique classification method implemented in this model is covered by Indian Complete Patent Application No. 1536IN241, "System and Method for Selection of Pedagogical Techniques", filed March 2026 in the name of Indian Institute of Technology Delhi.

The cognitive state estimation method used as input to this classifier is covered by Indian Complete Patent Application No. 1536IN242, "System and Method for Non-Invasive Cognitive State Estimation Using Behavioral Interaction Patterns in Educational Software", filed March 2026.

Inventor: Prof. Santhosh Sivasubramani. The model weights and code are released under CC BY-NC 4.0 for research and educational use. The patented methods may not be used in commercial products or services without a separate licence from IIT Delhi.

License

CC BY-NC 4.0 — Free for research and educational use. Commercial use requires a license from IIT Delhi.

Acknowledgment

Demonstrated at the Bharat Bodhan AI Conclave, New Delhi, anchored and driven by the Ministry of Education and IIT Madras.

Contact

Prof. Santhosh Sivasubramani
Director, INTRINSIC Laboratory
RSL Foundation, Centre for SeNSE, IIT Delhi
ssivasub@iitd.ac.in
https://intrinsic.iitd.ac.in
