Karaka Attention

Semantically Typed Attention via Sanskrit Grammatical Role Relations

Rahul Baxi | VyasaLabs

Overview

Karaka Attention replaces standard multi-head attention with six semantically typed heads, one for each role in Pāṇini's kāraka system (Aṣṭādhyāyī, c. 400 BCE):

Head	Sanskrit	Role
Kartā	कर्ता	Agent
Karma	कर्म	Patient
Karaṇa	करण	Instrument
Sampradāna	सम्प्रदान	Recipient
Apādāna	अपादान	Source
Adhikaraṇa	अधिकरण	Locus
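
As a minimal sketch of the head typing in the table above, the six roles can be written as an enumeration with one member per head. The names `KarakaRole`, `HEAD_ROLES`, and `role_for_head` are illustrative and not taken from karaka_attention.py, and the head ordering (table order) is an assumption.

```python
from enum import Enum


class KarakaRole(Enum):
    """The six kāraka roles from the table, one per attention head."""
    KARTA = "agent"
    KARMA = "patient"
    KARANA = "instrument"
    SAMPRADANA = "recipient"
    APADANA = "source"
    ADHIKARANA = "locus"


# Assumed convention: head i is typed by the i-th role in table order.
HEAD_ROLES = list(KarakaRole)


def role_for_head(head_index: int) -> KarakaRole:
    """Return the role assigned to a head under the assumed ordering."""
    return HEAD_ROLES[head_index]
```

Under this convention, `role_for_head(0)` is `KarakaRole.KARTA` and `role_for_head(5)` is `KarakaRole.ADHIKARANA`.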

Each head is conditioned on the resonant component (v_r) from the Dhvani encoder, so that role assignments are grounded in compression-invariant semantic representations.
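
One way to picture this conditioning is as a role-specific bias added to the attention scores of a head. The sketch below is a toy, pure-Python illustration under that assumption only. The function `conditioned_head_weights`, the parameter `role_vector`, and the additive-bias form are all hypothetical; the actual mechanism in karaka_attention.py may differ.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def conditioned_head_weights(queries, keys, v_r, role_vector):
    """Attention weights for one typed head.

    ASSUMPTION: conditioning on v_r is modelled as an additive per-key
    bias dot(v_r[j], role_vector) on the scaled dot-product scores.
    """
    scale = math.sqrt(len(keys[0]))
    weights = []
    for q in queries:
        scores = [
            dot(q, k) / scale + dot(r, role_vector)
            for k, r in zip(keys, v_r)
        ]
        weights.append(softmax(scores))
    return weights


# Toy inputs: two tokens, 2-d queries/keys, 1-d resonant component.
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
v_r = [[0.0], [1.0]]

plain = conditioned_head_weights(queries, keys, v_r, [0.0])
biased = conditioned_head_weights(queries, keys, v_r, [2.0])
```

With a zero `role_vector` the head reduces to plain scaled dot-product attention. With a non-zero one, tokens whose v_r aligns with the role receive more weight regardless of position.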

Key Results

Metric	Karaka Attention	Standard MHA
Attention entropy	2.72 (informative)	0.18 (collapsed)
Paraphrase JSD	0.090 (content-following)	0.002 (position-locked)
Role diversification	cosine similarity 0.66 (down from 0.87 at initialization)	—
CDCT mean	0.179	—
Forward pass	57.3 ms	—

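Standard MHA heads collapse to near-zero entropy (0.18), and their attention barely changes under paraphrase (JSD 0.002), which indicates that they are locked to position. Karaka heads stay in the informative range (entropy 2.72) and shift their attention with content when the surface form changes (JSD 0.090), which is consistent with tracking semantic roles across surface variations.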
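
The metrics above can be sketched with standard definitions. The block below is an illustration, not the evaluation code behind eval_results.json. It assumes natural-log entropy and Jensen-Shannon divergence (the source does not state the log base), that the two paraphrase distributions have already been aligned to the same positions, and that role diversification is the mean pairwise cosine similarity between per-head vectors. All function names are hypothetical.

```python
import math
from itertools import combinations


def entropy(p):
    """Shannon entropy (nats) of one attention distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def kl(p, q):
    """KL divergence KL(p || q); terms with p_i = 0 contribute zero."""
    return sum(x * math.log(x / y) for x, y in zip(p, q) if x > 0)


def jsd(p, q):
    """Jensen-Shannon divergence between two aligned distributions."""
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den


def mean_pairwise_cosine(vectors):
    """Average cosine similarity over all unordered pairs of vectors."""
    pairs = list(combinations(vectors, 2))
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)
```

A head that puts all its weight on one token has entropy 0, while a uniform head over n tokens has entropy ln n. Identical distributions give a JSD of 0, and a lower mean pairwise cosine similarity means the heads have moved further apart.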

Architecture

Base: Qwen3-1.7B + LoRA (pretrained with Dhvani compression-invariance objective)
Karaka layers: 2 × KarakaBlock (KarakaAttention + FFN)
Bias init: 27,412 sentences from UD Sanskrit-Vedic + UFAL treebanks
Training: Sanskrit Wikipedia + treebank text, 10K steps, TPU v6e-1
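
For reference, the settings listed above can be gathered into a single configuration object. This is only a sketch: the class name `KarakaConfig` and its field names are hypothetical and are not taken from karaka_model.py or train_karaka.py. The values are the ones stated in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KarakaConfig:
    """Architecture and training settings as listed in this section."""
    base_model: str = "Qwen3-1.7B"      # base encoder, adapted with LoRA
    num_karaka_blocks: int = 2          # KarakaAttention + FFN per block
    num_typed_heads: int = 6            # one head per kāraka role
    bias_init_sentences: int = 27_412   # treebank sentences for bias init
    train_steps: int = 10_000           # 10K training steps


config = KarakaConfig()
```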

Files

karaka_attention.py — Core KarakaAttention module (6 typed heads + role consistency loss)
karaka_model.py — Full model (encoder + Dhvani projection + Karaka layers)
karaka_bias_init.py — Sanskrit treebank bias initialization
train_karaka.py — Training script (TPU/XLA)
results.json — Training results
eval_results.json — Paraphrase stability + head entropy + speed
baseline_jsd.json — Standard MHA baseline JSD
baseline_entropy.json — Standard MHA baseline entropy

Citation

@article{baxi2026karaka,
  title={Karaka Attention: Semantically Typed Attention via Sanskrit Grammatical Role Relations},
  author={Baxi, Rahul},
  year={2026},
  note={VyasaLabs Technical Report}
}

Dhvani: Compression-Invariant Semantic Representations
AGT: Action-Gating Test (Springer AI & Ethics)


Paper for rb512/karaka-attention

Separating Constraint Compliance from Semantic Accuracy: A Novel Benchmark for Evaluating Instruction-Following Under Compression

Paper • 2512.17920 • Published Dec 2, 2025
