SanskritBERT (Light)

SanskritBERT is a lightweight Transformer model trained specifically for the Sanskrit language. It is based on the BERT architecture and trained using the Masked Language Modeling (MLM) objective.

Model Description

  • Shared by: Tanuj Saxena and Soumya Sharma
  • Model type: Transformers Encoder (BERT-like)
  • Language: Sanskrit
  • License: Apache 2.0
  • Finetuned from model: None (Trained from scratch)

Model Architecture

  • Layers: 6
  • Hidden Size: 256
  • Attention Heads: 4
  • Feedforward Size: 1024
  • Max Sequence Length: 512
  • Vocab Size: 64,000
  • Parameters: ~36M (35.8M reported by the uploaded safetensors weights)
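These hyperparameters can be sanity-checked with a back-of-envelope parameter count. The sketch below assumes the standard BERT layer layout (learned absolute position embeddings, two token-type embeddings, biases and LayerNorms throughout); it covers the encoder backbone only, excluding the MLM head:

```python
# Back-of-envelope parameter count from the architecture table above.
V, H, L, F, S = 64_000, 256, 6, 1024, 512  # vocab, hidden, layers, FFN, max seq

embeddings = V * H + S * H + 2 * H      # token + position + token-type tables
embeddings += 2 * H                     # embedding LayerNorm (gamma, beta)

per_layer = 4 * (H * H + H)             # Q, K, V, output projections
per_layer += 2 * H                      # attention LayerNorm
per_layer += (H * F + F) + (F * H + H)  # feed-forward up/down projections
per_layer += 2 * H                      # output LayerNorm

backbone = embeddings + L * per_layer
print(f"encoder backbone: ~{backbone / 1e6:.1f}M parameters")
# -> encoder backbone: ~21.3M parameters
```

The 64k-entry token embedding table alone accounts for ~16.4M of these parameters, so the vocabulary dominates the budget; an untied MLM decoder head would add roughly another 16.4M on top of the backbone.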

Intended Uses & Limitations

Intended Uses

  • Masked Word Prediction
  • Fine-tuning for Sanskrit NLP tasks (POS tagging, NER, text classification)
  • Research into low-resource language modeling
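As an example of the fine-tuning path, a token-classification head (e.g. for POS tagging) can be attached to the encoder. To stay download-free, the sketch below builds an untrained BERT from the card's hyperparameters; in practice you would load the published checkpoint with AutoModelForTokenClassification.from_pretrained("tanuj437/SanskritBERT", num_labels=...) instead. The tag-set size of 4 is hypothetical.

```python
import torch
from transformers import BertConfig, BertForTokenClassification

# Untrained stand-in with the card's hyperparameters (no download needed).
config = BertConfig(
    vocab_size=64_000,
    hidden_size=256,
    num_hidden_layers=6,
    num_attention_heads=4,
    intermediate_size=1024,
    max_position_embeddings=512,
    num_labels=4,  # hypothetical POS tag-set size
)
model = BertForTokenClassification(config)

batch = torch.randint(0, 64_000, (2, 16))  # dummy token ids: 2 sequences of 16
logits = model(input_ids=batch).logits
print(logits.shape)  # -> torch.Size([2, 16, 4]): one tag distribution per token
```

From here, training proceeds as usual: cross-entropy loss over per-token labels, with the pre-trained encoder weights providing the starting point.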

Limitations

  • The model is "Light", so it may not capture as much nuance as a bert-base or bert-large model.
  • Performance depends heavily on the domain of the downstream task relative to the pre-training corpus.

Training Data

Trained on a corpus of Sanskrit texts including general literature, wikis, and classical texts.

Training Procedure

  • Optimizer: AdamW
  • Precision: Mixed Precision (bf16)
  • Batch Size: 16
  • Epochs: 6
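The training script itself is not published; a hypothetical mapping of the four listed settings onto the Trainer API's TrainingArguments might look like this (output_dir and any unlisted values are illustrative, not from the card):

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the listed training settings.
args = TrainingArguments(
    output_dir="sanskritbert-mlm",    # illustrative path
    optim="adamw_torch",              # AdamW optimizer
    bf16=True,                        # mixed-precision (bf16) training
    per_device_train_batch_size=16,
    num_train_epochs=6,
)
```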

How to Get Started

You can use the model directly with the Hugging Face transformers library:

from transformers import AutoTokenizer, AutoModelForMaskedLM
import torch

tokenizer = AutoTokenizer.from_pretrained("tanuj437/SanskritBERT")
model = AutoModelForMaskedLM.from_pretrained("tanuj437/SanskritBERT")

text = f"सत्यमेव जयते {tokenizer.mask_token}"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Decode the model's top prediction for the masked position
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
print(tokenizer.decode(logits[0, mask_pos].argmax(-1)))

Citation

@misc{sanskritbert2024,
  title={SanskritBERT: A Light Transformer Model for Sanskrit},
  author={Tanuj Saxena and Soumya Sharma and Kusum Lata},
  year={2024},
  publisher={Hugging Face}
}