VyasVani: Sanskrit LSTM Language Model

Hosted on Hugging Face Hub: https://huggingface.co/Henil1/VyasVani

Model Description

VyasVani is a character-level Sanskrit text-generation model built on a 4-layer LSTM. It was trained on a corpus of classical Sanskrit ślokas (verses) to learn patterns of Sanskrit meter and grammar, and is suited to generating short passages of coherent Sanskrit text and poetic verse. A minimal sketch of the architecture follows the specification list below.

  • Architecture: 4-layer LSTM
  • Embedding Size: 128
  • Hidden Size: 256
  • Vocabulary Size: Auto-detected from the training corpus
  • Total Parameters: 2,147,907 (all trainable)
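
For reference, a model with these dimensions looks roughly like the following in PyTorch. This is only a sketch: the authoritative definition is the SanskritLSTM class in example_inference.py, and the layer layout here is an assumption.

import torch.nn as nn

class SanskritLSTM(nn.Module):
    # Sketch of the architecture implied by the numbers above; the real
    # definition ships in example_inference.py and may differ in detail.
    def __init__(self, vocab_size, embed_size=128, hidden_size=256, num_layers=4):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, x, hidden=None):
        emb = self.embedding(x)               # (batch, seq) -> (batch, seq, 128)
        out, hidden = self.lstm(emb, hidden)  # (batch, seq, 256)
        return self.fc(out), hidden           # per-character logits over the vocab

As a consistency check, this layout has 385 × vocab_size + 1,974,272 parameters, so the stated total of 2,147,907 would correspond to a vocabulary of 451 characters.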

Usage

import json
import torch
from huggingface_hub import hf_hub_download
# SanskritLSTM and generate_text are defined in example_inference.py
# (see the note after this snippet for fetching it from the repo)
from example_inference import SanskritLSTM, generate_text

# Download files from the Hub
model_path = hf_hub_download(repo_id="Henil1/VyasVani", filename="pytorch_model.bin")
config_path = hf_hub_download(repo_id="Henil1/VyasVani", filename="config.json")
stoi_path   = hf_hub_download(repo_id="Henil1/VyasVani", filename="stoi.json")
itos_path   = hf_hub_download(repo_id="Henil1/VyasVani", filename="itos.json")

# Load configuration and character vocabularies
with open(config_path) as f:
    config = json.load(f)
with open(stoi_path) as f:
    stoi = json.load(f)   # character -> id
with open(itos_path) as f:
    itos = json.load(f)   # id (string key after the JSON round-trip) -> character

# Rebuild the model
model = SanskritLSTM(
    vocab_size=config["vocab_size"],
    embed_size=config.get("embed_size", 128),
    hidden_size=config.get("hidden_size", 256),
    num_layers=config.get("num_layers", 4),
)
model.load_state_dict(torch.load(model_path, map_location="cpu"))
model.eval()

# Generate Sanskrit text
seed = "श्री"
generated = generate_text(model, seed, length=200, temperature=0.8, top_k=5)
print(generated)
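
The import at the top assumes example_inference.py sits next to your script. If it does not, the file can be fetched from the same repo (where it ships, per the next section) and put on the import path first:

import os
import sys
from huggingface_hub import hf_hub_download

script_path = hf_hub_download(repo_id="Henil1/VyasVani", filename="example_inference.py")
sys.path.insert(0, os.path.dirname(script_path))   # make the cached file importable
from example_inference import SanskritLSTM, generate_text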

Example Inference Script

See example_inference.py in the repository for a ready-to-run script defining the SanskritLSTM class and the generate_text function.
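
The exact sampling logic lives in that script. As an illustration of how temperature plus top-k character sampling typically works, here is a hypothetical re-implementation; unlike the repo's generate_text, it takes the stoi/itos maps explicitly, and it pairs with the architecture sketch above:

import torch

@torch.no_grad()
def sample_text(model, seed, stoi, itos, length=200, temperature=0.8, top_k=5):
    # Illustration only; the repo's generate_text is the reference version.
    ids = torch.tensor([[stoi[ch] for ch in seed]])
    logits, hidden = model(ids, None)          # warm up the state on the seed
    out = list(seed)
    for _ in range(length):
        step = logits[0, -1] / temperature     # last position, softened/sharpened
        topv, topi = torch.topk(step, min(top_k, step.numel()))
        probs = torch.softmax(topv, dim=-1)    # renormalize over the top-k chars
        next_id = topi[torch.multinomial(probs, 1).item()].item()
        out.append(itos[str(next_id)])         # itos keys are strings after JSON
        logits, hidden = model(torch.tensor([[next_id]]), hidden)
    return "".join(out)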

Evaluation Metrics

  • Perplexity: 5.116
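
Perplexity is the exponential of the average cross-entropy loss; assuming the figure is per character (the natural unit for a character-level model), it works out to:

import math

ppl = 5.116
nats_per_char = math.log(ppl)                 # ≈ 1.632 nats of cross-entropy
bits_per_char = nats_per_char / math.log(2)   # ≈ 2.355 bits per character

In other words, at each step the model is on average about as uncertain as a uniform choice among roughly 5 characters.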

Training Details

  • Dataset: Classical Sanskrit ślokas from various texts
  • Epochs: 1
  • Optimizer: Adam (lr=1e-3)
  • Batch Size: 64
  • Sequence Length: 128
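
The training code itself is not published; the details above amount to a setup along the following lines. This is only a sketch: loader is a hypothetical DataLoader yielding batches of 64 input/target sequences of 128 character ids, with targets shifted by one character.

import torch
import torch.nn as nn

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

for x, y in loader:  # x, y: (64, 128) tensors; y is x shifted by one character
    logits, _ = model(x)
    loss = criterion(logits.reshape(-1, logits.size(-1)), y.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()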

Citation

If you use VyasVani in your work, please cite:

@misc{VyasVani2025,
  title={VyasVani: A 4-Layer LSTM for Sanskrit Text Generation},
  author={Henilsinh Raj},
  year={2025},
  howpublished={\url{https://huggingface.co/Henil1/VyasVani}}
}

License

This model is released under the MIT License. See LICENSE for details.
