## Use with the Transformers library
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("fill-mask", model="nihilisticneuralnet/HinDiffusionLM")
```

```python
# Load the model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("nihilisticneuralnet/HinDiffusionLM")
model = AutoModelForMaskedLM.from_pretrained("nihilisticneuralnet/HinDiffusionLM")
```
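Once created, the fill-mask pipeline can be queried with a prompt containing the model's mask token. The sketch below is illustrative: the Hindi sentence, the `build_prompt` helper, and the `top_predictions` wrapper are assumptions, as is the `[MASK]` token (the convention for BERT/MuRIL-style tokenizers).

```python
# Hypothetical usage sketch; the "[MASK]" token is an assumption based on
# the BERT-style tokenizers used by MuRIL-family models.
MASK = "[MASK]"

def build_prompt(template: str, mask_token: str = MASK) -> str:
    """Insert the mask token into a {mask} placeholder."""
    return template.format(mask=mask_token)

# "The capital of India is [MASK]."
prompt = build_prompt("भारत की राजधानी {mask} है।")

def top_predictions(prompt: str, k: int = 5):
    """Return the pipeline's top-k fills (downloads the model on first call)."""
    from transformers import pipeline
    pipe = pipeline("fill-mask", model="nihilisticneuralnet/HinDiffusionLM")
    return pipe(prompt, top_k=k)
```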
# HinDiffusionLM: Diffusion Language Model for Hindi Language

HinDiffusionLM turns a BERT-based model into an instruction-tuned, LLaDA-style diffusion LLM using Hindi instruction data. It combines a masked-language-modeling objective with diffusion-style generation: the model learns to iteratively denoise masked tokens into a coherent Hindi response. Training was done on Kaggle with 2× T4 GPUs.
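The iterative denoising described above can be sketched as follows. This is a minimal, generic sketch of confidence-based unmasking in the style of LLaDA decoding, not the model's actual implementation; the `denoise` function and its `predict` interface are assumptions for illustration.

```python
# Sketch of LLaDA-style iterative denoising (assumed decoding scheme):
# start from a fully masked sequence, then repeatedly fill in the
# positions the model is most confident about until none remain.
MASK = "[MASK]"

def denoise(seq_len, predict, steps=4):
    """predict(tokens) -> one (token, confidence) pair per position."""
    tokens = [MASK] * seq_len
    per_step = max(1, seq_len // steps)  # how many positions to unmask per step
    while MASK in tokens:
        preds = predict(tokens)
        # Rank the still-masked positions by model confidence, descending.
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        masked.sort(key=lambda i: preds[i][1], reverse=True)
        # Commit the most confident predictions; re-predict the rest next step.
        for i in masked[:per_step]:
            tokens[i] = preds[i][0]
    return tokens
```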

## Experiments

### Models Evaluated

| Model | Performance |
|---|---|
| google/muril-base-cased | Best |
| google/muril-large-cased | Poor |
| ai4bharat/indic-bert | Moderate |

### Datasets Tested

| Dataset | Subset | Status | Notes |
|---|---|---|---|
| ai4bharat/indic-instruct-data-v0.1 | anudesh | Used | Primary dataset for demonstration |
| ai4bharat/indic-instruct-data-v0.1 | lm_sys | Skipped | Too time-intensive given training and GPU constraints |
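For reference, the anudesh subset above could be loaded with the 🤗 `datasets` library as sketched below; the `load_anudesh` helper and the `"train"` split name are assumptions, not taken from the model card.

```python
# Hedged sketch: loading the instruction-data subset used for training.
DATASET_ID = "ai4bharat/indic-instruct-data-v0.1"
SUBSET = "anudesh"

def load_anudesh():
    """Fetch the anudesh subset (downloads on first call)."""
    from datasets import load_dataset
    return load_dataset(DATASET_ID, SUBSET, split="train")
```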
Model size: 0.2B parameters (F32, Safetensors)