LSTM Language Model — Penn Treebank

A 2-layer LSTM language model trained from scratch on Penn Treebank. It is released as a demo with real pretrained weights for xaitalk's cross-framework XAI tests on RNN and sequence-modeling architectures.

This is not a state-of-the-art language model. It's a clean reference implementation small enough to train in minutes, large enough that per-token attribution maps are visually meaningful, and trained on a standard benchmark dataset so users have a clear mental model of what it does.

Files

File	Format	Size
`lstm_ptb.pt`	PyTorch state_dict	~16 MB

Architecture

Property	Value
Embed dim	256
Hidden dim	256
Layers	2
Vocab	10000 (PTB)
Output	Next-token logits
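
To make the table concrete, here is a minimal PyTorch sketch of a model with these dimensions. It is only an illustration: the class name `SketchLSTMLanguageModel`, the layer names, and the untied output projection are assumptions, so this sketch is not guaranteed to match the state_dict keys of `lstm_ptb.pt`. The actual class is `xaitalk.models.LSTMLanguageModel`.

```python
import torch
import torch.nn as nn


class SketchLSTMLanguageModel(nn.Module):
    """Hypothetical stand-in for xaitalk.models.LSTMLanguageModel (layout assumed)."""

    def __init__(self, vocab_size=10000, embed_dim=256, hidden_dim=256, num_layers=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=num_layers,
                            batch_first=True)
        self.decoder = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, state=None):
        # tokens: (B, T) integer ids -> logits: (B, T, vocab_size)
        emb = self.embedding(tokens)
        out, state = self.lstm(emb, state)
        return self.decoder(out), state


model = SketchLSTMLanguageModel()
tokens = torch.randint(0, 10000, (4, 35))  # random ids, for a shape check only
logits, (h_n, c_n) = model(tokens)
```

Each position in `logits` holds the scores for the next token, which is the quantity the attribution methods explain.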

Training: standard PTB language-model recipe with the Adam optimizer and truncated backpropagation through time (BPTT) over 35-token windows. The same weights are reused across the three frameworks (PyTorch, TensorFlow, JAX) for the RNN cross-framework XAI test in xaitalk.
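
As a rough illustration of what a BPTT length of 35 means for batching, the sketch below cuts a token stream into input/target windows of at most 35 steps, with targets shifted by one position. The helper name `bptt_windows`, the batch size, and the row layout are assumptions made for this example, not the training script used for this checkpoint.

```python
import numpy as np


def bptt_windows(token_ids, batch_size, bptt=35):
    """Yield (inputs, targets) pairs of shape (batch_size, <= bptt)."""
    ids = np.asarray(token_ids)
    n_cols = len(ids) // batch_size
    # Drop the tail that does not fill a row, then lay the stream out as parallel rows.
    data = ids[: n_cols * batch_size].reshape(batch_size, n_cols)
    for start in range(0, n_cols - 1, bptt):
        end = min(start + bptt, n_cols - 1)
        # Targets are the inputs shifted one step to the right (next-token prediction).
        yield data[:, start:end], data[:, start + 1 : end + 1]


stream = np.arange(1000)  # stand-in for a sequence of PTB token ids
batches = list(bptt_windows(stream, batch_size=4))
first_inputs, first_targets = batches[0]
```

Gradients are only propagated within each window, which keeps memory bounded on a corpus of roughly 929K training tokens.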

Cross-framework verification

These weights are validated by xaitalk's rnn benchmark (LSTM, Penn Treebank, 20 methods):

Methods	Passing at r ≥ 0.95	Min(min_r)	Verified
20	20/20	0.9965	2026-05-09
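
The table does not define r. The sketch below assumes it is the Pearson correlation between the attribution maps that two frameworks produce for the same input, and that a method passes when its lowest r is at least 0.95. The function names and the synthetic attribution arrays are hypothetical; this is not xaitalk's benchmark code.

```python
import numpy as np


def pearson_r(a, b):
    """Pearson correlation between two attribution maps, flattened."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))


def passes(reference, candidates, threshold=0.95):
    """Return (passed, min_r) over all candidate maps compared to the reference."""
    min_r = min(pearson_r(reference, c) for c in candidates)
    return min_r >= threshold, min_r


rng = np.random.default_rng(0)
attr_ref = rng.normal(size=(2, 35))                      # stand-in: framework A, shape (B, T)
attr_close = attr_ref + 1e-3 * rng.normal(size=(2, 35))  # framework B with small numerical drift
attr_other = rng.normal(size=(2, 35))                    # an unrelated map, for contrast

ok, min_r = passes(attr_ref, [attr_close])
bad, _ = passes(attr_ref, [attr_close, attr_other])
```

Under this reading, reporting the minimum rather than the mean means a single badly matching case is enough to fail a method.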

The 20 methods include the gradient family, LRP-LSTM variants (Arras et al. 2017), DeepLIFT, and the SmoothGrad family. Stochastic methods use shared NumPy noise across PyTorch, TensorFlow, and JAX for reproducibility.
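
One way to share noise across frameworks is to draw it once with a seeded NumPy generator and hand the same array to every backend. The sketch below shows that idea for a SmoothGrad-style average. The helper names, the toy gradient function, and the choice to perturb embedded inputs rather than token ids are assumptions for illustration, not xaitalk internals.

```python
import numpy as np


def make_shared_noise(shape, n_samples, sigma, seed=0):
    """Draw all noise samples up front so every framework sees identical values."""
    rng = np.random.default_rng(seed)
    return (sigma * rng.standard_normal((n_samples, *shape))).astype(np.float32)


def smoothgrad(grad_fn, x, noise):
    """Average the gradient over the pre-drawn noisy copies of x."""
    return np.mean([grad_fn(x + eps) for eps in noise], axis=0)


def toy_grad(x):
    # Gradient of f(x) = sum(x**3); stands in for a framework's backward pass.
    return 3.0 * x ** 2


x = np.ones((35, 256), dtype=np.float32)  # stand-in for embedded tokens (T, embed_dim)
noise = make_shared_noise(x.shape, n_samples=8, sigma=0.1, seed=42)
run_a = smoothgrad(toy_grad, x, noise)  # e.g. one backend
run_b = smoothgrad(toy_grad, x, noise)  # e.g. another backend, same noise array
```

Because the randomness lives in one NumPy array instead of three framework-specific generators, any remaining difference between backends comes from the model and the attribution code.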

Usage

from xaitalk.hub import ensure_model
import torch

ckpt_path = ensure_model('rnn/lstm-ptb')

# Architecture class lives in xaitalk
from xaitalk.models import LSTMLanguageModel
model = LSTMLanguageModel(vocab_size=10000, embed_dim=256,
                           hidden_dim=256, num_layers=2)
model.load_state_dict(torch.load(ckpt_path, weights_only=True))
model.eval()

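# Run XAI
import xaitalk
# x: tokenized PTB sequence of integer token ids, shape (B, T).
# The random ids below are a placeholder for illustration; use real PTB tokens in practice.
x = torch.randint(0, 10000, (1, 35))
expl = xaitalk.explain(model, x, method='lrp_epsilon')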

Training data

Penn Treebank (PTB) is a standard language-modeling benchmark with character-level and word-level variants (Mikolov 2010). This model uses the word-level variant with standard preprocessing: ~929K tokens train, ~73K valid, ~82K test, vocab 10000.

License

Apache 2.0.

Citation

PTB benchmark setup (Zaremba 2014 LSTM recipe):

@misc{zaremba2014lstm,
  author = {Zaremba, Wojciech and Sutskever, Ilya and Vinyals, Oriol},
  title  = {Recurrent Neural Network Regularization},
  year   = {2014},
  eprint = {1409.2329}
}

LRP for LSTM (Arras et al. 2017) — the canonical XAI method for this architecture, validated cross-framework on this checkpoint:

@inproceedings{arras2017lstm,
  author    = {Arras, Leila and Montavon, Grégoire and Müller, Klaus-Robert
               and Samek, Wojciech},
  title     = {Explaining Recurrent Neural Network Predictions in Sentiment
               Analysis},
  booktitle = {EMNLP Workshop on Computational Approaches to Subjectivity,
               Sentiment and Social Media Analysis (WASSA)},
  year      = {2017}
}

xaitalk infrastructure:

@software{paul2026xaitalk,
  author = {Paul, Alexander},
  title  = {xaitalk: Cross-Framework Explainable AI Library},
  year   = {2026},
  url    = {https://xaitalk.com}
}
