LSTM Language Model — Penn Treebank
A 2-layer LSTM language model trained from scratch on Penn Treebank. Released as the real-pretrained-weights demo for xaitalk's cross-framework XAI on RNN / sequence-modeling architectures.
This is not a state-of-the-art language model. It's a clean reference implementation small enough to train in minutes, large enough that per-token attribution maps are visually meaningful, and trained on a standard benchmark dataset so users have a clear mental model of what it does.
Files
| File | Format | Size |
|---|---|---|
| lstm_ptb.pt | PyTorch state_dict | ~16 MB |
Architecture
| Property | Value |
|---|---|
| Embed dim | 256 |
| Hidden dim | 256 |
| Layers | 2 |
| Vocab | 10000 (PTB) |
| Output | Next-token logits |
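The table above can be read as the following PyTorch module. This is a minimal sketch, not xaitalk's `LSTMLanguageModel`: the class name `SketchLSTMLM`, the layer names, the untied decoder, and the `(logits, state)` return value are all assumptions, so its `state_dict` keys need not match `lstm_ptb.pt`.

```python
import torch
import torch.nn as nn


class SketchLSTMLM(nn.Module):
    """Hypothetical stand-in for the architecture table; not the xaitalk class."""

    def __init__(self, vocab_size=10000, embed_dim=256, hidden_dim=256, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=num_layers, batch_first=True)
        # Assumption: separate (untied) output projection.
        self.decoder = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, state=None):
        # tokens: (B, T) integer ids -> logits: (B, T, vocab_size)
        hidden, state = self.lstm(self.embed(tokens), state)
        return self.decoder(hidden), state


model = SketchLSTMLM()
tokens = torch.randint(0, 10000, (4, 35))  # random ids, shape (B=4, T=35)
logits, _ = model(tokens)
print(logits.shape)
```

Each position in the output holds one score per vocabulary entry for the token that follows it, which is what "next-token logits" means here.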
Training: standard PTB language-model recipe with the Adam optimizer and a BPTT (backpropagation through time) length of 35. The same weights are reused across the three frameworks (PyTorch, TensorFlow, JAX) for the RNN cross-framework XAI test in xaitalk.
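To make "BPTT length 35" concrete, here is a sketch of how a token stream is commonly folded into parallel rows and cut into 35-step windows, with targets shifted by one token. The helper names `batchify` and `bptt_chunks`, the batch size of 4, and the toy stream are illustrative assumptions; the actual training script may differ.

```python
import numpy as np


def batchify(token_ids, batch_size):
    """Trim the 1-D token stream and fold it into (batch_size, steps)."""
    steps = len(token_ids) // batch_size
    return np.asarray(token_ids[: steps * batch_size]).reshape(batch_size, steps)


def bptt_chunks(data, bptt=35):
    """Yield (inputs, targets) windows; targets are inputs shifted by one token."""
    last = data.shape[1] - 1  # the final column can only serve as a target
    for start in range(0, last, bptt):
        length = min(bptt, last - start)
        yield data[:, start:start + length], data[:, start + 1:start + 1 + length]


stream = np.arange(1000)               # toy ids standing in for PTB tokens
data = batchify(stream, batch_size=4)  # shape (4, 250)
chunks = list(bptt_chunks(data, bptt=35))
print(len(chunks), chunks[0][0].shape, chunks[-1][0].shape)
```

Gradients are propagated only within each window, which bounds the cost of a training step regardless of corpus length.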
Cross-framework verification
These weights are validated by xaitalk's rnn benchmark (LSTM, Penn Treebank, 20 methods):
| Methods | Passing at r ≥ 0.95 | Min(min_r) | Verified |
|---|---|---|---|
| 20 | 20/20 | 0.9965 | 2026-05-09 |
The 20 methods include the gradient family, LRP-LSTM (Arras et al. 2017) variants, DeepLIFT, and the SmoothGrad family. Stochastic methods use shared NumPy noise across PyTorch, TensorFlow, and JAX for reproducibility.
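The pass criterion can be illustrated with a small NumPy sketch. It assumes that r is the Pearson correlation between flattened attribution maps and that the reported minimum is taken over framework pairs; neither detail is confirmed here. The function names, the noise shape, and the toy maps are hypothetical and are not benchmark outputs.

```python
import numpy as np


def pearson_r(a, b):
    """Pearson correlation between two flattened attribution maps."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def min_pairwise_r(maps):
    """maps: dict of name -> attribution array; returns the lowest pairwise r."""
    names = sorted(maps)
    return min(pearson_r(maps[p], maps[q])
               for i, p in enumerate(names) for q in names[i + 1:])


# Shared noise: draw once with NumPy, then pass the same array to every framework
# so that stochastic methods perturb identical inputs.
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 0.1, size=(8, 35, 256))  # hypothetical (samples, T, embed_dim)

# Toy maps standing in for per-framework attributions (not real results).
base = rng.normal(size=35)
maps = {
    "pytorch": base,
    "tensorflow": base + 1e-6 * rng.normal(size=35),
    "jax": base + 1e-6 * rng.normal(size=35),
}
worst = min_pairwise_r(maps)
passed = worst >= 0.95
print(passed)
```

Drawing the noise once outside any framework removes differences between random number generators as a source of disagreement, so any remaining gap reflects the attribution implementations themselves.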
Usage
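import torch
import xaitalk
from xaitalk.hub import ensure_model
# Architecture class lives in xaitalk
from xaitalk.models import LSTMLanguageModel
# Resolve the checkpoint path
ckpt_path = ensure_model('rnn/lstm-ptb')
# Rebuild the architecture with the hyperparameters listed above, then load the weights
model = LSTMLanguageModel(vocab_size=10000, embed_dim=256, hidden_dim=256, num_layers=2)
model.load_state_dict(torch.load(ckpt_path, weights_only=True))
model.eval()
# Run XAI
# x: tokenized PTB sequence of shape (B, T).
# Random ids are a placeholder only; substitute real PTB token ids.
x = torch.randint(0, 10000, (1, 35))
expl = xaitalk.explain(model, x, method='lrp_epsilon')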
Training data
Penn Treebank (PTB): word-level language modeling benchmark (Mikolov 2010). Standard preprocessing: ~929K training tokens, ~73K validation tokens, ~82K test tokens, and a vocabulary of 10000 words.
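As a rough illustration of the standard preprocessing, the sketch below maps whitespace-separated words to integer ids, appends an `<eos>` marker at each line end, and falls back to `<unk>` for unseen words. The helper names and the first-seen id ordering are assumptions; ids fed to this checkpoint must come from the vocabulary it was trained with, which this sketch does not reproduce.

```python
def build_vocab(lines):
    """Assign ids in first-seen order; '<eos>' marks the end of each line."""
    vocab = {}
    for line in lines:
        for word in line.split() + ["<eos>"]:
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(lines, vocab):
    """Map words to ids, using the '<unk>' id for out-of-vocabulary words."""
    unk = vocab["<unk>"]
    return [vocab.get(word, unk) for line in lines for word in line.split() + ["<eos>"]]


# Toy corpus; the preprocessed PTB text already contains literal <unk> tokens.
train_lines = ["the cat sat", "the <unk> sat"]
vocab = build_vocab(train_lines)
train_ids = encode(train_lines, vocab)
unseen_ids = encode(["the dog sat"], vocab)
print(train_ids, unseen_ids)
```

On the real corpus the same procedure yields the 10000-entry vocabulary, because rare words were already replaced by `<unk>` in the distributed text.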
License
Apache 2.0.
Citation
PTB benchmark setup (Zaremba 2014 LSTM recipe):
@misc{zaremba2014lstm,
author = {Zaremba, Wojciech and Sutskever, Ilya and Vinyals, Oriol},
title = {Recurrent Neural Network Regularization},
year = {2014},
eprint = {1409.2329}
}
LRP for LSTM (Arras et al. 2017) — the canonical XAI method for this architecture, validated cross-framework on this checkpoint:
@inproceedings{arras2017lstm,
author = {Arras, Leila and Montavon, Grégoire and Müller, Klaus-Robert
and Samek, Wojciech},
title = {Explaining Recurrent Neural Network Predictions in Sentiment
Analysis},
booktitle = {EMNLP Workshop on Computational Approaches to Subjectivity,
Sentiment and Social Media Analysis (WASSA)},
year = {2017}
}
xaitalk infrastructure:
@software{paul2026xaitalk,
author = {Paul, Alexander},
title = {xaitalk: Cross-Framework Explainable AI Library},
year = {2026},
url = {https://xaitalk.com}
}
Links
- xaitalk website: https://xaitalk.com
- Framework GitHub: https://github.com/alexanderfpaul/xaitalk-framework
- RNN comparison script: examples/comparison/run_rnn_3framework_comparison.py