LSTM with PyTorch & Lightning
Model Summary
A Long Short-Term Memory (LSTM) network implemented in PyTorch and trained using PyTorch Lightning for clean, scalable training loops. This model demonstrates sequence modeling, applicable to time series forecasting, text generation, or sequential pattern learning, depending on the dataset used.
Model Details
- Developed by: Chandrasekar Adhithya Pasumarthi (@Adhithpasu)
- Affiliation: Frisco ISD, TX | AI Club Leader | Class of 2027
- Model type: LSTM (Recurrent Neural Network)
- Framework: PyTorch + PyTorch Lightning
- License: Apache 2.0
- Related work: Part of a broader ML portfolio spanning CNNs, regression, and NLP; see @Adhithpasu on GitHub
Intended Uses
Direct use:
- Sequential data modeling (time series, text, sensor data)
- Educational demonstration of LSTM architecture and PyTorch Lightning training patterns
- Baseline recurrent model for comparison against Transformers and attention-based architectures
Out-of-scope use:
- Production deployment without fine-tuning on domain-specific data
- Long-context tasks where Transformer architectures are more suitable
Training Data
(Update with your specific dataset, e.g., a time series dataset, text corpus, or other sequential data)
Evaluation
| Metric | Value |
|---|---|
| Train Loss | TBD |
| Val Loss | TBD |
| Perplexity | TBD |
(Fill in with your actual results)
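If the model is trained as a language model with token-level cross-entropy loss, the perplexity row can be derived directly from the validation loss. This is a standard relationship, not something specific to this checkpoint; the loss values below are placeholders:

```python
import math

def perplexity(cross_entropy_loss: float) -> float:
    """Perplexity is the exponential of the mean cross-entropy (in nats)."""
    return math.exp(cross_entropy_loss)

# Placeholder values for illustration
print(perplexity(0.0))  # a loss of 0 gives perplexity 1.0
```

Note that this conversion only makes sense for the language-modeling use case; for regression-style time series tasks, report the raw train/val losses instead.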
How to Use
```python
import torch
import pytorch_lightning as pl

# LSTMModel is the LightningModule class defined in your training script
model = LSTMModel.load_from_checkpoint("lstm_model.ckpt")
model.eval()

# Example inference: replace with your actual input tensor
# Shape: (batch_size, seq_len, input_size)
sample_input = torch.randn(1, 50, 1)
with torch.no_grad():
    output = model(sample_input)
print(f"Output shape: {output.shape}")
```
Model Architecture
```
Input (seq_len, input_size)
  ↓ LSTM(hidden_size=128, num_layers=2, dropout=0.2)
  ↓ Linear(128, output_size)
```
(Update to match your actual architecture)
Why PyTorch Lightning?
PyTorch Lightning removes boilerplate from training loops by separating research code (model definition) from engineering code (training, logging, checkpointing). This makes the code more readable, reproducible, and scalable to multi-GPU setups without changes to the model itself.
Limitations & Bias
- LSTMs struggle with very long sequences compared to Transformer-based models
- Performance is highly dependent on sequence length, hidden size, and the nature of the input data
- May require significant hyperparameter tuning for new domains
Citation
```bibtex
@misc{pasumarthi2026lstm,
  author    = {Chandrasekar Adhithya Pasumarthi},
  title     = {LSTM with PyTorch and Lightning},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/Chandrasekar123/LSTMPytorchandLightning}
}
```
Contact
- GitHub: @Adhithpasu