LSTM with PyTorch & Lightning

Model Summary

A Long Short-Term Memory (LSTM) network implemented in PyTorch and trained using PyTorch Lightning for clean, scalable training loops. This model demonstrates sequence modeling and is applicable to time series forecasting, text generation, or sequential pattern learning, depending on the dataset used.


Model Details

  • Developed by: Chandrasekar Adhithya Pasumarthi (@Adhithpasu)
  • Affiliation: Frisco ISD, TX | AI Club Leader | Class of 2027
  • Model type: LSTM (Recurrent Neural Network)
  • Framework: PyTorch + PyTorch Lightning
  • License: Apache 2.0
  • Related work: Part of a broader ML portfolio spanning CNNs, regression, and NLP; see @Adhithpasu on GitHub

Intended Uses

Direct use:

  • Sequential data modeling (time series, text, sensor data)
  • Educational demonstration of LSTM architecture and PyTorch Lightning training patterns
  • Baseline recurrent model for comparison against Transformers and attention-based architectures

Out-of-scope use:

  • Production deployment without fine-tuning on domain-specific data
  • Long-context tasks where Transformer architectures are more suitable

Training Data

(Update with your specific dataset, e.g., a time series dataset, text corpus, or other sequential data)
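Whatever the dataset, the LSTM expects input shaped (batch_size, seq_len, input_size). For a univariate time series, a common preparation step is sliding-window slicing: each window of seq_len past values predicts the next value. A minimal sketch, assuming a toy sine-wave series; the `make_windows` helper and all sizes are illustrative, not part of this repo:

```python
import torch
from torch.utils.data import TensorDataset

def make_windows(series: torch.Tensor, seq_len: int) -> TensorDataset:
    """Slice a 1-D series into (window, next-value) training pairs."""
    xs = torch.stack([series[i : i + seq_len] for i in range(len(series) - seq_len)])
    ys = series[seq_len:]
    # Add a trailing feature dimension so shapes are (N, seq_len, 1) and (N, 1)
    return TensorDataset(xs.unsqueeze(-1), ys.unsqueeze(-1))

series = torch.sin(torch.linspace(0, 12.56, 200))  # toy sine-wave series
dataset = make_windows(series, seq_len=50)
```

A 200-point series with seq_len=50 yields 150 training pairs; swap in your own tensor for the sine wave.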


Evaluation

Metric       Value
----------   -----
Train Loss   TBD
Val Loss     TBD
Perplexity   TBD

(Fill in with your actual results)
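Note that perplexity only applies if the model is used for token prediction (e.g., text generation); it is the exponential of the mean cross-entropy loss. A short sketch with dummy tensors, not actual model outputs:

```python
import torch
import torch.nn.functional as F

# Perplexity = exp(mean cross-entropy). Meaningful for next-token
# prediction tasks, not for regression-style time series forecasting.
logits = torch.randn(4, 10)           # dummy (batch, vocab_size) predictions
targets = torch.randint(0, 10, (4,))  # dummy token ids
loss = F.cross_entropy(logits, targets)
perplexity = torch.exp(loss)
```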


How to Use

import torch
import pytorch_lightning as pl

# `LSTMModel` must be the LightningModule subclass used during training;
# import it from your own code before loading the checkpoint.
model = LSTMModel.load_from_checkpoint("lstm_model.ckpt")
model.eval()

# Example inference: replace with your actual input tensor.
# Shape: (batch_size, seq_len, input_size)
sample_input = torch.randn(1, 50, 1)

with torch.no_grad():
    output = model(sample_input)
    print(f"Output shape: {output.shape}")

Model Architecture

Input (seq_len, input_size)
→ LSTM(hidden_size=128, num_layers=2, dropout=0.2)
→ Linear(128, output_size)

(Update to match your actual architecture)


Why PyTorch Lightning?

PyTorch Lightning removes boilerplate from training loops by separating research code (the model definition) from engineering code (training, logging, checkpointing). This makes the code more readable, reproducible, and scalable to multi-GPU setups without changes to the model itself.


Limitations & Bias

  • LSTMs struggle with very long sequences compared to Transformer-based models
  • Performance is highly dependent on sequence length, hidden size, and the nature of the input data
  • May require significant hyperparameter tuning for new domains

Citation

@misc{pasumarthi2026lstm,
  author    = {Chandrasekar Adhithya Pasumarthi},
  title     = {LSTM with PyTorch and Lightning},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/Chandrasekar123/LSTMPytorchandLightning}
}

Contact

(Add contact information here)