---
language:
- en
license: apache-2.0
tags:
- time-series
- sequence-modeling
- lstm
- pytorch
- pytorch-lightning
- text-generation
pipeline_tag: text-generation
metrics:
- loss
- perplexity
---

# LSTM with PyTorch & Lightning

## Model Summary

A Long Short-Term Memory (LSTM) network implemented in PyTorch and trained with PyTorch Lightning for clean, scalable training loops. This model demonstrates sequence modeling and, depending on the dataset used, can be applied to time-series forecasting, text generation, or sequential pattern learning.

---

## Model Details

- **Developed by:** Chandrasekar Adhithya Pasumarthi ([@Adhithpasu](https://github.com/Adhithpasu))
- **Affiliation:** Frisco ISD, TX | AI Club Leader | Class of 2027
- **Model type:** LSTM (Recurrent Neural Network)
- **Framework:** PyTorch + PyTorch Lightning
- **License:** Apache 2.0
- **Related work:** Part of a broader ML portfolio spanning CNNs, regression, and NLP; see [@Adhithpasu on GitHub](https://github.com/Adhithpasu)

---

## Intended Uses

**Direct use:**
- Sequential data modeling (time series, text, sensor data)
- Educational demonstration of LSTM architecture and PyTorch Lightning training patterns
- Baseline recurrent model for comparison against Transformers and other attention-based architectures

**Out-of-scope use:**
- Production deployment without fine-tuning on domain-specific data
- Long-context tasks where Transformer architectures are more suitable

---

## Training Data

*(Update with your specific dataset, e.g., a time-series dataset, text corpus, or other sequential data)*

---

## Evaluation

| Metric     | Value |
|------------|-------|
| Train Loss | TBD   |
| Val Loss   | TBD   |
| Perplexity | TBD   |

*(Fill in with your actual results)*
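For language-modeling use cases, perplexity follows directly from the mean cross-entropy loss, so only the loss needs to be measured. A minimal sketch (the loss value below is illustrative, not a reported result):

```python
import math

def perplexity(mean_ce_loss: float) -> float:
    """Perplexity is exp(mean cross-entropy loss), with the loss in nats
    (the default unit for torch.nn.CrossEntropyLoss)."""
    return math.exp(mean_ce_loss)

# Illustrative value only -- substitute your measured validation loss.
val_loss = 2.0
print(f"Perplexity: {perplexity(val_loss):.2f}")  # Perplexity: 7.39
```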

---

## How to Use

```python
import torch

# LSTMModel is the LightningModule class defined in your training script;
# adjust this import to wherever that class lives in your project.
from model import LSTMModel

# Load the model checkpoint
model = LSTMModel.load_from_checkpoint("lstm_model.ckpt")
model.eval()

# Example inference; replace with your actual input tensor
# Shape: (batch_size, seq_len, input_size)
sample_input = torch.randn(1, 50, 1)

with torch.no_grad():
    output = model(sample_input)

print(f"Output shape: {output.shape}")
```

---

## Model Architecture

```
Input (seq_len, input_size)
  → LSTM(hidden_size=128, num_layers=2, dropout=0.2)
  → Linear(128, output_size)
```

*(Update to match your actual architecture)*

---

## Why PyTorch Lightning?

PyTorch Lightning removes boilerplate from training loops, separating research code (the model definition) from engineering code (training, logging, checkpointing). This makes the code more readable, reproducible, and scalable to multi-GPU setups without changes to the model itself.

---

## Limitations & Bias

- LSTMs struggle with very long sequences compared to Transformer-based models
- Performance depends heavily on sequence length, hidden size, and the nature of the input data
- May require significant hyperparameter tuning for new domains

---

## Citation

```bibtex
@misc{pasumarthi2026lstm,
  author    = {Chandrasekar Adhithya Pasumarthi},
  title     = {LSTM with PyTorch and Lightning},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/Chandrasekar123/LSTMPytorchandLightning}
}
```

---

## Contact

- GitHub: [@Adhithpasu](https://github.com/Adhithpasu)