# LSTM Recipe Next-Token Generator

## Model Summary
This model is a next-token language model based on a Long Short-Term Memory (LSTM) network trained on the Epicurious Recipes dataset. It was developed as part of an academic assignment for MSAI 630 – Generative AI with LLMs and demonstrates classical sequence modeling for text generation prior to transformer-based architectures.
The model predicts the next word token given a sequence of previous tokens and can be used iteratively to generate long-form recipe-style text.
## Architecture
- Embedding size: 100
- LSTM hidden units: 128
- Vocabulary size: 10,000
- Maximum sequence length: 200 tokens
- Output: Softmax over vocabulary (next-token prediction)
Framework: TensorFlow / Keras
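The hyperparameters above can be sketched as a Keras model. This is a minimal reconstruction, not the exact training script: the card does not state the layer ordering or whether the LSTM predicts a token at every position, so `return_sequences=True` (per-position next-token prediction) is an assumption here.

```python
import tensorflow as tf
from tensorflow import keras

VOCAB_SIZE = 10_000   # vocabulary size from the card
EMBED_DIM = 100       # embedding size from the card
SEQ_LEN = 200         # maximum sequence length from the card

# Sketch of the architecture: embedding -> LSTM -> softmax over the vocabulary.
model = keras.Sequential([
    keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    keras.layers.LSTM(128, return_sequences=True),  # per-position prediction (assumed)
    keras.layers.Dense(VOCAB_SIZE, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Output shape: (batch, sequence length, vocabulary size)
out = model(tf.zeros((1, SEQ_LEN), dtype=tf.int32))
```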
## Training Details
- Dataset: Epicurious Recipes (Kaggle)
- Objective: Next-token prediction using sparse categorical cross-entropy
- Optimizer: Adam
- Epochs: 25
- Batch size: 32
- Random seed: 42
Text preprocessing includes:
- Lowercasing
- Explicit punctuation token separation
- Integer tokenization via Keras `TextVectorization`
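The lowercasing and punctuation-separation steps can be illustrated with a plain-Python equivalent. This is a sketch of the described behavior, not the model's actual `TextVectorization` configuration; the punctuation set and regex below are assumptions.

```python
import re

def preprocess(text: str) -> list[str]:
    # Lowercase, then pad punctuation with spaces so each mark becomes
    # its own token, mirroring the preprocessing steps described above.
    text = text.lower()
    text = re.sub(r"([.,;:!?()\"'])", r" \1 ", text)
    return text.split()

tokens = preprocess("Preheat oven to 350F, then bake.")
# ['preheat', 'oven', 'to', '350f', ',', 'then', 'bake', '.']
```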
The original dataset is not redistributed in this repository.
## Usage
This model is intended for educational and experimental purposes.
Typical usage:
- Tokenize an input prompt using the same vocabulary
- Feed the token sequence to the model
- Sample the next token from the output distribution
- Append the token and repeat
Temperature sampling can be applied to control randomness during generation.
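Temperature sampling rescales the model's probabilities before drawing a token: temperatures below 1 sharpen the distribution toward the most likely token, while temperatures above 1 flatten it. A minimal NumPy sketch (function name and epsilon are illustrative choices, not from the repository):

```python
import numpy as np

def sample_with_temperature(probs, temperature=1.0, rng=None):
    """Draw a token index from `probs` rescaled by `temperature`."""
    if rng is None:
        rng = np.random.default_rng()
    # Convert to log space, divide by temperature, re-normalize.
    logits = np.log(np.asarray(probs, dtype=float) + 1e-9) / temperature
    exp = np.exp(logits - logits.max())   # subtract max for stability
    p = exp / exp.sum()
    return int(rng.choice(len(p), p=p))

# Very low temperature behaves like argmax:
idx = sample_with_temperature([0.1, 0.7, 0.2], temperature=1e-3,
                              rng=np.random.default_rng(0))  # -> 1
```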
## Limitations
- LSTM-based models have limited long-range coherence
- Outputs may become repetitive over long generations
- Not suitable for real-world recipe advice or food safety guidance
- Trained on a static dataset with no factual grounding
## Intended Use
- Educational demonstrations of sequence modeling
- Classical NLP comparison against transformer models
- Coursework and portfolio showcase
## Author
Zane Graper