# LSTM Recipe Next-Token Generator

## Model Summary
This model is a next-token language model based on a Long Short-Term Memory (LSTM) network trained on the Epicurious Recipes dataset. It was developed as part of an academic assignment for MSAI 630 – Generative AI with LLMs and demonstrates classical sequence modeling for text generation prior to transformer-based architectures.
The model predicts the next word token given a sequence of previous tokens and can be used iteratively to generate long-form recipe-style text.
## Architecture
- Embedding size: 100
- LSTM hidden units: 128
- Vocabulary size: 10,000
- Maximum sequence length: 200 tokens
- Output: Softmax over vocabulary (next-token prediction)
Framework: TensorFlow / Keras
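The hyperparameters above can be sketched as a Keras model. This is a minimal reconstruction, not the exact training script: the card does not state the layer ordering or whether the LSTM predicts a token at every position, so `return_sequences=True` (per-position next-token prediction) is an assumption here.

```python
import tensorflow as tf
from tensorflow import keras

VOCAB_SIZE = 10_000   # vocabulary size from the card
EMBED_DIM = 100       # embedding size from the card
SEQ_LEN = 200         # maximum sequence length from the card

# Sketch of the architecture: embedding -> LSTM -> softmax over the vocabulary.
model = keras.Sequential([
    keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    keras.layers.LSTM(128, return_sequences=True),  # per-position prediction (assumed)
    keras.layers.Dense(VOCAB_SIZE, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Output shape: (batch, sequence length, vocabulary size)
out = model(tf.zeros((1, SEQ_LEN), dtype=tf.int32))
```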
## Training Details
- Dataset: Epicurious Recipes (Kaggle)
- Objective: Next-token prediction using sparse categorical cross-entropy
- Optimizer: Adam
- Epochs: 25
- Batch size: 32
- Random seed: 42
Text preprocessing includes:
- Lowercasing
- Explicit punctuation token separation
- Integer tokenization via Keras `TextVectorization`
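The lowercasing and punctuation-separation steps can be illustrated with a plain-Python equivalent. This is a sketch of the described behavior, not the model's actual `TextVectorization` configuration; the punctuation set and regex below are assumptions.

```python
import re

def preprocess(text: str) -> list[str]:
    # Lowercase, then pad punctuation with spaces so each mark becomes
    # its own token, mirroring the preprocessing steps described above.
    text = text.lower()
    text = re.sub(r"([.,;:!?()\"'])", r" \1 ", text)
    return text.split()

tokens = preprocess("Preheat oven to 350F, then bake.")
# ['preheat', 'oven', 'to', '350f', ',', 'then', 'bake', '.']
```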
The original dataset is not redistributed in this repository.
## Usage
This model is intended for educational and experimental purposes.
Typical usage:
- Tokenize an input prompt using the same vocabulary
- Feed the token sequence to the model
- Sample the next token from the output distribution
- Append the token and repeat
Temperature sampling can be applied to control randomness during generation.
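Temperature sampling rescales the model's probabilities before drawing a token: temperatures below 1 sharpen the distribution toward the most likely token, while temperatures above 1 flatten it. A minimal NumPy sketch (function name and epsilon are illustrative choices, not from the repository):

```python
import numpy as np

def sample_with_temperature(probs, temperature=1.0, rng=None):
    """Draw a token index from `probs` rescaled by `temperature`."""
    if rng is None:
        rng = np.random.default_rng()
    # Convert to log space, divide by temperature, re-normalize.
    logits = np.log(np.asarray(probs, dtype=float) + 1e-9) / temperature
    exp = np.exp(logits - logits.max())   # subtract max for stability
    p = exp / exp.sum()
    return int(rng.choice(len(p), p=p))

# Very low temperature behaves like argmax:
idx = sample_with_temperature([0.1, 0.7, 0.2], temperature=1e-3,
                              rng=np.random.default_rng(0))  # -> 1
```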
## Limitations
- LSTM-based models have limited long-range coherence
- Outputs may become repetitive over long generations
- Not suitable for real-world recipe advice or food safety guidance
- Trained on a static dataset with no factual grounding
## Intended Use
- Educational demonstrations of sequence modeling
- Classical NLP comparison against transformer models
- Coursework and portfolio showcase
## Author
Zane Graper