---
language: en
tags:
- text-generation
- keras
- tensorflow
- lstm
- educational
license: other
datasets:
- epicurious-recipes-kaggle
---

# LSTM Recipe Next-Token Generator

## Model Summary
This model is a next-token language model based on a Long Short-Term Memory (LSTM) network trained on the Epicurious Recipes dataset. It was developed as part of an academic assignment for **MSAI 630 – Generative AI with LLMs** and demonstrates classical sequence modeling for text generation prior to transformer-based architectures.

The model predicts the next word token given a sequence of previous tokens and can be used iteratively to generate long-form recipe-style text.

## Architecture
- Embedding size: 100
- LSTM hidden units: 128
- Vocabulary size: 10,000
- Maximum sequence length: 200 tokens
- Output: Softmax over vocabulary (next-token prediction)

Framework: TensorFlow / Keras
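
A minimal sketch of a Keras model with these hyperparameters; the exact layer arrangement of the original is not specified in this card, so the structure below is illustrative only:

```python
import tensorflow as tf
from tensorflow import keras

VOCAB_SIZE = 10_000  # vocabulary size
EMBED_DIM = 100      # embedding size
MAX_LEN = 200        # maximum sequence length used for training windows

model = keras.Sequential([
    keras.layers.Input(shape=(None,), dtype="int64"),      # variable-length token ids
    keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),         # 100-dim embeddings
    keras.layers.LSTM(128, return_sequences=True),         # 128 hidden units, per-step outputs
    keras.layers.Dense(VOCAB_SIZE, activation="softmax"),  # next-token distribution
])
```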

## Training Details
- Dataset: Epicurious Recipes (Kaggle)
- Objective: Next-token prediction using sparse categorical cross-entropy
- Optimizer: Adam
- Epochs: 25
- Batch size: 32
- Random seed: 42
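
Illustratively, this configuration maps onto the following Keras calls, where `x_train` and `y_train` are placeholders for the shifted input/target token sequences built by the preprocessing sketched below:

```python
tf.random.set_seed(42)  # fixed seed for reproducibility

model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",  # per-timestep next-token loss
)
model.fit(x_train, y_train, epochs=25, batch_size=32)
```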

Text preprocessing includes:
- Lowercasing
- Explicit punctuation token separation
- Integer tokenization via `TextVectorization`
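
A sketch of what this preprocessing could look like; the exact punctuation-separation rule and the `recipe_texts` variable (a list of raw recipe strings) are assumptions for illustration:

```python
import re
import string

def standardize(text):
    # Lowercase, then surround punctuation with spaces so each mark
    # becomes its own token (assumed separation rule).
    text = tf.strings.lower(text)
    return tf.strings.regex_replace(
        text, f"([{re.escape(string.punctuation)}])", r" \1 "
    )

vectorizer = keras.layers.TextVectorization(
    max_tokens=VOCAB_SIZE,
    standardize=standardize,
    output_sequence_length=MAX_LEN + 1,  # +1 so inputs/targets can be shifted
)
vectorizer.adapt(recipe_texts)

# Next-token pairs: inputs are tokens [:-1], targets are tokens [1:].
tokens = vectorizer(recipe_texts)
x_train, y_train = tokens[:, :-1], tokens[:, 1:]
```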

The original dataset is **not redistributed** in this repository.

## Usage
This model is intended for educational and experimental purposes.

Typical usage:
1. Tokenize an input prompt using the same vocabulary
2. Feed the token sequence to the model
3. Sample the next token from the output distribution
4. Append the token and repeat

Temperature sampling can be applied to control randomness during generation.
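
A sketch of the full loop with temperature sampling, reusing the illustrative `model`, `vectorizer`, and `MAX_LEN` from above:

```python
import numpy as np

def generate(prompt, n_tokens=50, temperature=0.8):
    vocab = vectorizer.get_vocabulary()  # id -> word lookup
    tokens = [int(t) for t in vectorizer([prompt]).numpy()[0] if t != 0]
    for _ in range(n_tokens):
        x = np.array([tokens[-MAX_LEN:]])           # trim context to the training window
        probs = model.predict(x, verbose=0)[0, -1]  # distribution for the next token
        logits = np.log(probs.astype("float64") + 1e-9) / temperature
        p = np.exp(logits) / np.sum(np.exp(logits))  # re-normalized softmax
        tokens.append(int(np.random.choice(len(p), p=p)))
    return " ".join(vocab[t] for t in tokens)

# Example: generate("preheat the oven to", n_tokens=40, temperature=0.7)
```

Lower temperatures make generations more conservative; higher temperatures increase diversity at the cost of coherence.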

## Limitations
- LSTM-based models have limited long-range coherence
- Outputs may become repetitive over long generations
- Not suitable for real-world recipe advice or food safety guidance
- Trained on a static dataset with no factual grounding

## Intended Use
- Educational demonstrations of sequence modeling
- Classical NLP comparison against transformer models
- Coursework and portfolio showcase

## Author
Zane Graper