---
language: en
tags:
- text-generation
- keras
- tensorflow
- lstm
- educational
license: other
datasets:
- epicurious-recipes-kaggle
---

# LSTM Recipe Next-Token Generator

## Model Summary

This model is a next-token language model based on a Long Short-Term Memory (LSTM) network trained on the Epicurious Recipes dataset. It was developed as part of an academic assignment for **MSAI 630 – Generative AI with LLMs** and demonstrates classical sequence modeling for text generation prior to transformer-based architectures.

The model predicts the next word token given a sequence of previous tokens and can be used iteratively to generate long-form recipe-style text.

## Architecture

- Embedding size: 100
- LSTM hidden units: 128
- Vocabulary size: 10,000
- Maximum sequence length: 200 tokens
- Output: softmax over the vocabulary (next-token prediction)

Framework: TensorFlow / Keras
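Under these hyperparameters, the architecture could be sketched in Keras roughly as follows. This is a minimal sketch, not the exact training script; in particular, the per-timestep output head (`return_sequences=True`) is an assumption about how the model was set up:

```python
from tensorflow import keras

VOCAB_SIZE = 10_000   # vocabulary size
EMBED_DIM = 100       # embedding size
LSTM_UNITS = 128      # LSTM hidden units

# Variable-length integer token ids in, a softmax over the vocabulary
# at every timestep out (next-token prediction at each position).
model = keras.Sequential([
    keras.Input(shape=(None,), dtype="int32"),
    keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    keras.layers.LSTM(LSTM_UNITS, return_sequences=True),
    keras.layers.Dense(VOCAB_SIZE, activation="softmax"),
])
```

Leaving the sequence axis as `None` lets the same model score prompts shorter than the 200-token training length during generation.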

## Training Details

- Dataset: Epicurious Recipes (Kaggle)
- Objective: next-token prediction using sparse categorical cross-entropy
- Optimizer: Adam
- Epochs: 25
- Batch size: 32
- Random seed: 42
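The configuration above might translate to Keras along these lines. This is a hedged sketch on dummy data: the shifted-by-one construction of (input, target) pairs is an assumption about how the next-token examples were built, and only one epoch is run here for illustration:

```python
import numpy as np
from tensorflow import keras

keras.utils.set_random_seed(42)  # the reported random seed

VOCAB_SIZE = 10_000

model = keras.Sequential([
    keras.layers.Embedding(VOCAB_SIZE, 100),
    keras.layers.LSTM(128, return_sequences=True),
    keras.layers.Dense(VOCAB_SIZE, activation="softmax"),
])
# Adam optimizer with sparse categorical cross-entropy, as reported.
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Dummy token-id batch standing in for vectorized recipes: inputs are
# tokens 0..n-1 of each sequence, targets the same tokens shifted by one.
seqs = np.random.randint(1, VOCAB_SIZE, size=(32, 21))
inputs, targets = seqs[:, :-1], seqs[:, 1:]

history = model.fit(inputs, targets, batch_size=32, epochs=1, verbose=0)
```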

Text preprocessing includes:

- Lowercasing
- Separating punctuation into explicit tokens
- Integer tokenization via `TextVectorization`
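These preprocessing steps might look like the following sketch; the exact punctuation-splitting regex used in training is an assumption:

```python
import re
import string
import tensorflow as tf

def prepare_text(text):
    # Lowercase, then surround each punctuation mark with spaces so it
    # becomes its own token.
    text = text.lower()
    text = re.sub(f"([{re.escape(string.punctuation)}])", r" \1 ", text)
    return re.sub(r"\s+", " ", text).strip()

# Integer tokenization: punctuation is already separated above, so no
# further standardization is applied inside the layer.
vectorize_layer = tf.keras.layers.TextVectorization(
    standardize=None,
    max_tokens=10_000,
    output_sequence_length=200,
)
vectorize_layer.adapt([prepare_text("Preheat the oven to 180 degrees.")])
```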

The original dataset is **not redistributed** in this repository.

## Usage

This model is intended for educational and experimental purposes.

Typical usage:

1. Tokenize an input prompt using the same vocabulary as in training
2. Feed the token sequence to the model
3. Sample the next token from the output distribution
4. Append the sampled token to the sequence and repeat

Temperature sampling can be applied to control randomness during generation.
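The loop above, with temperature sampling, can be sketched framework-independently; `predict_fn` below is a hypothetical stand-in for a call to the trained model:

```python
import numpy as np

def sample_next(probs, temperature=1.0):
    # Reweight the softmax distribution: temperature < 1 sharpens it,
    # temperature > 1 flattens it; then sample one token id.
    logits = np.log(np.asarray(probs, dtype=np.float64) + 1e-9) / temperature
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return int(np.random.choice(len(weights), p=weights))

def generate(predict_fn, token_ids, n_tokens, temperature=1.0):
    # predict_fn: maps the current token-id list to a probability
    # distribution over the vocabulary for the next token.
    token_ids = list(token_ids)
    for _ in range(n_tokens):
        token_ids.append(sample_next(predict_fn(token_ids), temperature))
    return token_ids
```

At very low temperatures the sampler approaches greedy argmax decoding; at 1.0 it samples from the model's raw distribution.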

## Limitations

- LSTM-based models have limited long-range coherence
- Outputs may become repetitive over long generations
- Not suitable for real-world recipe advice or food safety guidance
- Trained on a static dataset with no factual grounding

## Intended Use

- Educational demonstrations of sequence modeling
- Classical NLP comparison against transformer models
- Coursework and portfolio showcase

## Author

Zane Graper