zanegraper committed on
Commit 7b20439 · verified · 1 Parent(s): 250c6cd

Create README.md

Files changed (1)
  1. README.md +68 -0
README.md ADDED
@@ -0,0 +1,68 @@
+ ---
+ language: en
+ tags:
+ - text-generation
+ - keras
+ - tensorflow
+ - lstm
+ - educational
+ license: other
+ datasets:
+ - epicurious-recipes-kaggle
+ ---
+
+ # LSTM Recipe Next-Token Generator
+
+ ## Model Summary
+ This model is a next-token language model built on a Long Short-Term Memory (LSTM) network and trained on the Epicurious Recipes dataset. It was developed as part of an academic assignment for **MSAI 630 – Generative AI with LLMs** and demonstrates classical sequence modeling for text generation, the approach that preceded transformer-based architectures.
+
+ The model predicts the next word token given a sequence of previous tokens and can be applied iteratively to generate long-form, recipe-style text.
+
+ ## Architecture
+ - Embedding size: 100
+ - LSTM hidden units: 128
+ - Vocabulary size: 10,000
+ - Maximum sequence length: 200 tokens
+ - Output: Softmax over vocabulary (next-token prediction)
+
+ Framework: TensorFlow / Keras
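To give a feel for the model's size, the sketch below estimates trainable parameter counts from the sizes listed above. It assumes a single Embedding -> LSTM -> Dense(softmax) stack with standard LSTM gates and one bias vector per gate; the README does not state the exact layer layout, so treat the totals as an estimate under that assumption.

```python
# Parameter-count sketch for an ASSUMED Embedding -> LSTM -> Dense stack.
# The sizes come from the Architecture list; the layer layout is an assumption.
VOCAB_SIZE = 10_000
EMBEDDING_DIM = 100
LSTM_UNITS = 128


def embedding_params(vocab_size: int, embedding_dim: int) -> int:
    # One embedding vector per vocabulary entry.
    return vocab_size * embedding_dim


def lstm_params(input_dim: int, units: int) -> int:
    # Four gates, each with input weights, recurrent weights, and a bias.
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim: int, output_dim: int) -> int:
    # Weight matrix plus one bias per output class.
    return input_dim * output_dim + output_dim


counts = {
    "embedding": embedding_params(VOCAB_SIZE, EMBEDDING_DIM),
    "lstm": lstm_params(EMBEDDING_DIM, LSTM_UNITS),
    "dense_softmax": dense_params(LSTM_UNITS, VOCAB_SIZE),
}
total_params = sum(counts.values())

for name, value in counts.items():
    print(f"{name:>14}: {value:,}")
print(f"{'total':>14}: {total_params:,}")
```

Under these assumptions the output layer and the embedding table hold most of the parameters, while the recurrent layer itself is comparatively small.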
+
+ ## Training Details
+ - Dataset: Epicurious Recipes (Kaggle)
+ - Objective: Next-token prediction using sparse categorical cross-entropy
+ - Optimizer: Adam
+ - Epochs: 25
+ - Batch size: 32
+ - Random seed: 42
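The next-token objective can be illustrated without any framework. The sketch below shows how a token sequence might be split into inputs and shifted targets, and how sparse categorical cross-entropy scores one prediction against an integer target. The token ids and the four-entry probability vector are made up for illustration; the actual training code is not part of this README.

```python
import math


def make_training_pair(token_ids):
    """Inputs are all tokens but the last; targets are the same tokens shifted by one."""
    return token_ids[:-1], token_ids[1:]


def sparse_categorical_cross_entropy(probs, target_id):
    """Negative log-probability the model assigned to the integer target id."""
    return -math.log(probs[target_id])


# Hypothetical token ids for one short recipe fragment.
token_ids = [12, 7, 431, 9, 2]
inputs, targets = make_training_pair(token_ids)
print(inputs)   # [12, 7, 431, 9]
print(targets)  # [7, 431, 9, 2]

# Toy softmax output over a 4-word vocabulary; the correct next token is id 2.
toy_probs = [0.1, 0.2, 0.6, 0.1]
loss = sparse_categorical_cross_entropy(toy_probs, target_id=2)
print(round(loss, 4))
```

"Sparse" here means the targets are stored as integer ids rather than one-hot vectors, which avoids materialising a 10,000-wide vector per position.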
+
+ Text preprocessing includes:
+ - Lowercasing
+ - Explicit punctuation token separation
+ - Integer tokenization via `TextVectorization`
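The first two preprocessing steps can be sketched in plain Python. The helper name `pad_punctuation` and the exact regular expressions are assumptions for illustration, not the code used to train this model; the idea is simply that each punctuation mark becomes its own whitespace-delimited token before the text is handed to `TextVectorization`.

```python
import re
import string


def pad_punctuation(text: str) -> str:
    """Lowercase the text and surround ASCII punctuation with spaces."""
    text = text.lower()
    # Put a space on both sides of every ASCII punctuation character.
    text = re.sub(f"([{re.escape(string.punctuation)}])", r" \1 ", text)
    # Collapse the runs of spaces created above and trim the ends.
    return re.sub(r" +", " ", text).strip()


padded = pad_punctuation("Mix flour, sugar & salt. Bake!")
tokens = padded.split()
print(padded)
print(tokens)
```

Splitting on whitespace afterwards yields punctuation as separate tokens, so the vocabulary does not need separate entries for "salt" and "salt.".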
+
+ The original dataset is **not redistributed** in this repository.
+
+ ## Usage
+ This model is intended for educational and experimental purposes.
+
+ Typical usage:
+ 1. Tokenize an input prompt using the same vocabulary
+ 2. Feed the token sequence to the model
+ 3. Sample the next token from the output distribution
+ 4. Append the token and repeat
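The four steps above can be sketched as a small loop. To keep the example runnable without the trained weights, `toy_predict_next` is a hypothetical stand-in for the real model: it takes the token ids so far and returns a probability for each vocabulary entry. The function names, the `stop_id` parameter, and the four-token vocabulary are all assumptions for illustration.

```python
import random


def generate(predict_next, prompt_ids, max_new_tokens, stop_id=None, rng=None):
    """Repeatedly sample a next token id and append it to the sequence."""
    rng = rng or random.Random()
    ids = list(prompt_ids)  # step 1 is assumed done: the prompt is already token ids
    for _ in range(max_new_tokens):
        probs = predict_next(ids)  # step 2: feed the sequence to the model
        # Step 3: sample one id according to the predicted distribution.
        next_id = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        ids.append(next_id)  # step 4: append and repeat
        if next_id == stop_id:
            break
    return ids


def toy_predict_next(ids):
    """Hypothetical model: always predicts (last id + 1) mod 4 with probability 1."""
    probs = [0.0] * 4
    probs[(ids[-1] + 1) % 4] = 1.0
    return probs


generated = generate(toy_predict_next, [0], max_new_tokens=5, rng=random.Random(42))
print(generated)  # [0, 1, 2, 3, 0, 1]
```

With the real model, `predict_next` would wrap a call to the Keras model and return the softmax output for the last position, and the resulting ids would be mapped back to words through the vocabulary.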
+
+ Temperature sampling can be applied to control randomness during generation.
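One common way to apply temperature is to divide the log-probabilities by a temperature value and renormalise before sampling. The sketch below shows that rescaling on a made-up three-token distribution; `apply_temperature` is a hypothetical helper, and the README does not specify which variant was used with this model.

```python
import math


def apply_temperature(probs, temperature):
    """Rescale a probability distribution: T < 1 sharpens it, T > 1 flattens it."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Divide log-probabilities by T; zero-probability tokens stay at zero.
    scaled = [math.log(p) / temperature if p > 0 else float("-inf") for p in probs]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


base_probs = [0.5, 0.3, 0.2]  # made-up model output
sharp = apply_temperature(base_probs, 0.5)
unchanged = apply_temperature(base_probs, 1.0)
flat = apply_temperature(base_probs, 2.0)

for label, dist in [("T=0.5", sharp), ("T=1.0", unchanged), ("T=2.0", flat)]:
    print(label, [round(p, 3) for p in dist])
```

Lower temperatures concentrate probability on the most likely token, which tends to give safer but more repetitive text; higher temperatures spread it out and increase variety at the cost of coherence.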
+
+ ## Limitations
+ - LSTM-based models have limited long-range coherence
+ - Outputs may become repetitive over long generations
+ - Not suitable for real-world recipe advice or food safety guidance
+ - Trained on a static dataset with no factual grounding
+
+ ## Intended Use
+ - Educational demonstrations of sequence modeling
+ - Classical NLP comparison against transformer models
+ - Coursework and portfolio showcase
+
+ ## Author
+ Zane Graper