saksh-d committed · Commit 8561165 · verified · 1 Parent(s): b0671c9

Update README.md

Files changed (1)
  1. README.md +80 -61
README.md CHANGED
@@ -1,61 +1,80 @@
- # Recipe-GPT
-
- This project fine-tunes the pretrained [`openai-community/gpt2`](https://huggingface.co/openai-community/gpt2) model using Hugging Face’s `transformers` and `datasets` libraries on a large corpus of human-written recipes.
-
- The goal is to teach GPT-2 how to generate coherent cooking instructions from a list of ingredients.
-
- ## Key Features:
- - Base Model: GPT-2 (117M) from Hugging Face Model Hub
- - Frameworks:
- - `transformers` for model loading, training, and generation
- - `datasets` for loading and managing large recipe data
- - `Trainer` API for end-to-end fine-tuning
- - Custom Tokens:
- - Added `<start>` and `<end>` tokens to mark recipe boundaries
- - Tokenizer and model embeddings resized accordingly
- - Data Format:
- - Recipes formatted as plain text blocks with titles, ingredients, and step-by-step directions
- - Training Strategy:
- - Causal language modeling (not masked)
- - Evaluated on validation set each epoch
- - Supports CPU, CUDA, and Apple MPS backends
-
- This setup allows the model to learn full-text generation patterns and structure, making it effective at translating structured ingredient lists into complete, human-readable cooking instructions.
-
- ## Dataset
-
- Source: [`tengomucho/all-recipes-split`](https://huggingface.co/datasets/tengomucho/all-recipes-split)
-
- 2.1M+ recipes with:
- - `title`
- - `ingredients`
- - `directions`
-
- ---
-
- ## How to Use
-
- ### Install dependencies
- ```bash
- pip install -r requirements.txt
- ```
-
- ### Download dataset and tokenize
- ```bash
- python scripts/dataset.py
- ```
-
- ### Train the model
- ```bash
- python scripts/train.py
- ```
-
- ### Run Inference
- ```bash
- python scripts/inference.py
- ```
-
- Enter comma-separated ingredients when prompted, for example,
- ```bash
- Enter ingredients (comma-separated): chicken, rice, garlic
- ```
+ ---
+ title: Recipe-GPT
+ emoji: 🍳
+ colorFrom: yellow
+ colorTo: red
+ sdk: gradio
+ sdk_version: 4.27.0
+ app_file: app.py
+ pinned: false
+ license: mit
+ tags:
+ - gpt2
+ - gradio
+ - transformers
+ - recipe-generation
+ - huggingface
+ short_description: 'Turn ingredients into step-by-step cooking directions. '
+ ---
+
+ # Recipe-GPT
+
+ This project fine-tunes the pretrained [`openai-community/gpt2`](https://huggingface.co/openai-community/gpt2) model using Hugging Face’s `transformers` and `datasets` libraries on a large corpus of human-written recipes.
+
+ The goal is to teach GPT-2 how to generate coherent cooking instructions from a list of ingredients.
+
+ ## Key Features:
+ - Base Model: GPT-2 (117M) from Hugging Face Model Hub
+ - Frameworks:
+ - `transformers` for model loading, training, and generation
+ - `datasets` for loading and managing large recipe data
+ - `Trainer` API for end-to-end fine-tuning
+ - Custom Tokens:
+ - Added `<start>` and `<end>` tokens to mark recipe boundaries
+ - Tokenizer and model embeddings resized accordingly
+ - Data Format:
+ - Recipes formatted as plain text blocks with titles, ingredients, and step-by-step directions
+ - Training Strategy:
+ - Causal language modeling (not masked)
+ - Evaluated on validation set each epoch
+ - Supports CPU, CUDA, and Apple MPS backends
+
+ This setup allows the model to learn full-text generation patterns and structure, making it effective at translating structured ingredient lists into complete, human-readable cooking instructions.
+
+ ## Dataset
+
+ Source: [`tengomucho/all-recipes-split`](https://huggingface.co/datasets/tengomucho/all-recipes-split)
+
+ 2.1M+ recipes with:
+ - `title`
+ - `ingredients`
+ - `directions`
+
+ ---
+
+ ## How to Use
+
+ ### Install dependencies
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ ### Download dataset and tokenize
+ ```bash
+ python scripts/dataset.py
+ ```
+
+ ### Train the model
+ ```bash
+ python scripts/train.py
+ ```
+
+ ### Run Inference
+ ```bash
+ python scripts/inference.py
+ ```
+
+ Enter comma-separated ingredients when prompted, for example,
+ ```bash
+ Enter ingredients (comma-separated): chicken, rice, garlic
+ ```
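
For illustration, the "Custom Tokens" and "Data Format" items in the README could translate into a formatting helper like the sketch below. The README does not show the actual template, so the `Title:` / `Ingredients:` / `Directions:` labels, the function names `parse_ingredients` and `format_recipe`, and the sample recipe are all assumptions made for this example; they are not taken from `scripts/dataset.py`. Only the `<start>` and `<end>` tokens and the three fields come from the README itself.

```python
START_TOKEN = "<start>"  # recipe boundary tokens named in the README
END_TOKEN = "<end>"


def parse_ingredients(raw):
    """Split comma-separated input into a clean list, dropping empty entries."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def format_recipe(title, ingredients, directions):
    """Render one recipe as a plain-text block (hypothetical layout, not the repo's)."""
    ingredient_lines = "\n".join(f"- {item}" for item in ingredients)
    step_lines = "\n".join(
        f"{number}. {step}" for number, step in enumerate(directions, start=1)
    )
    return (
        f"{START_TOKEN}\n"
        f"Title: {title}\n"
        f"Ingredients:\n{ingredient_lines}\n"
        f"Directions:\n{step_lines}\n"
        f"{END_TOKEN}"
    )


if __name__ == "__main__":
    # Made-up sample recipe, used only to show the rendered block.
    items = parse_ingredients("chicken, rice, garlic")
    steps = ["Cook the rice.", "Fry the chicken with garlic.", "Serve together."]
    print(format_recipe("Garlic Chicken Rice", items, steps))
```

Wrapping each recipe in explicit boundary tokens gives a causal language model a clear signal for where a recipe begins and ends. This is presumably why the README adds the two tokens and resizes the embeddings to match.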