Commit 09d7cec by RichardErkhov · verified · 1 Parent(s): 966ba61

uploaded readme

  1. README.md +90 -0
README.md ADDED
@@ -0,0 +1,90 @@
Quantization made by Richard Erkhov.

[GitHub](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


chef-gpt-base - bnb 8bits
- Model creator: https://huggingface.co/auhide/
- Original model: https://huggingface.co/auhide/chef-gpt-base/
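
For readers who want to reproduce an 8-bit bitsandbytes load, the following is a minimal sketch rather than the exact procedure used for this upload. It assumes the original checkpoint `auhide/chef-gpt-base` is quantized on the fly, that the `bitsandbytes` and `accelerate` packages are installed, and that a CUDA GPU is available. The function name `load_chef_gpt_8bit` is hypothetical.

```python
def load_chef_gpt_8bit(model_id: str = "auhide/chef-gpt-base"):
    """Load the tokenizer and an 8-bit quantized copy of the model.

    Assumption: quantization happens at load time from the original
    checkpoint; this is not necessarily how this repository was produced.
    """
    # Imported lazily so the module can be imported without transformers installed.
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        # Ask transformers to swap the linear layers for bitsandbytes 8-bit ones.
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        # Let accelerate place the quantized weights on the available device.
        device_map="auto",
    )
    return tokenizer, model
```

Calling `tokenizer, chef_gpt = load_chef_gpt_8bit()` would then stand in for the two `from_pretrained` calls in the usage example of the original model card.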

Original model description:
---
license: mit
model-index:
- name: chef-gpt-base
  results: []
language:
- bg
pipeline_tag: text-generation
widget:
- text: "[ING]1 картоф[REC]"
- text: "[ING]4 бр. яйца[EOL]1 кофичка кисело мляко[EOL]1/4 ч.л. сода[REC]"
---

# chef-gpt-base
A GPT-2 model trained to generate recipes from a list of ingredients. [Visit website](https://chef-gpt.streamlit.app/).

## Model description
This is GPT-2 pretrained on a custom dataset of recipes in Bulgarian.
You can find the dataset [here](https://www.kaggle.com/datasets/auhide/bulgarian-recipes-dataset).

## Usage
```python
import re
# pprint wraps the long recipe string across several lines when printing.
from pprint import pprint

from transformers import AutoModelForCausalLM, AutoTokenizer


# Load the model and tokenizer:
MODEL_ID = "auhide/chef-gpt-base"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
chef_gpt = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# Prepare the input:
ingredients = [
    "1 ч.ч. брашно",  # 1 cup of flour
    "4 яйца",  # 4 eggs
    "1 кофичка кисело мляко",  # 1 tub of yogurt
    "1/4 ч.л. сода",  # 1/4 tsp of baking soda
]
input_text = f"[ING]{'[EOL]'.join(ingredients)}[REC]"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

# Generate text:
output = chef_gpt.generate(input_ids, max_length=150)
recipe = tokenizer.batch_decode(output)[0]
# Get the generated recipe: the text after [REC], up to the first [SEP] token.
# If max_length is reached before [SEP] appears, keep everything after [REC].
match = re.search(r"\[REC\](.+?)(?:\[SEP\]|$)", recipe, flags=re.DOTALL)
recipe = match.group(1) if match else recipe

print("Съставки/Ingredients:")
pprint(ingredients)
print("\nРецепта/Recipe:")
pprint(recipe)
```
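```text
Съставки/Ingredients:
['1 ч.ч. брашно', '4 яйца', '1 кофичка кисело мляко', '1/4 ч.л. сода']

Рецепта/Recipe:
('В дълбока купа се разбиват яйцата. Добавя се киселото мляко, в което '
 'предварително е сложена содата, и се разбива. Добавя се брашното и се омесва '
 'тесто. Ако е много гъсто се добавя още малко брашно, ако е много гъсто се '
 'добавя още малко брашно. Фурната се загрява предварително на 180С градуса. '
 'Когато тестото е готово, се вади от фурната и се разделя на три части.')
```

Rough English translation of the generated recipe: "Beat the eggs in a deep bowl. Add the yogurt, into which the baking soda has been stirred beforehand, and beat. Add the flour and knead a dough. If it is too thick, add a little more flour, if it is too thick, add a little more flour. Preheat the oven to 180 °C. When the dough is ready, take it out of the oven and divide it into three parts."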
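
The extraction step at the end of the usage example can be tried without loading the model. The helper below is a minimal sketch: the name `extract_recipe` is hypothetical, and it assumes that decoded text follows the `[ING]...[REC]...[SEP]` layout used above. It also tolerates output that was cut off by `max_length` before a `[SEP]` token was produced, and it renders `[EOL]` as a newline, which is an assumption about how recipe text should be displayed.

```python
import re


def extract_recipe(decoded: str) -> str:
    """Return the text after [REC], up to the first [SEP] or the end of the string."""
    match = re.search(r"\[REC\](.*?)(?:\[SEP\]|\Z)", decoded, flags=re.DOTALL)
    if match is None:
        raise ValueError("decoded text contains no [REC] token")
    # [EOL] is described as a newline equivalent, so show it as one (assumption).
    return match.group(1).replace("[EOL]", "\n").strip()


# Hand-written strings standing in for decoded model output:
print(extract_recipe("[ING]a[EOL]b[REC]Mix well.[SEP]ignored"))
print(extract_recipe("[ING]a[REC]Cut off before the separator"))
```

Raising `ValueError` when `[REC]` is missing makes a malformed prompt fail loudly instead of returning the whole decoded string.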

## Additional tokens
- [ING] - ingredients token; denotes the beginning of the tokens representing the ingredients
- [EOL] - end-of-line token; equivalent to a newline
- [REC] - recipe token; denotes the beginning of the recipe
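
To make the token layout concrete, here is a small sketch that builds a prompt from a list of ingredients and splits a prompt back into that list. The function names `build_prompt` and `parse_prompt` are hypothetical; only the three tokens and their order come from this model card.

```python
def build_prompt(ingredients: list[str]) -> str:
    """Join ingredients with [EOL] and wrap them between [ING] and [REC]."""
    return "[ING]" + "[EOL]".join(ingredients) + "[REC]"


def parse_prompt(prompt: str) -> list[str]:
    """Recover the ingredient list from a prompt produced by build_prompt."""
    if not (prompt.startswith("[ING]") and prompt.endswith("[REC]")):
        raise ValueError("prompt must start with [ING] and end with [REC]")
    body = prompt[len("[ING]"):-len("[REC]")]
    return body.split("[EOL]") if body else []


# Same ingredients as the second widget example in the model card metadata.
prompt = build_prompt(["4 бр. яйца", "1 кофичка кисело мляко", "1/4 ч.л. сода"])
print(prompt)
print(parse_prompt(prompt))
```

The model then continues the text after `[REC]` with the recipe itself.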