anquachdev committed · Commit c92bfda · verified · 1 Parent(s): d36d389

Uploading FoodExtract demo app.py

Files changed (3)
  1. README.md +44 -6
  2. app.py +89 -0
  3. requirements.txt +4 -0
README.md CHANGED
@@ -1,12 +1,50 @@
  ---
- title: FoodExtract V1
- emoji:
- colorFrom: blue
- colorTo: green
+ title: FoodExtract Fine-tuned LLM Structured Data Extractor v1
+ emoji: 📝➡️🍟
+ colorFrom: green
+ colorTo: blue
  sdk: gradio
- sdk_version: 6.10.0
  app_file: app.py
  pinned: false
+ license: apache-2.0
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ """
+ Fine-tuned Gemma 3 270M to extract food and drink items from raw text.
+
+ Input can be any form of real text (mostly focused on shorter image caption-like texts):
+
+ ```
+ A truly eclectic and mouth-watering feast is laid out on the table, featuring savory favorites like crispy fried chicken,
+ a perfectly seared steak, and loaded tacos, complete with a side of creamy mayonnaise. To balance the heavier mains,
+ a vibrant assortment of fresh fruit sits nearby, including a crisp red apple, a tropical pineapple, and a scattering of
+ sweet cherries. Thirst-quenching options complete this extravagant spread, with a classic iced latte, an earthy matcha latte,
+ and a simple, refreshing glass of milk ready to be enjoyed.
+ ```
+
+ The output will be a formatted string such as the following:
+
+ ```
+ food_or_drink: 1
+ tags: fi, re
+ foods: tacos,red apple, pineapple, cherries, fried chicken, steak, mayonnaise
+ drinks: iced latte, matcha latte, milk
+ ```
+
+ The tags map to the following items:
+
+ ```
+ tags_dict = {'np': 'nutrition_panel',
+              'il': 'ingredient list',
+              'me': 'menu',
+              're': 'recipe',
+              'fi': 'food_items',
+              'di': 'drink_items',
+              'fa': 'food_advertistment',
+              'fp': 'food_packaging'}
+ ```
+
+ * See a step-by-step code walkthrough at: https://www.learnhuggingface.com/notebooks/hugging_face_llm_full_fine_tune_tutorial
+ * See the fine-tuning dataset: https://huggingface.co/datasets/mrdbourke/FoodExtract-1k
+ * See the fine-tuned model: https://huggingface.co/mrdbourke/FoodExtract-gemma-3-270m-fine-tune-v1
+ """
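Review note: the README describes the model output as a formatted string of `key: value` lines plus a tag abbreviation table. A minimal sketch of how a consumer might turn that string into a dictionary is below. It assumes the model always emits one field per line with comma-separated values, which the README example suggests but does not guarantee. `parse_food_extract` and `TAGS` are hypothetical names and are not part of this commit.

```python
# Hypothetical helper (not part of this commit): parse the model's formatted
# string output into a dictionary. Assumes one "key: value" pair per line.
TAGS = {"np": "nutrition_panel", "il": "ingredient list", "me": "menu",
        "re": "recipe", "fi": "food_items", "di": "drink_items",
        "fa": "food_advertistment", "fp": "food_packaging"}  # values copied from the README


def parse_food_extract(output: str) -> dict:
    parsed = {}
    for line in output.strip().splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that do not look like "key: value"
        # Split on commas and trim whitespace, dropping empty entries.
        parsed[key.strip()] = [item.strip() for item in value.split(",") if item.strip()]
    # Convert the classification flag to a bool and expand tag abbreviations.
    flag = parsed.get("food_or_drink", [])
    parsed["food_or_drink"] = bool(flag) and flag[0] == "1"
    parsed["tags"] = [TAGS.get(tag, tag) for tag in parsed.get("tags", [])]
    return parsed


# Example string taken from the README in this diff.
example = """food_or_drink: 1
tags: fi, re
foods: tacos,red apple, pineapple, cherries, fried chicken, steak, mayonnaise
drinks: iced latte, matcha latte, milk"""

result = parse_food_extract(example)
print(result["tags"])
print(result["drinks"])
```

Unknown tag abbreviations are passed through unchanged rather than raising, since a small generative model may occasionally emit a tag outside the table.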
app.py ADDED
@@ -0,0 +1,89 @@
+
+ # Load dependencies
+ import time
+ import transformers
+ import torch
+ import spaces # Optional: run our model on the GPU (this will be much faster inference)
+
+ import gradio as gr
+
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from transformers import pipeline
+
+ @spaces.GPU # Optional: run our model on the GPU (this will be much faster inference)
+ def pred_on_text(input_text):
+     start_time = time.time()
+
+     raw_output = loaded_model_pipeline(text_inputs=[{"role": "user",
+                                                      "content": input_text}],
+                                        max_new_tokens=256,
+                                        disable_compile=True)
+     end_time = time.time()
+     total_time = round(end_time - start_time, 4)
+
+     generated_text = raw_output[0]["generated_text"][1]["content"]
+
+     return generated_text, raw_output, total_time
+
+ # Load the model (from our Hugging Face Repo)
+ # Note: You may have to replace my username `mrdbourke` for your own
+ MODEL_PATH = "mrdbourke/FoodExtract-gemma-3-270m-fine-tune-v1"
+
+ # Load the model into a pipeline
+ loaded_model = AutoModelForCausalLM.from_pretrained(
+     pretrained_model_name_or_path=MODEL_PATH,
+     dtype="auto",
+     device_map="auto",
+     attn_implementation="eager"
+ )
+
+ # Load the tokenizer
+ tokenizer = AutoTokenizer.from_pretrained(
+     pretrained_model_name_or_path=MODEL_PATH,
+ )
+
+ # Create model pipeline
+ loaded_model_pipeline = pipeline("text-generation",
+                                  model=loaded_model,
+                                  tokenizer=tokenizer)
+
+ # Create the demo
+ description = """Extract food and drink items from text with a fine-tuned SLM (Small Language Model) or more specifically a fine-tuned [Gemma 3 270M](https://huggingface.co/google/gemma-3-270m-it).
+
+ Our model has been fine-tuned on the [FoodExtract-1k dataset](https://huggingface.co/datasets/mrdbourke/FoodExtract-1k).
+
+ * Input (str): Raw text strings or image captions (e.g. "A photo of a dog sitting on a beach" or "A breakfast plate with bacon, eggs and toast")
+ * Output (str): Generated text with food/not_food classification as well as noun extracted food and drink items and various food tags.
+
+ For example:
+
+ * Input: "For breakfast I had eggs, bacon and toast and a glass of orange juice"
+ * Output:
+
+ ```
+ food_or_drink: 1
+ tags: fi, di
+ foods: eggs, bacon, toast
+ drinks: orange juice
+ ```
+
+ See full fine-tuning code at [learnhuggingface.com](https://www.learnhuggingface.com/notebooks/hugging_face_llm_full_fine_tune_tutorial).
+ """
+
+ # Create the Gradio text in and out interface
+ demo = gr.Interface(fn=pred_on_text,
+                     inputs=gr.TextArea(lines=4, label="Input Text"),
+                     outputs=[gr.TextArea(lines=4, label="Generated Text"),
+                              gr.TextArea(lines=7, label="Raw Output"),
+                              gr.Number(label="Generation Time (s)")],
+                     title="🍳 Structured FoodExtract with a Fine-Tuned Gemma 3 270M",
+                     description=description,
+                     examples=[["Hello world! This is my first fine-tuned LLM!"],
+                               ["A plate of food with grilled barramundi, salad with avocado, olives, tomatoes and Italian dressing"],
+                               ["British Breakfast with baked beans, fried eggs, black pudding, sausages, bacon, mushrooms, a cup of tea and toast and fried tomatoes"],
+                               ["Steak tacos"],
+                               ["A photo of a dog sitting on a beach"]]
+                     )
+
+ if __name__ == "__main__":
+     demo.launch(share=False)
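Review note: `pred_on_text` reads the reply with `raw_output[0]["generated_text"][1]["content"]`, which relies on the conversation holding exactly one user message followed by the model's reply. A slightly more defensive lookup could look like the sketch below. It assumes the chat-style pipeline output is a list of dicts whose `generated_text` is a list of role/content messages, which is the shape the indexing above implies. `extract_reply` is a hypothetical helper, and `mock_raw_output` is a hand-written stand-in, not real model output.

```python
def extract_reply(raw_output: list) -> str:
    """Return the content of the last assistant message, or "" if there is none."""
    messages = raw_output[0]["generated_text"]
    # Walk backwards so the most recent assistant turn wins.
    for message in reversed(messages):
        if message.get("role") == "assistant":
            return message.get("content", "")
    return ""


# Hand-written mock of the assumed output shape (not produced by the model).
mock_raw_output = [{"generated_text": [
    {"role": "user", "content": "Steak tacos"},
    {"role": "assistant", "content": "food_or_drink: 1\ntags: fi\nfoods: steak tacos\ndrinks:"},
]}]

print(extract_reply(mock_raw_output))
```

Returning an empty string when no assistant message is present keeps the Gradio text output valid instead of raising an `IndexError` inside the request handler.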
requirements.txt ADDED
@@ -0,0 +1,4 @@
+ transformers
+ gradio
+ torch
+ accelerate