ssevan committed
Commit 6b1764d · verified · 1 Parent(s): a7c1704

Upload folder using huggingface_hub

Files changed (5)
  1. README.md +53 -0
  2. class_names.json +1 -0
  3. config.json +7 -0
  4. model.safetensors +3 -0
  5. preprocessor_config.json +23 -0
README.md ADDED
@@ -0,0 +1,53 @@
+ ---
+ license: mit
+ tags:
+ - vision
+ - food-recognition
+ - ingredients
+ - utensils
+ - portion-size
+ - computer-vision
+ - mobile
+ - ug-food-dataset
+ ---
+
+ # UG Food Detection Model
+
+ This model identifies food ingredients and utensils and estimates portion sizes from images.
+
+ ## Model Description
+
+ This Vision Transformer (ViT) model is trained on the UG Food Dataset to recognize:
+ - Food ingredients: Various food items and ingredients
+ - Kitchen utensils: Cooking tools and equipment
+ - Portion sizes: Measurement estimates
+
+ ## Classes
+ The model can identify 40 classes.
+
+ ## Usage
+
+ ```python
+ from transformers import ViTImageProcessor, ViTForImageClassification
+ from PIL import Image
+ import torch
+
+ # Load model and processor
+ processor = ViTImageProcessor.from_pretrained("ssevan/ug-food-detector")
+ model = ViTForImageClassification.from_pretrained("ssevan/ug-food-detector")
+
+ # Process image (converted to RGB here, since do_convert_rgb is not set in the processor config)
+ image = Image.open('food_image.jpg').convert('RGB')
+ inputs = processor(image, return_tensors='pt')
+
+ # Get predictions
+ with torch.no_grad():
+     outputs = model(**inputs)
+     probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
+     predicted_class_idx = torch.argmax(probabilities, dim=-1).item()
+
+ print(f'Predicted class index: {predicted_class_idx}')
+ ```
+
+ ## Mobile Usage
+ This model is optimized for mobile deployment.
class_names.json ADDED
@@ -0,0 +1 @@
+ ["ugandan_rolex", "milk", "chicken_stew", "fish_stew", "nakati", "utensils", "peas_soup", "millet", "pumpkin", "_roasted_groundnuts", "beans_soup", "pilau", "sweet_potatoes", "Ground_nut_sauce", "banana_leaves", "chapati_street_food", "nsenene", "boiled_cassava", "irish_potatoes", "cassava", "chai", "yams", "ground_nuts", "beans", "tomatoes", "eggs", "maize", "posho", "beef_stew", "samosa", "matooke", "pumpkin_soup", "katogo", "onions", "matooke_meal", "ugandan_local_food", "pork_stew", "garlic", "peas", "rice"]
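The README's usage snippet prints only a class index, while the names live in this file. A minimal sketch of turning an index into a label follows. It assumes the order of entries in `class_names.json` matches the model's output indices, which the commit does not state, and `index_to_label` is a hypothetical helper name, not part of the repository.

```python
import json


def index_to_label(class_names, predicted_class_idx):
    """Return the class name for an index (assumes list order == logit order)."""
    if not 0 <= predicted_class_idx < len(class_names):
        raise IndexError(f"class index {predicted_class_idx} out of range")
    return class_names[predicted_class_idx]


# Abbreviated copy of the first entries of class_names.json, for illustration only.
# With the real file on disk one would instead use:
#   with open("class_names.json") as f:
#       class_names = json.load(f)
sample_names = json.loads('["ugandan_rolex", "milk", "chicken_stew", "fish_stew"]')

print(index_to_label(sample_names, 2))
```

Raising on an out-of-range index, rather than returning a placeholder, makes a mismatch between the number of logits and the number of names visible early.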
config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "model_type": "vit",
+   "num_classes": 40,
+   "image_size": 224,
+   "mobile_optimized": true,
+   "task": "food-ingredient-utensil-detection"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:22cc6f6f02e8b69e90badb7b24627fdafed5a3b593a6b7b462bcbbf675c5faab
+ size 343340872
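The three added lines are a Git LFS pointer, not the weights themselves: they record the spec version, the SHA-256 of the real file, and its size in bytes. As a rough sketch, the pointer can be parsed and the size put into more familiar units. The parameter estimate assumes every weight is a 4-byte float32, which the commit does not state, and `parse_lfs_pointer` is a hypothetical helper name.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:22cc6f6f02e8b69e90badb7b24627fdafed5a3b593a6b7b462bcbbf675c5faab
size 343340872
"""

fields = parse_lfs_pointer(pointer)
size_bytes = int(fields["size"])
size_mib = size_bytes / 2**20

# Upper bound on the parameter count IF all weights are float32 (assumption);
# the file also contains a safetensors header, so the true count is lower.
max_float32_params = size_bytes // 4

print(f"{size_mib:.1f} MiB, at most {max_float32_params:,} float32 parameters")
```

Under that assumption the file holds on the order of 86 million parameters, which is useful context when weighing the README's mobile-deployment claim.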
preprocessor_config.json ADDED
@@ -0,0 +1,23 @@
+ {
+   "do_convert_rgb": null,
+   "do_normalize": true,
+   "do_rescale": true,
+   "do_resize": true,
+   "image_mean": [
+     0.5,
+     0.5,
+     0.5
+   ],
+   "image_processor_type": "ViTImageProcessor",
+   "image_std": [
+     0.5,
+     0.5,
+     0.5
+   ],
+   "resample": 2,
+   "rescale_factor": 0.00392156862745098,
+   "size": {
+     "height": 224,
+     "width": 224
+   }
+ }
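Read together, these settings resize each image to 224×224 (`resample: 2` is Pillow's bilinear filter), multiply 8-bit pixel values by 1/255, then subtract a mean of 0.5 and divide by a standard deviation of 0.5 per channel. The per-pixel arithmetic can be sketched in plain Python as below. This illustrates the numbers only, it is not the `ViTImageProcessor` implementation, and `normalize_pixel` is a hypothetical helper name.

```python
RESCALE_FACTOR = 0.00392156862745098  # 1 / 255, from preprocessor_config.json
IMAGE_MEAN = 0.5  # same value for all three channels
IMAGE_STD = 0.5   # same value for all three channels


def normalize_pixel(value):
    """Map one 8-bit channel value (0..255) to the normalized input range."""
    rescaled = value * RESCALE_FACTOR          # do_rescale: 0..255 -> 0..1
    return (rescaled - IMAGE_MEAN) / IMAGE_STD  # do_normalize: 0..1 -> -1..1


for value in (0, 128, 255):
    print(value, round(normalize_pixel(value), 4))
```

With a mean and standard deviation of 0.5, black maps to about -1 and white to about 1, so inputs prepared by hand (for example in a mobile pipeline) should land in that range.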