---
license: mit
language:
- en
metrics:
- accuracy
base_model:
- google/efficientnet-b0
pipeline_tag: image-classification
---

# Model Card for Food Vision Model

This model classifies images of food by dish type. It was developed as part of a Food Vision project, using transfer learning on a pre-trained convolutional neural network.

---

# Model Details

### Model Description

This deep learning model classifies food images into one of the 101 categories of the Food101 dataset. It was trained with TensorFlow using transfer learning: it is fine-tuned from EfficientNetB0 and reuses features learned during pre-training on a large dataset such as ImageNet. Training used mixed precision, which can speed up training and reduce memory usage.

* **Developed by:** `Recompense`
* **Model type:** Image classification (transfer learning with a CNN backbone)
* **Language(s) (NLP):** N/A (image classification)
* **License:** MIT
* **Finetuned from model:** EfficientNetB0

# Uses

This model is intended for classifying images of food into 101 distinct categories. Potential use cases include:

* Food recognition in mobile applications.
* Organizing food images in databases.
* Assisting in dietary tracking or recipe suggestions based on images.

---

# Limitations

* **Dataset Bias:** The model is trained on the Food101 dataset. Its performance may degrade on food images that are significantly different in style, presentation, or origin from those in the training data.
* **Image Quality:** Performance can be affected by image quality, lighting conditions, occlusions, and variations in food presentation.
* **Specificity:** While it classifies into 101 categories, it may not distinguish between very similar dishes or variations within a category.

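
One possible way to soften these limitations in an application is to decline to answer when the model is unsure. The sketch below is not part of the original project: the function name `classify_with_threshold` and the `min_confidence` value of 0.5 are hypothetical, and a high softmax score is no guarantee that an unusual image was classified correctly. With the real model, `probabilities` would be one row of the model's output.

```python
def classify_with_threshold(probabilities, class_names, min_confidence=0.5):
    """Return the most probable class name, or None if the model is not confident enough.

    `min_confidence` is an illustrative cut-off and should be tuned on held-out data.
    """
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    if probabilities[best_index] < min_confidence:
        return None  # caller can ask the user for another photo or a manual label
    return class_names[best_index]


# Hypothetical three-class example
toy_classes = ["steak", "filet_mignon", "prime_rib"]
print(classify_with_threshold([0.80, 0.15, 0.05], toy_classes))  # confident prediction
print(classify_with_threshold([0.40, 0.35, 0.25], toy_classes))  # too close to call
```
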
---

# Evaluation

The model was evaluated with standard classification metrics on the validation split of the Food101 dataset.

#### Testing Data

The evaluation data is the Food101 validation split, which holds the dataset's 250 test images per class.

* **Food101 Dataset:** 101 food categories and 101,000 images in total, with 750 training images and 250 test images per class.
* **Source:** [TensorFlow Datasets](https://www.tensorflow.org/datasets/catalog/food101)

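
The split sizes follow directly from the per-class counts. The short check below works through that arithmetic and also computes the accuracy a uniform random guess would reach on this balanced dataset, which gives a floor for reading the results. The variable names are illustrative only.

```python
# Per-class counts stated for Food101.
NUM_CLASSES = 101
TRAIN_PER_CLASS = 750
TEST_PER_CLASS = 250

train_images = NUM_CLASSES * TRAIN_PER_CLASS
test_images = NUM_CLASSES * TEST_PER_CLASS
total_images = train_images + test_images

# With equally sized classes, a uniform random guess is right 1 time in 101.
chance_accuracy = 1 / NUM_CLASSES

print(f"training images:   {train_images:,}")
print(f"validation images: {test_images:,}")
print(f"total images:      {total_images:,}")
print(f"chance accuracy:   {chance_accuracy:.2%}")
```

This gives 75,750 training images and 25,250 validation images, and a chance level just under 1%.
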
#### Factors

Evaluation was performed on the validation set as a whole. Disaggregating performance by food category would show which classes the model handles better or worse.

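
A minimal, dependency-free sketch of that per-class breakdown is shown here. The helper name `per_class_accuracy` and the six example labels are hypothetical. In practice the inputs would be the validation labels and the model's predicted labels.

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels):
    """Return {class_label: fraction of that class's images predicted correctly}."""
    totals = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: correct[label] / totals[label] for label in totals}


# Hypothetical labels for six images
y_true = ["pizza", "pizza", "steak", "steak", "steak", "sushi"]
y_pred = ["pizza", "steak", "steak", "steak", "filet_mignon", "sushi"]

scores = per_class_accuracy(y_true, y_pred)

# List the weakest classes first
for label, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{label}: {score:.2f}")
```

Sorting from lowest to highest score puts the classes that need the most attention at the top of the list.
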
#### Metrics

The primary evaluation metric is accuracy. A confusion matrix was also generated to visualize per-class performance.

* **Accuracy:** The proportion of correctly classified images out of the total number of images evaluated.

$$
\text{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of predictions}}
$$

* **Confusion Matrix:** A table that visualizes the performance of a classification model. Each row represents the instances of an actual class, and each column represents the instances of a predicted class.

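
Both metrics can be sketched in a few lines of plain Python. The function names and the toy labels below are illustrative and are not taken from the project notebook. scikit-learn's `accuracy_score` and `confusion_matrix` compute the same quantities.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Rows index the actual class, columns index the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


# Hypothetical integer labels for a 3-class problem
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

print(f"accuracy: {accuracy(y_true, y_pred):.3f}")
for row in confusion_matrix(y_true, y_pred, num_classes=3):
    print(row)
```

The diagonal of the matrix counts the correct predictions, so its sum divided by the number of images equals the accuracy.
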
### Results

Validation accuracy fluctuated between 70% and 80%.

#### Summary

Transfer learning improved accuracy, but the model struggled to tell closely related foods apart. Food101 is a large dataset, yet more data is probably needed to distinguish dishes that look alike.

---

# Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

* **Hardware Type:** Tesla T4
* **Hours used:** about 1 hour (upper-bound estimate)
* **Cloud Provider:** Google Cloud
* **Compute Region:** us-central
* **Carbon Emitted:** 80 grams of CO2eq (estimated)

---

# Technical Specifications

### Model Architecture and Objective

The model is a convolutional neural network (CNN) classifier fine-tuned from EfficientNetB0. The training notebook uses mixed precision, in which most layers compute in `float16`. The objective is to minimize a classification loss (e.g., categorical cross-entropy) so that the model predicts the correct food category for an image.

### Compute Infrastructure

The model was trained on a Tesla T4 GPU on Google Cloud in the us-central region. Estimated carbon emissions for 1 hour of training on this setup are 80 grams of CO2eq. The environment was intended to support mixed precision training.

### Software

* TensorFlow
* TensorFlow Datasets
* NumPy
* Matplotlib
* Scikit-learn
* Helper functions from `helper_functions.py` (likely for plotting and data handling)

---

# Usage

The example below runs inference on a new image. It assumes the model can be loaded from the Hugging Face Hub with `keras.saving.load_model`.

First, make sure TensorFlow is installed, along with `huggingface_hub`, which Keras needs to resolve `hf://` paths:

```bash
pip install tensorflow huggingface_hub
```

Then load the model and make a prediction:

```python
import os

# The Keras backend must be selected before keras is imported.
# Available backend options are: "jax", "torch", "tensorflow".
# "tensorflow" is used here because the preprocessing below relies on TensorFlow ops.
os.environ["KERAS_BACKEND"] = "tensorflow"

import keras
import numpy as np
import tensorflow as tf

loaded_model = keras.saving.load_model("hf://Recompense/FoodVision")

# Food101 label names in the alphabetical order used by TensorFlow Datasets.
# Verify this list against the label order used during training.
class_names = [
    'apple_pie', 'baby_back_ribs', 'baklava', 'beef_carpaccio', 'beef_tartare',
    'beet_salad', 'beignets', 'bibimbap', 'bread_pudding', 'breakfast_burrito',
    'bruschetta', 'caesar_salad', 'cannoli', 'caprese_salad', 'carrot_cake',
    'ceviche', 'cheesecake', 'cheese_plate', 'chicken_curry', 'chicken_quesadilla',
    'chicken_wings', 'chocolate_cake', 'chocolate_mousse', 'churros', 'clam_chowder',
    'club_sandwich', 'crab_cakes', 'creme_brulee', 'croque_madame', 'cup_cakes',
    'deviled_eggs', 'donuts', 'dumplings', 'edamame', 'eggs_benedict',
    'escargots', 'falafel', 'filet_mignon', 'fish_and_chips', 'foie_gras',
    'french_fries', 'french_onion_soup', 'french_toast', 'fried_calamari', 'fried_rice',
    'frozen_yogurt', 'garlic_bread', 'gnocchi', 'greek_salad', 'grilled_cheese_sandwich',
    'grilled_salmon', 'guacamole', 'gyoza', 'hamburger', 'hot_and_sour_soup',
    'hot_dog', 'huevos_rancheros', 'hummus', 'ice_cream', 'lasagna',
    'lobster_bisque', 'lobster_roll_sandwich', 'macaroni_and_cheese', 'macarons', 'miso_soup',
    'mussels', 'nachos', 'omelette', 'onion_rings', 'oysters',
    'pad_thai', 'paella', 'pancakes', 'panna_cotta', 'peking_duck',
    'pho', 'pizza', 'pork_chop', 'poutine', 'prime_rib',
    'pulled_pork_sandwich', 'ramen', 'ravioli', 'red_velvet_cake', 'risotto',
    'samosa', 'sashimi', 'scallops', 'seaweed_salad', 'shrimp_and_grits',
    'spaghetti_bolognese', 'spaghetti_carbonara', 'spring_rolls', 'steak', 'strawberry_shortcake',
    'sushi', 'tacos', 'takoyaki', 'tiramisu', 'tuna_tartare',
    'waffles',
]


def load_prep_image(filepath, img_shape=224, scale=True):
    """
    Reads in an image and preprocesses it for model prediction

    Args:
        filepath (str): path to target image
        img_shape (int): shape to resize image to. Default = 224
        scale (bool): Condition to scale image. Default = True

    Returns:
        Image Tensor of shape (img_shape, img_shape, 3)
    """
    image = tf.io.read_file(filepath)
    # expand_animations=False keeps the result 3-D even if the file is a GIF
    image_tensor = tf.io.decode_image(image, channels=3, expand_animations=False)
    image_tensor = tf.image.resize(image_tensor, [img_shape, img_shape])
    if scale:
        # Scale image tensor to be between 0 and 1
        return image_tensor / 255.
    return image_tensor


# Load and preprocess a sample image
# Replace 'path/to/your/image.jpg' with the actual path to your image
sample_image_path = 'path/to/your/image.jpg'
# Keras EfficientNet models rescale pixels internally; if training fed raw
# [0, 255] images to the model, pass scale=False here instead.
prepared_image = load_prep_image(sample_image_path)

# Add a batch dimension to the image
prepared_image = tf.expand_dims(prepared_image, axis=0)

# Make a prediction; the result has shape (1, number of classes)
predictions = loaded_model.predict(prepared_image)

# Get the predicted class index for the single image in the batch
predicted_class_index = int(np.argmax(predictions[0]))

# Get the predicted class name
predicted_class_name = class_names[predicted_class_index]

# Print the prediction
print(f"The predicted food item is: {predicted_class_name}")

# Optional: Display the image (requires matplotlib)
# import matplotlib.pyplot as plt
# img = plt.imread(sample_image_path)
# plt.imshow(img)
# plt.title(f"Prediction: {predicted_class_name}")
# plt.axis('off')
# plt.show()
```
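
Because the model can confuse visually similar dishes, it may help to look at the few most probable classes instead of only the top one. The helper below is a hypothetical addition, not part of the original notebook, and runs on toy values so it can be tried without the model.

```python
def top_k_predictions(probabilities, class_names, k=5):
    """Return the k most probable (class_name, probability) pairs, best first."""
    ranked = sorted(zip(class_names, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical probabilities for a 4-class toy problem
toy_classes = ["pizza", "steak", "sushi", "tacos"]
toy_probs = [0.10, 0.55, 0.05, 0.30]

for name, prob in top_k_predictions(toy_probs, toy_classes, k=3):
    print(f"{name}: {prob:.2f}")
```

With the loaded model, the same call would be `top_k_predictions(predictions[0], class_names)`.
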