---
license: mit
language:
- en
metrics:
- accuracy
base_model:
- google/efficientnet-b0
pipeline_tag: image-classification
---

# Model Card for Food Vision Model

This image classification model identifies the type of food shown in an image. It was developed as part of a Food Vision project using transfer learning from a pre-trained EfficientNetB0 convolutional neural network.

## Model Details

### Model Description

This deep learning model classifies food images into one of the 101 categories of the Food101 dataset. It was trained with TensorFlow using transfer learning: an EfficientNetB0 backbone, pre-trained on a large dataset such as ImageNet, supplies the learned image features. Training used mixed precision, which can speed up training and reduce memory usage.

* **Developed by:** Not specified. The source notebook suggests a personal project or tutorial; replace this entry with the developer's name or organization.

* **Model type:** Image Classification (transfer learning with an EfficientNetB0 CNN backbone)

* **Language(s) (NLP):** N/A (Image Classification)

* **License:** MIT

* **Finetuned from model:** EfficientNetB0

### Uses

This model is intended for classifying images of food into 101 distinct categories. Potential use cases include:

* Food recognition in mobile applications.

* Organizing food images in databases.

* Assisting in dietary tracking or recipe suggestions based on images.

## Limitations

* **Dataset Bias:** The model is trained on the Food101 dataset. Its performance may degrade on food images that are significantly different in style, presentation, or origin from those in the training data.

* **Image Quality:** Performance can be affected by image quality, lighting conditions, occlusions, and variations in food presentation.

* **Specificity:** While it classifies into 101 categories, it may not distinguish between very similar dishes or variations within a category.
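
One way to work around the specificity limitation is to inspect the top few predictions instead of only the single most likely class, since similar dishes tend to rank close together. The following is a minimal sketch, assuming `probabilities` is the model's softmax output for one image (for example `predictions[0]` from the Usage section). The `top_k_predictions` helper and the three-class toy values are hypothetical and not part of the original project.

```python
def top_k_predictions(probabilities, class_names, k=5):
    """Return the k most probable (class_name, probability) pairs, best first."""
    if len(probabilities) != len(class_names):
        raise ValueError("probabilities and class_names must have the same length")
    # Pair each class with its probability and sort by probability, highest first
    ranked = sorted(
        zip(class_names, probabilities), key=lambda pair: pair[1], reverse=True
    )
    return ranked[:k]


# Hypothetical probabilities for a three-class toy example
example_probs = [0.10, 0.55, 0.35]
example_names = ["pho", "ramen", "miso_soup"]
print(top_k_predictions(example_probs, example_names, k=2))
```

A small gap between the first and second probabilities can be treated as a signal that the prediction is uncertain.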

## Evaluation

The model's performance was evaluated using standard classification metrics on a validation set from the Food101 dataset.

#### Testing Data

The model was evaluated on the validation split of the Food101 dataset.

* **Food101 Dataset:** A dataset of 101 food categories with 101,000 images in total: 750 training images and 250 test images per class.

* **Source:** [TensorFlow Datasets](https://www.tensorflow.org/datasets/catalog/food101)
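
As a quick sanity check, the per-class counts above can be multiplied out to confirm the total dataset size. This sketch uses only the figures stated in this card; the constant names are illustrative.

```python
NUM_CLASSES = 101
TRAIN_PER_CLASS = 750
TEST_PER_CLASS = 250

# Images per split across all classes
train_images = NUM_CLASSES * TRAIN_PER_CLASS
test_images = NUM_CLASSES * TEST_PER_CLASS
total_images = train_images + test_images

print(f"train={train_images:,} test={test_images:,} total={total_images:,}")
```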

#### Factors

Evaluation was performed on the overall validation dataset. Further analysis could involve disaggregating performance by individual food categories to identify classes where the model performs better or worse.
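
The disaggregation suggested above could look roughly like the following. This is a minimal sketch, not the evaluation code used for this model: the `per_class_accuracy` helper and the toy labels are hypothetical.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred):
    """Return a dict mapping each true class to the fraction predicted correctly."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for true_label, pred_label in zip(y_true, y_pred):
        total[true_label] += 1
        if true_label == pred_label:
            correct[true_label] += 1
    return {label: correct[label] / total[label] for label in total}


# Hypothetical labels: "steak" and "sushi" are each confused with a similar dish
y_true = ["steak", "steak", "sushi", "sushi", "sushi", "pizza"]
y_pred = ["steak", "filet_mignon", "sushi", "sushi", "sashimi", "pizza"]

# Print the weakest classes first
results = per_class_accuracy(y_true, y_pred)
for label, acc in sorted(results.items(), key=lambda item: item[1]):
    print(f"{label}: {acc:.2f}")
```

Sorting from lowest to highest accuracy puts the classes that most need additional data or augmentation at the top.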

#### Metrics

The primary evaluation metric used is Accuracy. A confusion matrix was also generated to visualize per-class performance.

* **Accuracy:** The proportion of correctly classified images out of the total number of images evaluated.

  $$
  \text{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of predictions}}
  $$


* **Confusion Matrix:** A table that visualizes the performance of a classification model. Each row represents the instances in an actual class, while each column represents the instances in a predicted class.
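
The two metrics are closely related: the diagonal of the confusion matrix counts the correct predictions, so accuracy is the diagonal sum divided by the total count. The sketch below illustrates this with integer class indices. The function names and the toy labels are hypothetical; the project itself lists Scikit-learn, which provides its own implementations.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Build a matrix where rows are actual classes and columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


def accuracy_from_matrix(matrix):
    """Accuracy = correct predictions (the diagonal) / all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical labels for a three-class toy problem
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
for row in cm:
    print(row)
print(f"accuracy = {accuracy_from_matrix(cm):.3f}")
```

Off-diagonal cells show which pairs of classes the model confuses with each other.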

### Results

Validation accuracy fluctuated between 70% and 80%.

#### Summary

Transfer learning helped the model reach higher accuracy, but the model struggled with closely related foods. Although the dataset is large, more data is likely needed to distinguish between similar-looking dishes.

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

* **Hardware Type:** Tesla T4

* **Hours used:** 1 hour (estimated maximum)

* **Cloud Provider:** Google Cloud

* **Compute Region:** us-central

* **Carbon Emitted:** 80 grams of CO2eq (estimated)
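
Estimates of this kind generally multiply the energy consumed by the carbon intensity of the electricity grid. The sketch below shows that arithmetic only and does not attempt to reproduce the figure reported above. The `estimate_emissions_g` helper is hypothetical, the 70 W value is the Tesla T4's rated board power, and the carbon intensity of 500 g CO2eq/kWh is a placeholder assumption, not a measured value for this region.

```python
def estimate_emissions_g(power_watts, hours, carbon_intensity_g_per_kwh, pue=1.0):
    """Estimate emissions in grams of CO2eq.

    power_watts: average power draw of the hardware
    hours: training time
    carbon_intensity_g_per_kwh: grid carbon intensity (region dependent)
    pue: data center power usage effectiveness (1.0 = no overhead)
    """
    energy_kwh = (power_watts / 1000.0) * hours * pue
    return energy_kwh * carbon_intensity_g_per_kwh


# Placeholder inputs: 70 W for 1 hour at an assumed 500 g CO2eq/kWh
estimate = estimate_emissions_g(
    power_watts=70, hours=1, carbon_intensity_g_per_kwh=500
)
print(f"Estimated emissions: {estimate:.1f} g CO2eq")
```

Actual results depend on the real power draw, the data center overhead, and the carbon intensity the calculator assigns to the chosen region.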

## Technical Specifications

### Model Architecture and Objective

The model is a convolutional neural network (CNN) classifier fine-tuned from EfficientNetB0. It was trained with mixed precision, in which most computations run in `float16`. The objective is to minimize a classification loss (e.g., categorical cross-entropy) so that the model predicts the correct food category for a given image.
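
To make the objective concrete, the sketch below computes softmax probabilities and the cross-entropy loss for a single example in plain Python. It illustrates the loss only and is not the training code: the three logits are hypothetical, and a real run would use TensorFlow's built-in loss over all 101 classes.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)
    # Subtract the maximum for numerical stability before exponentiating
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probabilities, true_index):
    """Negative log-probability assigned to the true class."""
    return -math.log(probabilities[true_index])


# Hypothetical logits for a three-class toy example
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)

loss_if_class_0 = cross_entropy(probs, 0)  # true class has the highest score
loss_if_class_2 = cross_entropy(probs, 2)  # true class has the lowest score
print(f"loss (true class 0): {loss_if_class_0:.3f}")
print(f"loss (true class 2): {loss_if_class_2:.3f}")
```

The loss is small when the model assigns high probability to the true class and grows as that probability shrinks, which is what training drives down.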

### Compute Infrastructure

The model was trained using a Tesla T4 GPU on Google Cloud in the us-central region. The estimated carbon emissions for 1 hour of training time on this setup are 80 grams of CO2eq. The environment was intended to support mixed precision training.

### Software

* TensorFlow

* TensorFlow Datasets

* NumPy

* Matplotlib

* Scikit-learn

* Helper functions from `helper_functions.py` (likely for plotting and data handling)

## Usage

Here's an example of how to use the model for inference on a new image. This assumes the model has been saved in a TensorFlow SavedModel format.

First, make sure you have TensorFlow installed:

```bash
pip install tensorflow
```

Then, you can load the model and make a prediction:
```python
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

# Assume the model is saved in a directory named 'food_vision_model'
loaded_model = tf.keras.models.load_model('food_vision_model')

# Food101 class names, assumed to be in the same order as the labels used in training.
# Verify against your dataset's label names before relying on this mapping.
class_names = [
    'apple_pie', 'baby_back_ribs', 'baklava', 'beef_carpaccio', 'beef_tartare',
    'beet_salad', 'beignets', 'bibimbap', 'bread_pudding', 'breakfast_burrito',
    'bruschetta', 'caesar_salad', 'cannoli', 'caprese_salad', 'carrot_cake',
    'ceviche', 'cheesecake', 'cheese_plate', 'chicken_curry', 'chicken_quesadilla',
    'chicken_wings', 'chocolate_cake', 'chocolate_mousse', 'churros', 'clam_chowder',
    'club_sandwich', 'crab_cakes', 'creme_brulee', 'croque_madame', 'cup_cakes',
    'deviled_eggs', 'donuts', 'dumplings', 'edamame', 'eggs_benedict',
    'escargots', 'falafel', 'filet_mignon', 'fish_and_chips', 'foie_gras',
    'french_fries', 'french_onion_soup', 'french_toast', 'fried_calamari', 'fried_rice',
    'frozen_yogurt', 'garlic_bread', 'gnocchi', 'greek_salad', 'grilled_cheese_sandwich',
    'grilled_salmon', 'guacamole', 'gyoza', 'hamburger', 'hot_and_sour_soup',
    'hot_dog', 'huevos_rancheros', 'hummus', 'ice_cream', 'lasagna',
    'lobster_bisque', 'lobster_roll_sandwich', 'macaroni_and_cheese', 'macarons', 'miso_soup',
    'mussels', 'nachos', 'omelette', 'onion_rings', 'oysters',
    'pad_thai', 'paella', 'pancakes', 'panna_cotta', 'peking_duck',
    'pho', 'pizza', 'pork_chop', 'poutine', 'prime_rib',
    'pulled_pork_sandwich', 'ramen', 'ravioli', 'red_velvet_cake', 'risotto',
    'samosa', 'sashimi', 'scallops', 'seaweed_salad', 'shrimp_and_grits',
    'spaghetti_bolognese', 'spaghetti_carbonara', 'spring_rolls', 'steak', 'strawberry_shortcake',
    'sushi', 'tacos', 'takoyaki', 'tiramisu', 'tuna_tartare',
    'waffles',
]

# Create a function to load and prepare images (from your notebook)
def load_prep_image(filepath, img_shape=224, scale=True):
    """
        Reads in an image and preprocesses it for model prediction

        Args:
            filepath (str): path to target image
            img_shape (int): shape to resize image to. Default = 224
            scale (bool): Condition to scale image. Default = True

        Returns:
            Image Tensor of shape (img_shape, img_shape, 3)
    """
    image = tf.io.read_file(filepath)
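    # expand_animations=False keeps the result 3-D (height, width, channels), even for GIFs
    image_tensor = tf.io.decode_image(image, channels=3, expand_animations=False)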
    image_tensor = tf.image.resize(image_tensor, [img_shape, img_shape])
    if scale:
        # Scale image tensor to be between 0 and 1
        scaled_image_tensor = image_tensor / 255.
        return scaled_image_tensor
    else:
        return image_tensor

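# Load and preprocess a sample image
# Replace 'path/to/your/image.jpg' with the actual path to your image
sample_image_path = 'path/to/your/image.jpg'
# Assumption: the EfficientNetB0 backbone comes from tf.keras.applications, which
# rescales inputs internally and expects pixel values in [0, 255], so scale=False.
# Use scale=True only if your model has no built-in rescaling.
prepared_image = load_prep_image(sample_image_path, scale=False)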

# Add a batch dimension to the image
prepared_image = tf.expand_dims(prepared_image, axis=0)

# Make a prediction
predictions = loaded_model.predict(prepared_image)

# Get the predicted class index
predicted_class_index = np.argmax(predictions)

# Get the predicted class name
predicted_class_name = class_names[predicted_class_index]

# Print the prediction
print(f"The predicted food item is: {predicted_class_name}")

# Optional: Display the image
# img = plt.imread(sample_image_path)
# plt.imshow(img)
# plt.title(f"Prediction: {predicted_class_name}")
# plt.axis('off')
# plt.show()
```