	---
	license: mit
	language:
	- en
	metrics:
	- accuracy
	base_model:
	- google/efficientnet-b0
	pipeline_tag: image-classification
	---


	# Model Card for Food Vision Model

	This image classification model identifies different types of food from images. It was developed as part of a Food Vision project using transfer learning on a pre-trained convolutional neural network.

	---

	# Model Details

	### Model Description

	This model is a deep learning model for classifying food images into one of 101 categories from the Food101 dataset. It was trained with TensorFlow using transfer learning: it fine-tunes EfficientNetB0, reusing the features the backbone learned during pre-training on a large dataset such as ImageNet. Training used mixed precision, which can speed up training and reduce memory usage.
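To make the mixed precision trade-off concrete, here is a minimal sketch that uses only the Python standard library, not the training code itself. It compares the storage size of `float16` and `float32` values and shows the rounding that `float16` introduces. The value count and the helper names `bytes_per_value` and `round_trip` are hypothetical, chosen only for illustration.

```python
import struct


def bytes_per_value(fmt):
    """Size in bytes of one value in a struct format ('e' = float16, 'f' = float32)."""
    return struct.calcsize(fmt)


def round_trip(value, fmt):
    """Store a Python float in the given format and read it back."""
    return struct.unpack(f"<{fmt}", struct.pack(f"<{fmt}", value))[0]


# Hypothetical number of stored values, for illustration only.
n_values = 1_000_000
fp32_mb = n_values * bytes_per_value("f") / 1e6
fp16_mb = n_values * bytes_per_value("e") / 1e6
print(f"float32: {fp32_mb:.1f} MB, float16: {fp16_mb:.1f} MB")

# float16 keeps fewer significant digits, so 0.1 is stored less precisely.
print("0.1 as float16:", round_trip(0.1, "e"))
print("0.1 as float32:", round_trip(0.1, "f"))
```

Halving the bytes per value is where the memory saving comes from. The reduced precision is why mixed precision setups usually keep numerically sensitive parts, such as the final softmax output, in `float32`.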

	* Developed by: `Recompense`
	* Model type: Image Classification (transfer learning with a CNN backbone)
	* Language(s) (NLP): N/A (Image Classification)
	* License: MIT
	* Finetuned from model: EfficientNetB0


	# Uses

	This model is intended for classifying images of food into 101 distinct categories. Potential use cases include:

	* Food recognition in mobile applications.
	* Organizing food images in databases.
	* Assisting in dietary tracking or recipe suggestions based on images.

	---

	# Limitations

	* Dataset Bias: The model is trained on the Food101 dataset. Its performance may degrade on food images that are significantly different in style, presentation, or origin from those in the training data.
	* Image Quality: Performance can be affected by image quality, lighting conditions, occlusions, and variations in food presentation.
	* Specificity: While it classifies into 101 categories, it may not distinguish between very similar dishes or variations within a category.

	---

	# Evaluation

	The model's performance was evaluated using standard classification metrics on a validation set from the Food101 dataset.

	#### Testing Data

	The model was evaluated on the validation split of the Food101 dataset.

	* Food101 Dataset: 101 food categories with 101,000 images in total, split into 750 training images and 250 test images per class.
	* Source: [TensorFlow Datasets](https://www.tensorflow.org/datasets/catalog/food101)
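The split sizes follow directly from the per-class counts above. The short check below works through the arithmetic. It also derives the chance-level accuracy for a balanced 101-class problem, which is a useful baseline when reading the reported results. The constant names are illustrative only.

```python
NUM_CLASSES = 101
TRAIN_PER_CLASS = 750
TEST_PER_CLASS = 250

train_images = NUM_CLASSES * TRAIN_PER_CLASS
test_images = NUM_CLASSES * TEST_PER_CLASS
total_images = train_images + test_images

print(f"train: {train_images:,}")  # train: 75,750
print(f"test:  {test_images:,}")   # test:  25,250
print(f"total: {total_images:,}")  # total: 101,000

# With equally sized classes, random guessing is right 1 time in 101.
chance_accuracy = 1 / NUM_CLASSES
print(f"chance-level accuracy: {chance_accuracy:.2%}")  # chance-level accuracy: 0.99%
```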

	#### Factors

	Evaluation was performed on the overall validation dataset. Further analysis could involve disaggregating performance by individual food categories to identify classes where the model performs better or worse.
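One way to disaggregate performance by category is to compute accuracy separately for each true class and list the weakest classes first. The sketch below does this in plain Python on a toy example. The labels, predictions, and the helper name `per_class_accuracy` are hypothetical and are not taken from the actual evaluation.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred, class_names):
    """Return {class name: fraction of that class's images predicted correctly}."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {class_names[c]: correct[c] / total[c] for c in sorted(total)}


# Toy data: integer labels index into class_names.
class_names = ["apple_pie", "baklava", "pizza"]
y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 0, 2]

scores = per_class_accuracy(y_true, y_pred, class_names)

# Sort ascending so the classes that need attention come first.
for name, acc in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: {acc:.2f}")
# apple_pie: 0.50
# pizza: 0.75
# baklava: 1.00
```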

	#### Metrics

	The primary evaluation metric used is Accuracy. A confusion matrix was also generated to visualize per-class performance.

	* Accuracy: The proportion of correctly classified images out of the total number of images evaluated.

	$$
	\text{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of predictions}}
	$$


	* Confusion Matrix: A table that visualizes the performance of a classification model. Each row represents the instances in an actual class, while each column represents the instances in a predicted class.
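Both metrics can be computed from two lists of integer labels. The following is a minimal plain-Python sketch on toy data, using the same convention as above: rows are actual classes and columns are predicted classes. The original evaluation may well have used scikit-learn's `accuracy_score` and `confusion_matrix` instead; that is an assumption, and this version only illustrates the definitions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_matrix(y_true, y_pred, num_classes):
    """matrix[i][j] counts images of actual class i predicted as class j."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Toy labels for a 3-class problem (hypothetical values).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

acc = accuracy(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred, num_classes=3)

print(f"accuracy: {acc:.3f}")  # accuracy: 0.667
for row in cm:
    print(row)
# [1, 1, 0]
# [0, 2, 0]
# [1, 0, 1]
```

The diagonal holds the correct predictions, so the accuracy equals the sum of the diagonal divided by the total count.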

	### Results

	Validation accuracy fluctuated between 70% and 80%.

	#### Summary

	Transfer learning helped the model achieve higher accuracy, but the model struggled to separate closely related foods, which suggests that more data is needed. The dataset is large, yet more examples are still required to differentiate between similar-looking dishes.

	---

	# Environmental Impact

	Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

	* Hardware Type: Tesla T4
	* Hours used: about 1 hour (upper-bound estimate)
	* Cloud Provider: Google Cloud
	* Compute Region: us-central
	* Carbon Emitted: 80 grams of CO2eq (estimated)

	---

	# Technical Specifications

	### Model Architecture and Objective

	The model is a fine-tuned convolutional neural network (CNN) classifier built on EfficientNetB0. The training notebook uses mixed precision, so the architecture must be compatible with `float16` data types. The objective is to minimize a classification loss (e.g., categorical cross-entropy) so that the model accurately predicts the food category for a given image.
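As an illustration of the objective, the sketch below computes softmax probabilities and the categorical cross-entropy loss for a single example in plain Python. The logits are hypothetical and the example uses only three classes for readability. It is not the training code, which would normally rely on the framework's built-in loss.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for numerical stability)."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probs, true_index, eps=1e-12):
    """Negative log-probability assigned to the true class."""
    return -math.log(max(probs[true_index], eps))


# Hypothetical logits for a 3-class example.
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)

loss_if_class_0 = categorical_cross_entropy(probs, true_index=0)
loss_if_class_2 = categorical_cross_entropy(probs, true_index=2)
print(f"loss when the top-scoring class is correct: {loss_if_class_0:.3f}")
print(f"loss when the lowest-scoring class is correct: {loss_if_class_2:.3f}")

# A model that spreads probability evenly over 101 classes has loss ln(101).
uniform_loss = categorical_cross_entropy([1 / 101] * 101, true_index=0)
print(f"uniform 101-class loss: {uniform_loss:.2f}")
```

The loss is small when the model puts high probability on the true class and grows as that probability shrinks. The uniform value of about 4.62 is a handy reference point for an untrained 101-class classifier.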

	### Compute Infrastructure

	The model was trained using a Tesla T4 GPU on Google Cloud in the us-central region. The estimated carbon emissions for 1 hour of training time on this setup are 80 grams of CO2eq. The environment was intended to support mixed precision training.

	### Software

	* TensorFlow
	* TensorFlow Datasets
	* NumPy
	* Matplotlib
	* Scikit-learn
	* Helper functions from `helper_functions.py` (likely for plotting, data handling)

	---

	# Usage

	Here's an example of how to use the model for inference on a new image. It loads the saved Keras model directly from the Hugging Face Hub via an `hf://` path.

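	First, make sure TensorFlow and Keras are installed, along with `huggingface_hub` (assumed here to be required for loading from an `hf://` path) and Matplotlib for the optional plot:

	```bash
	pip install tensorflow keras huggingface_hub matplotlib
	```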

	Then, you can load the model and make a prediction:

	```python
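	import os

	# Select the Keras backend BEFORE importing keras; setting it afterwards has no effect.
	# Available backend options are: "jax", "torch", "tensorflow".
	# "tensorflow" is used here because the preprocessing below relies on tf ops.
	os.environ["KERAS_BACKEND"] = "tensorflow"

	import keras
	import matplotlib.pyplot as plt
	import numpy as np
	import tensorflow as tf

	loaded_model = keras.saving.load_model("hf://Recompense/FoodVision")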

	# Food101 class names in the dataset's label order.
	# Assumption: this matches the label order used during training; verify against your own run.
	class_names = [
	    'apple_pie', 'baby_back_ribs', 'baklava', 'beef_carpaccio', 'beef_tartare',
	    'beet_salad', 'beignets', 'bibimbap', 'bread_pudding', 'breakfast_burrito',
	    'bruschetta', 'caesar_salad', 'cannoli', 'caprese_salad', 'carrot_cake',
	    'ceviche', 'cheesecake', 'cheese_plate', 'chicken_curry', 'chicken_quesadilla',
	    'chicken_wings', 'chocolate_cake', 'chocolate_mousse', 'churros', 'clam_chowder',
	    'club_sandwich', 'crab_cakes', 'creme_brulee', 'croque_madame', 'cup_cakes',
	    'deviled_eggs', 'donuts', 'dumplings', 'edamame', 'eggs_benedict',
	    'escargots', 'falafel', 'filet_mignon', 'fish_and_chips', 'foie_gras',
	    'french_fries', 'french_onion_soup', 'french_toast', 'fried_calamari', 'fried_rice',
	    'frozen_yogurt', 'garlic_bread', 'gnocchi', 'greek_salad', 'grilled_cheese_sandwich',
	    'grilled_salmon', 'guacamole', 'gyoza', 'hamburger', 'hot_and_sour_soup',
	    'hot_dog', 'huevos_rancheros', 'hummus', 'ice_cream', 'lasagna',
	    'lobster_bisque', 'lobster_roll_sandwich', 'macaroni_and_cheese', 'macarons', 'miso_soup',
	    'mussels', 'nachos', 'omelette', 'onion_rings', 'oysters',
	    'pad_thai', 'paella', 'pancakes', 'panna_cotta', 'peking_duck',
	    'pho', 'pizza', 'pork_chop', 'poutine', 'prime_rib',
	    'pulled_pork_sandwich', 'ramen', 'ravioli', 'red_velvet_cake', 'risotto',
	    'samosa', 'sashimi', 'scallops', 'seaweed_salad', 'shrimp_and_grits',
	    'spaghetti_bolognese', 'spaghetti_carbonara', 'spring_rolls', 'steak', 'strawberry_shortcake',
	    'sushi', 'tacos', 'takoyaki', 'tiramisu', 'tuna_tartare',
	    'waffles',
	]

	# Create a function to load and prepare images (from the training notebook)
	def load_prep_image(filepath, img_shape=224, scale=True):
	    """
	    Reads in an image and preprocesses it for model prediction.

	    Args:
	        filepath (str): path to target image
	        img_shape (int): height and width to resize the image to. Default = 224
	        scale (bool): whether to scale pixel values to [0, 1]. Default = True

	    Returns:
	        Image tensor of shape (img_shape, img_shape, 3)
	    """
	    image = tf.io.read_file(filepath)
	    # expand_animations=False keeps GIF inputs 3-D so the resize below works
	    image_tensor = tf.io.decode_image(image, channels=3, expand_animations=False)
	    image_tensor = tf.image.resize(image_tensor, [img_shape, img_shape])
	    if scale:
	        # Scale image tensor to be between 0 and 1
	        return image_tensor / 255.
	    return image_tensor

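	# Load and preprocess a sample image
	# Replace 'path/to/your/image.jpg' with the actual path to your image
	# Note: Keras Applications EfficientNet models rescale inputs internally and expect
	# pixel values in [0, 255]; if this model was built that way, pass scale=False instead.
	sample_image_path = 'path/to/your/image.jpg'
	prepared_image = load_prep_image(sample_image_path)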

	# Add a batch dimension to the image
	prepared_image = tf.expand_dims(prepared_image, axis=0)

	# Make a prediction
	predictions = loaded_model.predict(prepared_image)

	# Get the predicted class index (predictions has shape (1, num_classes))
	predicted_class_index = int(np.argmax(predictions[0]))

	# Get the predicted class name
	predicted_class_name = class_names[predicted_class_index]

	# Print the prediction
	print(f"The predicted food item is: {predicted_class_name}")

	# Optional: Display the image
	# img = plt.imread(sample_image_path)
	# plt.imshow(img)
	# plt.title(f"Prediction: {predicted_class_name}")
	# plt.axis('off')
	# plt.show()
	```
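Because the model can confuse closely related dishes, it is often more informative to look at the few most likely classes rather than only the top one. The helper below is a small, hypothetical addition in plain Python that ranks class names by probability. The toy inputs stand in for `class_names` and one row of the `predictions` array from the example above.

```python
def top_k_predictions(probabilities, class_names, k=5):
    """Return the k (class name, probability) pairs with the highest probability."""
    ranked = sorted(
        zip(class_names, probabilities),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]


# Toy stand-ins for class_names and a single row of model output.
toy_names = ["pizza", "steak", "sushi", "tacos"]
toy_probs = [0.10, 0.55, 0.05, 0.30]

top3 = top_k_predictions(toy_probs, toy_names, k=3)
for name, prob in top3:
    print(f"{name}: {prob:.2f}")
# steak: 0.55
# tacos: 0.30
# pizza: 0.10
```

With the real model, this could be called as `top_k_predictions(predictions[0].tolist(), class_names)`, assuming `predictions` has shape `(1, 101)` as in the inference example.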