license: cc-by-4.0
language:
  - en

Model Card for AutoML Cuisine Classification

This model card documents the AutoML Cuisine Classification model trained with AutoGluon Multimodal on a classmate’s dataset of food images.
The task is to predict whether a food image belongs to Asian or Western cuisine (binary classification).


Model Details

  • Developed by: Bareethul Kader
  • Framework: AutoGluon Multimodal
  • Repository: bareethul/image-dataset-model
  • License: CC BY 4.0

Intended Use

Direct Use

  • Educational demonstration of AutoML on an image classification task.
  • Comparison of different backbones (ResNet18, MobileNetV3, EfficientNet-B0).
  • Exploring the effects of augmentation and model selection under a constrained compute budget.

Out of Scope Use

  • Not intended for production deployments in food classification systems.
  • May not generalize beyond the binary “Asian vs Western” labeling, or to non-restaurant/home-cooked settings.
  • Not meant for health/dietary or allergy related automation.

Dataset

  • Source: maryzhang/hw1-24679-image-dataset
  • Task: Binary image classification (label 0 = Western cuisine, label 1 = Asian cuisine)
  • Size:
    • Original images: 40
    • Augmented images: 320
    • Total: 360 images
  • Features:
    • image: Image (RGB, as provided)
    • label: Integer 0 or 1
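
Before training, it can help to check how the two labels are distributed in each split. The following is a minimal sketch, not part of the original workflow: it assumes the Hugging Face `datasets` library is installed and that every split exposes the `label` column listed above. The helper names `label_distribution` and `load_label_distribution` are hypothetical.

```python
from collections import Counter

# Label meaning as documented in this card.
LABEL_NAMES = {0: "Western", 1: "Asian"}


def label_distribution(labels):
    """Count how many examples carry each documented label."""
    counts = Counter(labels)
    return {name: counts.get(value, 0) for value, name in LABEL_NAMES.items()}


def load_label_distribution(repo_id="maryzhang/hw1-24679-image-dataset"):
    """Download the dataset and report label counts for every split it contains.

    Split names are not assumed; whatever the dataset defines is iterated.
    """
    # Imported lazily so label_distribution works without `datasets` installed.
    from datasets import load_dataset

    dataset = load_dataset(repo_id)
    return {split: label_distribution(dataset[split]["label"]) for split in dataset}


# Toy check with made-up labels (not real dataset contents).
print(label_distribution([0, 1, 1, 0, 1]))  # {'Western': 2, 'Asian': 3}
```

A strongly skewed count would make accuracy alone a weak metric for this task.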

Training Setup

  • AutoML framework: AutoGluon Multimodal (MultiModalPredictor)
  • Evaluation metric: Accuracy
  • Budget: 600 seconds (10 minutes) for quick runs; about 1800 seconds (30 minutes) for a full run aimed at higher accuracy.
  • Hardware: Google Colab (GPU, typical environment)
  • Search Space:
    • Backbones: resnet18, mobilenetv3_small_100, efficientnet_b0
  • Preprocessing / Augmentation: The augmented split is used as provided in the dataset; images are resized and passed through the standard image transforms applied at dataset loading.
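
The setup above could be reproduced roughly as follows. This is a sketch under stated assumptions, not the exact procedure used for this card: it trains one predictor per backbone in a plain loop instead of using AutoGluon's built-in search, it assumes `train_df` and `val_df` are pandas DataFrames with an `image` column of file paths and an integer `label` column, and the helper names `backbone_hyperparameters` and `compare_backbones` are hypothetical. Note that `time_limit` here applies per backbone, whereas the budget listed above may have been a total.

```python
# Backbones listed in the search space above (timm checkpoint names).
BACKBONES = ["resnet18", "mobilenetv3_small_100", "efficientnet_b0"]


def backbone_hyperparameters(checkpoint_name):
    """Return AutoMM hyperparameters that pin the image backbone."""
    return {"model.timm_image.checkpoint_name": checkpoint_name}


def compare_backbones(train_df, val_df, time_limit=600):
    """Train one predictor per backbone and collect validation scores.

    Assumes both DataFrames have an "image" column (file paths) and an
    integer "label" column; this layout is an assumption, not confirmed
    by the model card.
    """
    # Imported lazily so the config helper works without AutoGluon installed.
    from autogluon.multimodal import MultiModalPredictor

    scores = {}
    for name in BACKBONES:
        predictor = MultiModalPredictor(
            label="label",
            problem_type="binary",
            eval_metric="accuracy",
        )
        predictor.fit(
            train_data=train_df,
            time_limit=time_limit,
            hyperparameters=backbone_hyperparameters(name),
        )
        # evaluate returns a dict of metric name to score.
        scores[name] = predictor.evaluate(val_df, metrics=["accuracy"])
    return scores
```

Calling `compare_backbones(train_df, val_df, time_limit=600)` would return one score dictionary per backbone, which can then be compared by hand.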

Results

  • Best model (AutoGluon selected): efficientnet_b0
  • Validation Accuracy: 0.96875
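
The reported validation accuracy is exactly 31/32, which hints at how coarse the measurement is. The sketch below works through that arithmetic and attaches a Wilson score interval. The validation set size is not stated in this card, so treating it as 32 images is an assumption (any multiple of 32 is equally consistent with the figure), and `wilson_interval` is a hypothetical helper.

```python
import math
from fractions import Fraction

# Reported validation accuracy, reduced to lowest terms.
val_accuracy = Fraction("0.96875")
print(val_accuracy)  # 31/32


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# Assumption: the validation set holds 32 images (smallest size matching 31/32).
low, high = wilson_interval(val_accuracy.numerator, val_accuracy.denominator)
print(f"Interval under that assumption: [{low:.3f}, {high:.3f}]")
```

With a set this small, a single misclassified image moves accuracy by about three percentage points, so the headline number should be read with a wide margin.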

Error Analysis

  • The model reached an accuracy and F1 score of 1.0 on the test split. A perfect score on so few images is most likely an artifact of the dataset’s small size or of overlap between test images and augmented training images. The results therefore reflect dataset limitations rather than true generalization.
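
One way to rule out the overlap described above is to split by original image, so that all augmented variants of one photo stay on the same side. The sketch below illustrates the idea on toy records. It assumes each augmented image can be traced back to its original through a `source_id` field, which is hypothetical: the dataset as described here exposes only `image` and `label`. The even split of 8 variants per original is also an assumption, chosen because 320 / 40 = 8.

```python
import random
from collections import defaultdict


def group_split(records, test_fraction=0.2, seed=0):
    """Split records so images derived from one original never straddle the split."""
    groups = defaultdict(list)
    for record in records:
        groups[record["source_id"]].append(record)

    # Shuffle the group ids, not the individual images.
    ids = sorted(groups)
    random.Random(seed).shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))
    test_ids = set(ids[:n_test])

    train = [r for gid in ids if gid not in test_ids for r in groups[gid]]
    test = [r for gid in ids if gid in test_ids for r in groups[gid]]
    return train, test


# Toy stand-in: 40 originals with 8 augmented variants each (320 images).
records = [{"source_id": i, "variant": v} for i in range(40) for v in range(8)]
train, test = group_split(records)
overlap = {r["source_id"] for r in train} & {r["source_id"] for r in test}
print(len(train), len(test), len(overlap))  # 256 64 0
```

A test score measured on such a split would say more about generalization than one where augmented copies of a training image can appear in the test set.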

Limitations, Biases, and Ethical Notes

  • The small dataset size increases the risk of overfitting.
  • Augmented data may not capture all real world variance (lighting, background, etc.).
  • Binary classification “Asian vs Western” is coarse; many cuisines and dishes don’t neatly fit.
  • Labeling reflects simplified categories; cultural and geographic nuance is lost.

Example Inference

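from huggingface_hub import snapshot_download
from autogluon.multimodal import MultiModalPredictor

# MultiModalPredictor.load expects a local directory, so download the repo first.
# Assumption: the repo stores the saved predictor directory at its top level.
local_dir = snapshot_download(repo_id="bareethul/image-dataset-model")
predictor = MultiModalPredictor.load(local_dir)

# Run inference on an image file; the "image" key matches the dataset's image column
pred = predictor.predict({"image": ["path/to/your_test_food_image.jpg"]})
print("Prediction:", pred)  # 0 = Western cuisine, 1 = Asian cuisine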