DanielKiani committed on
Commit 43124a6
0 Parent(s)

Initial commit of Food101 Classification
.gitattributes ADDED
@@ -0,0 +1,3 @@
1
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
2
+ *.png filter=lfs diff=lfs merge=lfs -text
3
+ *.jpg filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,18 @@
1
+ # Python
2
+ __pycache__/
3
+ *.pyc
4
+
5
+ # Virtual Environment
6
+ venv/
7
+ .venv/
8
+
9
+ # Data and Logs
10
+ data/
11
+ logs/
12
+ notebooks/data/
13
+ notebooks/logs/
14
+
15
+ # IDE files
16
+ .vscode/
17
+ .idea/
18
+
README.md ADDED
@@ -0,0 +1,206 @@
1
+ ![Food101 Classification Banner](assets/banner.png)
2
+
3
+ [![Python](https://img.shields.io/badge/Python-3.10-blue?logo=python)](https://www.python.org/)[![PyTorch](https://img.shields.io/badge/PyTorch-2.7.1-EE4C2C?logo=pytorch)](https://pytorch.org/)![Made with ML](https://img.shields.io/badge/Made%20with-ML-blueviolet?logo=openai)[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
4
+
5
+ # 🍽️ Food-101 Image Classification with EfficientNetV2-S and PyTorch Lightning
6
+
7
+ This repository contains the code for an end-to-end deep learning project to classify 101 food categories from the challenging Food-101 dataset. The project demonstrates a systematic approach to model selection, fine-tuning, and hyperparameter optimization, achieving a final validation accuracy of **85.4%** on the full dataset.
8
+
9
+ The entire training and evaluation pipeline is built using modern, reproducible practices with PyTorch Lightning.
10
+
11
+ ---
12
+
13
+ ## 📑 Table of Contents
14
+
15
+ - [🍽️ Food-101 Image Classification with EfficientNetV2-S and PyTorch Lightning](#️-food-101-image-classification-with-efficientnetv2-s-and-pytorch-lightning)
16
+ - [📑 Table of Contents](#-table-of-contents)
17
+ - [🎯 Project Highlights](#-project-highlights)
18
+ - [💡 Real-World Applications](#-real-world-applications)
19
+ - [🧫 Experimental Results](#-experimental-results)
20
+ - [📊 Final Results](#-final-results)
21
+ - [🔬 Performance Analysis and Error Diagnosis](#-performance-analysis-and-error-diagnosis)
22
+ - [🍤 Lowest-Performing Classes](#-lowest-performing-classes)
23
+ - [Root Cause Analysis of Misclassifications](#root-cause-analysis-of-misclassifications)
24
+ - [Future Work](#future-work)
25
+ - [🧪 Methodology and Experimental Process](#-methodology-and-experimental-process)
26
+ - [📁 Repository Structure](#-repository-structure)
27
+ - [🚀 Getting Started](#-getting-started)
28
+ - [Prerequisites](#prerequisites)
29
+ - [Installation](#installation)
30
+ - [Usage](#usage)
31
+ - [💻 Technologies Used](#-technologies-used)
32
+
33
+ ---
34
+
35
+ ## 🎯 Project Highlights
36
+
37
+ - **High-Performance Model** ⚡: Utilizes a pre-trained `EfficientNetV2-S`, selected for its excellent balance of accuracy and computational efficiency suitable for potential edge deployment.
38
+ - **Reproducible Pipeline** 🔄: Encapsulates the entire workflow—from data loading to training and evaluation—in a clean and organized `LightningModule` and `DataModule`.
39
+ - **Efficient Experimentation** ⏱️: Overcame hardware limitations by implementing dataset subsetting for rapid prototyping.
40
+ - **Advanced Fine-Tuning** 🛠️: Implemented a robust fine-tuning strategy, unfreezing the final three blocks of the feature extractor and using the `Adam` optimizer with a `CosineAnnealingLR` scheduler for stable convergence.
41
+ - **In-Depth Analysis** 🔎: Went beyond simple accuracy by calculating and logging per-class F1-scores and accuracies, enabling a deep dive into the model's strengths and weaknesses.
42
+ - **Live Deployment** 📺: The final model is deployed and accessible as an interactive Gradio web application on Hugging Face Spaces.
43
+
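The `CosineAnnealingLR` schedule mentioned above decays the learning rate along a half-cosine from its initial value toward (near) zero. A minimal plain-Python sketch of the underlying formula, separate from the project's actual training code:

```python
import math

def cosine_annealed_lr(step: int, total_steps: int, lr_max: float, lr_min: float = 0.0) -> float:
    """Learning rate after `step` of `total_steps` under cosine annealing."""
    cos_factor = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return lr_min + (lr_max - lr_min) * cos_factor

# With lr=1e-4 as in the final runs: full LR at the start, ~0 at the end.
print(cosine_annealed_lr(0, 20, 1e-4))   # 0.0001
print(cosine_annealed_lr(20, 20, 1e-4))  # 0.0
```

The smooth decay avoids the abrupt drops of step schedules, which helps the partially unfrozen backbone converge stably.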
44
+ ---
45
+
46
+ ## 💡 Real-World Applications
47
+
48
+ Beyond being a technical challenge, this food classification model serves as a foundation for numerous real-world applications in health, hospitality, and smart home technology.
49
+
50
+ - **Health and Nutrition Tracking**
51
+ - **Automated Calorie Counting:** Users can snap a photo of their meal, and an app can automatically identify each food item to provide an instant estimate of calories, macros, and other nutritional information.
52
+ - **Dietary Management:** Assists individuals with allergies or specific dietary needs (e.g., diabetes, gluten-free) by helping them identify and log their food intake accurately.
53
+ - **Restaurant and Hospitality Tech**
54
+ - **Self-Checkout Systems:** In cafeterias or quick-service restaurants, a camera-based system could identify all items on a tray to automate the billing process, reducing queues and improving efficiency.
55
+ - **Interactive Menus:** Allow diners to point their phone at a dish to get more information, such as ingredients, allergen warnings, or customer reviews.
56
+
57
+ - **Smart Home and Appliances**
58
+ - **Smart Refrigerators:** A fridge equipped with a camera could identify leftover dishes, suggest recipes based on available food, and help track food spoilage to reduce waste.
59
+
60
+ ---
61
+
62
+ ## 🧫 Experimental Results
63
+
64
+ This project followed an iterative approach. The table below summarizes the key experiments and their outcomes, showing the progression from the initial baseline to the final model.
65
+
66
+ | Model | Training Strategy | Data % | Key Hyperparameters | Final Val Accuracy |
67
+ | :--- | :--- | :--- | :--- | :--- |
68
+ | `EfficientNet-B2` | Simple fine-tune (last block) | 50% | `lr=1e-4` | ~64% |
69
+ | `EfficientNet-B2` | Unfreeze last 3 blocks | 50% | `lr=1e-3` | 82.0% |
70
+ | `EfficientNet-B2` | Two-Stage Fine-Tuning | 50% | `lr1=1e-3`, `lr2=1e-5` | Performance Degraded |
71
+ | **`EfficientNetV2-S`** | Unfreeze last 3 blocks | 50% | `lr=1e-4` (Tuned) | 82.4% |
72
+ | **`EfficientNetV2-S`** | Unfreeze last 3 blocks and more advanced transforms | 50% | `lr=1e-4` (Tuned) | ~82.4% (no meaningful change) |
73
+ | **`EfficientNetV2-S`** | **Unfreeze last 3 blocks** | **100%** | **`lr=1e-4` (Tuned)** | **85.4%** |
74
+
75
+ ---
76
+
77
+ ## 📊 Final Results
78
+
79
+ After systematically iterating on model architecture and hyperparameters, the final model achieved the following performance on the full Food-101 validation set:
80
+
81
+ | Metric | Score |
82
+ | :------------------ | :------ |
83
+ | Validation Accuracy | **85.4%** |
84
+
85
+ ![Confusion Matrix Plot](assets/confusion_matrix.png)
86
+ *A confusion matrix visualization helps diagnose the model's performance on a per-class basis.*
87
+
88
+ This model is deployed and accessible as an interactive Gradio web application on Hugging Face Spaces.
89
+
90
+ ![Gradio](assets/gradio.png)
91
+
92
+ Check out my [Food101 Gradio Demo](https://huggingface.co/spaces/your-username/food101-demo).
93
+
94
+ ---
95
+
96
+ ## 🔬 Performance Analysis and Error Diagnosis
97
+
98
+ Beyond the aggregate accuracy, a per-class analysis was conducted to identify the model's specific limitations and diagnose the root causes of misclassifications.
99
+
100
+ The model performed exceptionally well on many classes but struggled with a distinct set of categories, primarily due to visual ambiguity and high variability in appearance.
101
+
102
+ #### 🍤 Lowest-Performing Classes
103
+
104
+ The following five classes had the lowest validation accuracy:
105
+
106
+ | Class Name | Index | Validation Accuracy |
107
+ | :------------------ | :---- | :------------------ |
108
+ | `shrimp_and_grits` | 93 | 44.0% |
109
+ | `ravioli` | 77 | 59.2% |
110
+ | `apple_pie` | 0 | 61.6% |
111
+ | `huevos_rancheros` | 56 | 63.2% |
112
+ | `falafel` | 36 | 63.6% |
113
+
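Per-class accuracies like those in the table come straight from the confusion matrix: each class's diagonal count divided by its row total. A small plain-Python illustration with a made-up 3-class matrix (the numbers are hypothetical, not the actual Food-101 counts):

```python
def per_class_accuracy(conf_matrix):
    """Row i holds true-class-i counts; the diagonal entry is the correct predictions."""
    accs = []
    for i, row in enumerate(conf_matrix):
        total = sum(row)
        accs.append(row[i] / total if total else 0.0)
    return accs

# Toy confusion matrix (rows = true class, cols = predicted class)
cm = [
    [44, 30, 26],   # a hard class, e.g. only 44/100 correct
    [5, 90, 5],
    [10, 10, 80],
]
print(per_class_accuracy(cm))  # [0.44, 0.9, 0.8]
```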
114
+ #### Root Cause Analysis of Misclassifications
115
+
116
+ * **High Intra-Class Variation**: The model struggled with dishes that have no single, consistent appearance.
117
+ * **Fine-Grained Confusion**: Errors occurred between visually similar classes like `ravioli` vs. `dumplings`.
118
+ * **Ambiguous Features**: Foods like `falafel` resemble many small fried dishes, making classification tricky.
119
+
120
+ #### Future Work
121
+
122
+ Improvements could include:
123
+
124
+ - Detailed confusion matrix analysis 🔍
125
+ - More aggressive data augmentation 📈
126
+ - Larger architectures for fine-grained recognition 🏋️
127
+ - Training for longer 🏋️
128
+
129
+ ---
130
+
131
+ ## 🧪 Methodology and Experimental Process
132
+
133
+ Steps taken in the project:
134
+
135
+ 1. **Baseline Establishment** 🏁 – EfficientNet-B2 achieved ~64%.
136
+ 2. **Architecture Selection** 🏗️ – EfficientNetV2-S chosen for balance of accuracy and size.
137
+ 3. **Transforms Selection** 🎨 – TrivialAugmentWide + RandomResizedCrop, RandAugment, etc.
138
+ 4. **Fine-Tuning Strategy** 🔧 – Final 3 blocks unfrozen for training.
139
+ 5. **Final Model Training** 🏆 – Full dataset, Adam, CosineAnnealingLR, EarlyStopping → 85.4%.
140
+
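The EarlyStopping callback in step 5 boils down to halting when the monitored metric stops improving for a set number of epochs. A minimal patience-counter sketch in plain Python (illustrative only, not the Lightning callback itself):

```python
def early_stop_epoch(val_accs, patience=3):
    """Return the epoch index at which training would stop, or None if it never stops."""
    best = float("-inf")
    bad_epochs = 0
    for epoch, acc in enumerate(val_accs):
        if acc > best:
            best = acc
            bad_epochs = 0   # improvement resets the counter
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return None

print(early_stop_epoch([0.60, 0.70, 0.69, 0.68, 0.67]))  # 4
```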
141
+ ---
142
+
143
+ ## 📁 Repository Structure
144
+
145
+ ```bash
146
+ food-101-classification/
147
+ ├── data/
148
+ ├── logs/
149
+ ├── scripts/
150
+ │ ├── main.py
151
+ │ ├── models.py
152
+ │ ├── class_names.py
153
+ │ ├── app.py
154
+ │ └── prepare_data.py
155
+ ├── .gitignore
156
+ ├── requirements.txt
157
+ └── README.md
158
+ ```
159
+
160
+ ---
161
+
162
+ ## 🚀 Getting Started
163
+
164
+ ### Prerequisites
165
+
166
+ - Python 3.10+ 🐍
167
+ - PyTorch 🔥
168
+ - CUDA-enabled GPU (recommended) 🎮
169
+
170
+ ### Installation
171
+
172
+ 1. **Clone the repository:**
173
+
174
+ ```bash
175
+ git clone https://github.com/Deathshot78/Food101-Classification
176
+ cd Food101-Classification
177
+ ```
178
+
179
+ 2. **Install the dependencies:**
180
+
181
+ ```bash
182
+ pip install -r requirements.txt
183
+ ```
184
+
185
+ ### Usage
186
+
187
+ Run training with a subset for quick testing:
188
+
189
+ ```bash
190
+ python scripts/main.py
191
+ ```
192
+
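Quick testing relies on a reproducible subset of the dataset: the project's `CustomFood101` shuffles the indices with a fixed seed and keeps the first fraction. The core of that logic, sketched standalone:

```python
import random

def subset_indices(dataset_len: int, fraction: float, seed: int = 42):
    """Reproducible shuffled subset of dataset indices (mirrors the CustomFood101 logic)."""
    indices = list(range(dataset_len))
    random.Random(seed).shuffle(indices)  # fixed seed => same subset every run
    return indices[: int(dataset_len * fraction)]

idx = subset_indices(1000, 0.1)
print(len(idx))  # 100
```

Because the seed is fixed, every run (and every worker) sees the same 10% of the data, which keeps quick experiments comparable.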
193
+ ## 💻 Technologies Used
194
+
195
+ - Python
196
+
197
+ - PyTorch
198
+
199
+ - PyTorch Lightning
200
+
201
+ - TorchMetrics
202
+
203
+ - Gradio
204
+
205
+ - Matplotlib & Seaborn
206
+
assets/banner.png ADDED

Git LFS Details

  • SHA256: 9093c3aa4394c9f611a6983c479b21e2e26f8f68cd35d9d5a7f0e589cfd8202f
  • Pointer size: 132 Bytes
  • Size of remote file: 1.67 MB
assets/confusion_matrix.png ADDED

Git LFS Details

  • SHA256: 5b76963efa5e9a0ab0ca5ac3dc2d1ba63f96e77ec45ecc6386cfdaab0051bf94
  • Pointer size: 132 Bytes
  • Size of remote file: 1.38 MB
assets/gradio.png ADDED

Git LFS Details

  • SHA256: 293eeb6050e65e7ad27b9a6317090f7d721a58413d297c6e7ead2f3657b5a0f7
  • Pointer size: 131 Bytes
  • Size of remote file: 615 kB
assets/onion_rings.jpg ADDED

Git LFS Details

  • SHA256: 0b259ff866df09cab2c71c479c9d5e4b3273baa585b836349d89beaceb05149e
  • Pointer size: 130 Bytes
  • Size of remote file: 12.2 kB
assets/oysters.jpg ADDED

Git LFS Details

  • SHA256: aed7e730dddb604351a5b451b2503e350f7cb2cd78cad33076867736effa9bad
  • Pointer size: 131 Bytes
  • Size of remote file: 130 kB
assets/pizza.jpg ADDED

Git LFS Details

  • SHA256: 00896d5bdd63e1f074009ff7c607ce89cd3e37c8e62c7f190d57e2059f560b79
  • Pointer size: 130 Bytes
  • Size of remote file: 16 kB
assets/ramen.jpg ADDED

Git LFS Details

  • SHA256: 8a7186765988f80cbae7438b5aec7238d043b182676527ebc089d84114c1cc67
  • Pointer size: 131 Bytes
  • Size of remote file: 239 kB
checkpoints/best-model-epoch=22-val_acc=0.8541.ckpt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6663c5df734fa5e9a50fab801a746ebd095732faacb1938368444518c3265615
3
+ size 230292623
notebooks/food101_classification.ipynb ADDED
@@ -0,0 +1,1055 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "id": "430db510",
6
+ "metadata": {},
7
+ "source": [
8
+ "# Food-101 Image Classification with EfficientNetV2-S and PyTorch Lightning\n",
9
+ "\n",
10
+ "This repository contains the code for an end-to-end deep learning project to classify 101 food categories from the challenging Food-101 dataset. The project demonstrates a systematic approach to model selection, fine-tuning, and hyperparameter optimization, achieving a final validation accuracy of **85.4%** on the full dataset.\n",
11
+ "\n",
12
+ "The entire training and evaluation pipeline is built using modern, reproducible practices with PyTorch Lightning."
13
+ ]
14
+ },
15
+ {
16
+ "cell_type": "markdown",
17
+ "id": "1116006e",
18
+ "metadata": {},
19
+ "source": [
20
+ "## 1. Imports"
21
+ ]
22
+ },
23
+ {
24
+ "cell_type": "code",
25
+ "execution_count": null,
26
+ "id": "531943f8",
27
+ "metadata": {},
28
+ "outputs": [],
29
+ "source": [
30
+ "import torch\n",
31
+ "import matplotlib.pyplot as plt\n",
32
+ "import pandas as pd\n",
33
+ "import numpy as np\n",
34
+ "import os\n",
35
+ "import pytorch_lightning as pl\n",
36
+ "import torch.optim.lr_scheduler as lr_scheduler\n",
37
+ "import torchvision\n",
38
+ "\n",
39
+ "from torchmetrics.functional import accuracy\n",
40
+ "from torchvision import transforms, datasets\n",
41
+ "from torchinfo import summary\n",
42
+ "from pytorch_lightning.callbacks import EarlyStopping\n",
43
+ "from pytorch_lightning.loggers import TensorBoardLogger\n",
44
+ "from torch import nn\n",
45
+ "from pathlib import Path\n",
46
+ "from torch.utils.data import DataLoader, TensorDataset, Dataset, random_split"
47
+ ]
48
+ },
49
+ {
50
+ "cell_type": "markdown",
51
+ "id": "08a5c10b",
52
+ "metadata": {},
53
+ "source": [
54
+ "## 2. Quick inspection of the top model"
55
+ ]
56
+ },
57
+ {
58
+ "cell_type": "code",
59
+ "execution_count": null,
60
+ "id": "e20ad559",
61
+ "metadata": {},
62
+ "outputs": [],
63
+ "source": [
64
+ "# Here we inspect the model's classifier layer to match the number of classes in Food101\n",
65
+ "\n",
66
+ "weights = torchvision.models.EfficientNet_V2_S_Weights.DEFAULT\n",
67
+ "model = torchvision.models.efficientnet_v2_s(weights=weights)\n",
68
+ "effnet_v2_s_transforms = weights.transforms()\n",
69
+ "\n",
70
+ "print(model.classifier)"
71
+ ]
72
+ },
73
+ {
74
+ "cell_type": "code",
75
+ "execution_count": null,
76
+ "id": "3442b5a9",
77
+ "metadata": {},
78
+ "outputs": [],
79
+ "source": [
80
+ "# Inspect the model\n",
81
+ "\n",
82
+ "summary(model=model,\n",
83
+ " input_size=(1, 3, 224, 224),\n",
84
+ " col_names=['input_size', 'output_size', 'num_params', 'trainable'],\n",
85
+ " col_width=20,\n",
86
+ " row_settings=['var_names'])"
87
+ ]
88
+ },
89
+ {
90
+ "cell_type": "code",
91
+ "execution_count": null,
92
+ "id": "bb353575",
93
+ "metadata": {},
94
+ "outputs": [],
95
+ "source": [
96
+ "# This will be the base transforms for training \n",
97
+ "\n",
98
+ "effnet_v2_s_transforms = weights.transforms()\n",
99
+ "train_transforms = torchvision.transforms.Compose([\n",
100
+ " torchvision.transforms.TrivialAugmentWide(),\n",
101
+ " effnet_v2_s_transforms])\n",
102
+ "\n",
103
+ "train_transforms"
104
+ ]
105
+ },
106
+ {
107
+ "cell_type": "markdown",
108
+ "id": "363ce600",
109
+ "metadata": {},
110
+ "source": [
111
+ "## 3. Dataset and PyTorch Lightning DataModule Classes"
112
+ ]
113
+ },
114
+ {
115
+ "cell_type": "code",
116
+ "execution_count": null,
117
+ "id": "2a04cc09",
118
+ "metadata": {},
119
+ "outputs": [],
120
+ "source": [
121
+ "from torchvision import datasets\n",
122
+ "from pathlib import Path\n",
123
+ "import os\n",
124
+ "import pytorch_lightning as pl\n",
125
+ "from torch.utils.data import DataLoader, Subset\n",
126
+ "from torchvision import datasets\n",
127
+ "from torchvision import transforms as T\n",
128
+ "import numpy as np\n",
129
+ "import torchvision\n",
130
+ "from torchvision.datasets import Food101\n",
131
+ "from torch.utils.data import DataLoader, Dataset\n",
132
+ "from typing import Dict, Tuple, Any\n",
133
+ "import random\n",
134
+ "\n",
135
+ "\n",
136
+ "def get_model_components(\n",
137
+ " model_name: str, \n",
138
+ " return_classifier: bool = False, \n",
139
+ " augmentation_level: str = \"default\"\n",
140
+ ") -> Dict[str, Any]:\n",
141
+ " \"\"\"\n",
142
+ " Retrieves pre-trained model components from torchvision.\n",
143
+ "\n",
144
+ " This function fetches the appropriate weights and transforms for a given\n",
145
+ " model. It supports different levels of training data augmentation.\n",
146
+ "\n",
147
+ " Args:\n",
148
+ " model_name (str): The name of the model to get components for.\n",
149
+ " Supported models include \"EfficientNet_V2_S\" and \"EfficientNet_B2\".\n",
150
+ " return_classifier (bool, optional): If True, the model's classifier\n",
151
+ " head is also returned. Defaults to False.\n",
152
+ " augmentation_level (str, optional): The level of data augmentation to use\n",
153
+ " for the training set. Can be \"default\" or \"strong\". \n",
154
+ " Defaults to \"default\".\n",
155
+ "\n",
156
+ " Returns:\n",
157
+ " Dict[str, Any]: A dictionary containing the requested components.\n",
158
+ " Always includes 'train_transforms' and 'val_transforms'.\n",
159
+ " Includes 'classifier' if return_classifier is True.\n",
160
+ " \n",
161
+ " Raises:\n",
162
+ " ValueError: If model_name or augmentation_level is not supported.\n",
163
+ " \"\"\"\n",
164
+ " model_registry = {\n",
165
+ " \"EfficientNet_V2_S\": (\n",
166
+ " torchvision.models.efficientnet_v2_s,\n",
167
+ " torchvision.models.EfficientNet_V2_S_Weights.DEFAULT\n",
168
+ " ),\n",
169
+ " \"EfficientNet_B2\": (\n",
170
+ " torchvision.models.efficientnet_b2,\n",
171
+ " torchvision.models.EfficientNet_B2_Weights.DEFAULT\n",
172
+ " )\n",
173
+ " }\n",
174
+ "\n",
175
+ " if model_name not in model_registry:\n",
176
+ " raise ValueError(f\"Model '{model_name}' is not supported. \"\n",
177
+ " f\"Supported models are: {list(model_registry.keys())}\")\n",
178
+ "\n",
179
+ " # 1. Look up the model and weights classes\n",
180
+ " model_class, weights_class = model_registry[model_name]\n",
181
+ " weights = weights_class\n",
182
+ " val_transforms = weights.transforms()\n",
183
+ "\n",
184
+ " # 2. Create the training transforms based on the desired level\n",
185
+ " if augmentation_level == \"default\":\n",
186
+ " train_transforms = T.Compose([\n",
187
+ " T.TrivialAugmentWide(),\n",
188
+ " val_transforms # val_transforms includes ToTensor and Normalize\n",
189
+ " ])\n",
190
+ " elif augmentation_level == \"strong\":\n",
191
+ " # Note: We don't need to add ToTensor() or Normalize() here because\n",
192
+ " # they are already included inside the 'val_transforms' pipeline.\n",
193
+ " train_transforms = T.Compose([\n",
194
+ " T.RandomResizedCrop(size=val_transforms.crop_size, scale=(0.7, 1.0)),\n",
195
+ " T.RandomHorizontalFlip(p=0.5),\n",
196
+ " T.RandAugment(num_ops=2, magnitude=9),\n",
197
+ " # RandomErasing should be applied to a tensor, so we apply it after\n",
198
+ " # val_transforms, which handles the PIL -> Tensor conversion.\n",
199
+ " val_transforms, \n",
200
+ " T.RandomErasing(p=0.25, scale=(0.02, 0.33), ratio=(0.3, 3.3), value='random')\n",
201
+ " ])\n",
202
+ " else:\n",
203
+ " raise ValueError(f\"Augmentation level '{augmentation_level}' is not supported. \"\n",
204
+ " f\"Choose from 'default' or 'strong'.\")\n",
205
+ " \n",
206
+ " # 3. Prepare the dictionary to be returned\n",
207
+ " components = {\n",
208
+ " \"train_transforms\": train_transforms,\n",
209
+ " \"val_transforms\": val_transforms\n",
210
+ " }\n",
211
+ "\n",
212
+ " # 4. Optionally, instantiate the model to get the classifier\n",
213
+ " if return_classifier:\n",
214
+ " model = model_class(weights=weights)\n",
215
+ " components[\"classifier\"] = model.classifier\n",
216
+ "\n",
217
+ " return components\n",
218
+ " \n",
219
+ "class CustomFood101(Dataset):\n",
220
+ " \"\"\"A PyTorch Dataset for Food101 with conditional downloading and subset support.\n",
221
+ "\n",
222
+ " This class wraps the torchvision Food101 dataset. It only downloads the data\n",
223
+ " if the specified directory doesn't already exist. It can also create a\n",
224
+ " reproducible, shuffled subset of the data for faster experimentation.\n",
225
+ "\n",
226
+ " Args:\n",
227
+ " split (str): The dataset split, either \"train\" or \"test\".\n",
228
+ " transform (callable, optional): A function/transform to apply to the images.\n",
229
+ " data_dir (str, optional): The directory to store the data. Defaults to \"data\".\n",
230
+ " subset_fraction (float, optional): The fraction of the dataset to use.\n",
231
+ " Defaults to 1.0 (using the full dataset).\n",
232
+ " \"\"\"\n",
233
+ "\n",
234
+ " def __init__(self, split, transform=None, data_dir=\"data\", subset_fraction: float = 0.1):\n",
235
+ " # Check if the dataset already exists before setting the download flag.\n",
236
+ " dataset_path = os.path.join(data_dir, \"food-101\")\n",
237
+ " should_download = not os.path.isdir(dataset_path)\n",
238
+ "\n",
239
+ " # 1. Load the full dataset metadata with the conditional flag\n",
240
+ " self.full_dataset = Food101(root=data_dir, split=split, transform=transform, download=should_download)\n",
241
+ " self.classes = self.full_dataset.classes\n",
242
+ "\n",
243
+ " # 2. Create a reproducible subset of indices\n",
244
+ " if subset_fraction < 1.0:\n",
245
+ " num_samples = int(len(self.full_dataset) * subset_fraction)\n",
246
+ " all_indices = list(range(len(self.full_dataset)))\n",
247
+ " # Shuffle with a fixed seed for reproducibility\n",
248
+ " random.Random(42).shuffle(all_indices)\n",
249
+ " self.indices = all_indices[:num_samples]\n",
250
+ " else:\n",
251
+ " self.indices = list(range(len(self.full_dataset)))\n",
252
+ "\n",
253
+ " def __len__(self):\n",
254
+ " \"\"\"Returns the total number of samples in the subset.\"\"\"\n",
255
+ " return len(self.indices)\n",
256
+ "\n",
257
+ " def __getitem__(self, idx):\n",
258
+ " \"\"\"\n",
259
+ " Fetches the sample for the given subset index and applies the transform.\n",
260
+ " \"\"\"\n",
261
+ " # Map the subset index to the actual index in the full dataset\n",
262
+ " original_idx = self.indices[idx]\n",
263
+ " image, label = self.full_dataset[original_idx]\n",
264
+ " return image, label\n",
265
+ "\n",
266
+ "class Food101DataModule(pl.LightningDataModule):\n",
267
+ " \"\"\"A PyTorch Lightning DataModule for the Food101 dataset.\n",
268
+ "\n",
269
+ " This module encapsulates all data-related logic, including downloading,\n",
270
+ " processing, and creating DataLoaders for the training, validation, and\n",
271
+ " test sets. It uses the CustomFood101 dataset internally and allows for\n",
272
+ " controlling the fraction of data used in the training and validation splits.\n",
273
+ "\n",
274
+ " Args:\n",
275
+ " data_dir (str, optional): Root directory for the data. Defaults to \"data\".\n",
276
+ " batch_size (int, optional): The batch size for DataLoaders. Defaults to 32.\n",
277
+ " num_workers (int, optional): Number of workers for data loading. Defaults to 2.\n",
278
+ " train_transforms (callable, optional): Transformations for the training set.\n",
279
+ " val_transforms (callable, optional): Transformations for the validation/test set.\n",
280
+ " subset_fraction (float, optional): The fraction of data to use for training\n",
281
+ " and validation. Defaults to 1.0.\n",
282
+ " \"\"\"\n",
283
+ " def __init__(self, data_dir=\"data\", batch_size=32, num_workers=2,\n",
284
+ " train_transforms=None, val_transforms=None, subset_fraction: float = 0.5):\n",
285
+ " super().__init__()\n",
286
+ " self.data_dir = data_dir\n",
287
+ " self.batch_size = batch_size\n",
288
+ " self.num_workers = num_workers\n",
289
+ " self.train_transforms = train_transforms\n",
290
+ " self.val_transforms = val_transforms\n",
291
+ " self.subset_fraction = subset_fraction\n",
292
+ "\n",
293
+ " self.classes = []\n",
294
+ "\n",
295
+ " def prepare_data(self):\n",
296
+ " \"\"\"Downloads data if needed.\"\"\"\n",
297
+ " CustomFood101(split='train', data_dir=self.data_dir)\n",
298
+ " CustomFood101(split='test', data_dir=self.data_dir)\n",
299
+ "\n",
300
+ " def setup(self, stage=None):\n",
301
+ " \"\"\"Assigns datasets, passing the subset_fraction.\"\"\"\n",
302
+ " if stage == 'fit' or stage is None:\n",
303
+ " self.train_dataset = CustomFood101(split='train', transform=self.train_transforms,\n",
304
+ " data_dir=self.data_dir, subset_fraction=self.subset_fraction)\n",
305
+ " self.val_dataset = CustomFood101(split='test', transform=self.val_transforms,\n",
306
+ " data_dir=self.data_dir, subset_fraction=self.subset_fraction)\n",
307
+ " self.classes = self.train_dataset.classes\n",
308
+ "\n",
309
+ " if stage == 'test' or stage is None:\n",
310
+ " self.test_dataset = CustomFood101(split='test', transform=self.val_transforms,\n",
311
+ " data_dir=self.data_dir, subset_fraction=1.0) # Use full test set\n",
312
+ " if not self.classes:\n",
313
+ " self.classes = self.test_dataset.classes\n",
314
+ "\n",
315
+ " def train_dataloader(self):\n",
316
+ " return DataLoader(self.train_dataset, batch_size=self.batch_size, shuffle=True, num_workers=self.num_workers)\n",
317
+ "\n",
318
+ " def val_dataloader(self):\n",
319
+ " return DataLoader(self.val_dataset, batch_size=self.batch_size, shuffle=False, num_workers=self.num_workers)\n",
320
+ "\n",
321
+ " def test_dataloader(self):\n",
322
+ " return DataLoader(self.test_dataset, batch_size=self.batch_size, shuffle=False, num_workers=self.num_workers)\n"
323
+ ]
324
+ },
325
+ {
326
+ "cell_type": "code",
327
+ "execution_count": null,
328
+ "id": "93ffd2e2",
329
+ "metadata": {},
330
+ "outputs": [],
331
+ "source": [
332
+ "# Define configuration for the script\n",
333
+ "DATA_DIR = \"data\"\n",
334
+ "MODEL_NAME = \"EfficientNet_V2_S\"\n",
335
+ "BATCH_SIZE = 32\n",
336
+ "\n",
337
+ "print(f\"Running data preparation script for model: {MODEL_NAME}\")\n",
338
+ "\n",
339
+ "# 1. Get model-specific transforms\n",
340
+ "components = get_model_components(MODEL_NAME)\n",
341
+ "train_transforms = components[\"train_transforms\"]\n",
342
+ "val_transforms = components[\"val_transforms\"]\n",
343
+ "\n",
344
+ "# 2. Instantiate the DataModule\n",
345
+ "datamodule = Food101DataModule(\n",
346
+ " data_dir=DATA_DIR,\n",
347
+ " batch_size=BATCH_SIZE,\n",
348
+ " train_transforms=train_transforms,\n",
349
+ " val_transforms=val_transforms,\n",
350
+ " subset_fraction=0.1 # Use a small subset for quick verification\n",
351
+ ")\n",
352
+ "\n",
353
+ "# 3. Trigger download and setup\n",
354
+ "datamodule.prepare_data()\n",
355
+ "datamodule.setup(stage='fit')\n",
356
+ "\n",
357
+ "# 4. (Optional) Verification Step\n",
358
+ "print(\"\\n--- Verifying Dataloader ---\")\n",
359
+ "# Get one batch from the training dataloader\n",
360
+ "train_dl = datamodule.train_dataloader()\n",
361
+ "images, labels = next(iter(train_dl))\n",
362
+ "\n",
363
+ "print(f\"Number of classes: {len(datamodule.classes)}\")\n",
364
+ "print(f\"Image batch shape: {images.shape}\")\n",
365
+ "print(f\"Label batch shape: {labels.shape}\")\n",
366
+ "print(\"--- Verification Complete ---\") "
367
+ ]
368
+ },
369
+ {
370
+ "cell_type": "markdown",
371
+ "id": "3edf64c2",
372
+ "metadata": {},
373
+ "source": [
374
+ "## 4. Model Classes"
375
+ ]
376
+ },
377
+ {
378
+ "cell_type": "code",
379
+ "execution_count": null,
380
+ "id": "eb264fe4",
381
+ "metadata": {},
382
+ "outputs": [],
383
+ "source": [
384
+ "import torch\n",
385
+ "import torchvision\n",
386
+ "import pytorch_lightning as pl\n",
387
+ "from torch import nn\n",
388
+ "from torchmetrics.classification import Accuracy, F1Score, ConfusionMatrix\n",
389
+ "import seaborn as sns\n",
390
+ "import matplotlib.pyplot as plt\n",
391
+ "import pandas as pd\n",
392
+ "import numpy as np\n",
393
+ "\n",
394
+ "class EffNetV2_S(pl.LightningModule):\n",
395
+ " \"\"\"A PyTorch Lightning Module for fine-tuning EfficientNetV2-S.\n",
396
+ "\n",
397
+ " This module encapsulates the EfficientNetV2-S model and provides a flexible\n",
398
+ " fine-tuning strategy. It can be configured for Stage 1 (training only the\n",
399
+ " classifier and later feature blocks) or Stage 2 (training the entire model).\n",
400
+ "\n",
401
+ " Args:\n",
402
+ " lr (float, optional): The learning rate. Defaults to 1e-3.\n",
403
+ " weight_decay (float, optional): Weight decay for the optimizer. Defaults to 1e-4.\n",
404
+ " num_classes (int, optional): The number of output classes. Defaults to 101.\n",
405
+ " class_names (list, optional): A list of class names for logging. Defaults to None.\n",
406
+ " freeze_features (bool, optional): If True, freezes the backbone and unfreezes\n",
407
+ " only the later blocks (Stage 1). If False, all features are trainable\n",
408
+ " (Stage 2). Defaults to True.\n",
409
+ " unfreeze_from_block (int, optional): Which feature block to start unfreezing\n",
410
+ " from. Used only if freeze_features is True. Defaults to -3 (last 3 blocks).\n",
411
+ " \"\"\"\n",
412
+ " \n",
413
+ " def __init__(\n",
414
+ " self,\n",
415
+ " lr: float = 1e-3,\n",
416
+ " weight_decay: float = 1e-4,\n",
417
+ " num_classes: int = 101,\n",
418
+ " class_names: list = None,\n",
419
+ " freeze_features: bool = True, # True = Stage 1, False = Stage 2\n",
420
+ " unfreeze_from_block: int = -3 # Only used if freeze_features=True\n",
421
+ " ):\n",
422
+ " super().__init__()\n",
423
+ " self.save_hyperparameters()\n",
424
+ " self.class_names = class_names if class_names else [str(i) for i in range(num_classes)]\n",
425
+ "\n",
426
+ " # Load pretrained weights\n",
427
+ " weights = torchvision.models.EfficientNet_V2_S_Weights.DEFAULT\n",
428
+ " self.model = torchvision.models.efficientnet_v2_s(weights=weights)\n",
429
+ "\n",
430
+ " # ---- Freezing strategy ----\n",
431
+ " if freeze_features:\n",
432
+ " # Freeze all first\n",
433
+ " for param in self.model.parameters():\n",
434
+ " param.requires_grad = False\n",
435
+ " # Unfreeze from a specific block (default: last 3 blocks)\n",
436
+ " for param in self.model.features[unfreeze_from_block:].parameters():\n",
437
+ " param.requires_grad = True\n",
438
+ " else:\n",
439
+ " # Stage 2: unfreeze everything\n",
440
+ " for param in self.model.parameters():\n",
441
+ " param.requires_grad = True\n",
442
+ "\n",
443
+ " # Classifier head\n",
444
+ " self.model.classifier = nn.Sequential(\n",
445
+ " nn.Dropout(p=0.2, inplace=True),\n",
446
+ " nn.Linear(in_features=1280, out_features=self.hparams.num_classes, bias=True)\n",
447
+ " )\n",
448
+ "\n",
449
+ " # Loss & metrics\n",
450
+ " self.loss_fn = nn.CrossEntropyLoss(label_smoothing=0.1)\n",
451
+ " self.train_accuracy = Accuracy(task=\"multiclass\", num_classes=self.hparams.num_classes)\n",
452
+ " self.val_accuracy = Accuracy(task=\"multiclass\", num_classes=self.hparams.num_classes)\n",
453
+ " self.train_f1 = F1Score(task=\"multiclass\", num_classes=self.hparams.num_classes, average='macro')\n",
454
+ " self.val_f1 = F1Score(task=\"multiclass\", num_classes=self.hparams.num_classes, average='macro')\n",
455
+ " self.val_conf_matrix = ConfusionMatrix(task=\"multiclass\", num_classes=self.hparams.num_classes)\n",
456
+ " self.test_conf_matrix = ConfusionMatrix(task=\"multiclass\", num_classes=self.hparams.num_classes)\n",
457
+ "\n",
458
+ " def forward(self, x):\n",
459
+ " return self.model(x)\n",
460
+ "\n",
461
+ " def training_step(self, batch, batch_idx):\n",
462
+ " x, y = batch\n",
463
+ " logits = self(x)\n",
464
+ " loss = self.loss_fn(logits, y)\n",
465
+ " self.train_accuracy(logits, y)\n",
466
+ " self.train_f1(logits, y)\n",
467
+ " self.log('train_loss', loss, on_step=False, on_epoch=True, prog_bar=True)\n",
468
+ " self.log('train_acc', self.train_accuracy, on_step=False, on_epoch=True, prog_bar=True)\n",
469
+ " self.log('train_f1', self.train_f1, on_step=False, on_epoch=True, prog_bar=True)\n",
470
+ " return loss\n",
471
+ "\n",
472
+ " def validation_step(self, batch, batch_idx):\n",
473
+ " x, y = batch\n",
474
+ " logits = self(x)\n",
475
+ " loss = self.loss_fn(logits, y)\n",
476
+ " self.val_accuracy(logits, y)\n",
477
+ " self.val_f1(logits, y)\n",
478
+ " self.log('val_loss', loss, prog_bar=True)\n",
479
+ " self.log('val_acc', self.val_accuracy, prog_bar=True)\n",
480
+ " self.log('val_f1', self.val_f1, prog_bar=True)\n",
481
+ " self.val_conf_matrix.update(logits, y)\n",
482
+ "\n",
483
+ " def on_validation_epoch_end(self):\n",
484
+ " cm = self.val_conf_matrix.compute()\n",
485
+ " per_class_acc = cm.diag() / (cm.sum(dim=1) + 1e-6)\n",
486
+ " print(\"\\n--- Per-Class Validation Accuracy ---\")\n",
487
+ " for i, acc in enumerate(per_class_acc):\n",
488
+ " self.log(f'val_acc/{self.class_names[i]}', acc.item(), on_epoch=True)\n",
489
+ " print(f\"{self.class_names[i]:<20}: {acc.item():.4f}\")\n",
490
+ " print(\"------------------------------------\")\n",
491
+ " self.val_conf_matrix.reset()\n",
492
+ "\n",
493
+ " def test_step(self, batch, batch_idx):\n",
494
+ " x, y = batch\n",
495
+ " logits = self(x)\n",
496
+ " self.test_conf_matrix.update(logits, y)\n",
497
+ "\n",
498
+ " def on_test_end(self):\n",
499
+ " cm = self.test_conf_matrix.compute()\n",
500
+ " print(\"\\nGenerating final confusion matrix plot...\")\n",
501
+ " self.test_conf_matrix.reset()\n",
502
+ "\n",
503
+ " def configure_optimizers(self):\n",
504
+ " optimizer = torch.optim.Adam(\n",
505
+ " self.parameters(),\n",
506
+ " lr=self.hparams.lr,\n",
507
+ " weight_decay=self.hparams.weight_decay\n",
508
+ " )\n",
509
+ " scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(\n",
510
+ " optimizer,\n",
511
+ " T_max=self.trainer.max_epochs,\n",
512
+ " eta_min=1e-6\n",
513
+ " )\n",
514
+ " return {\"optimizer\": optimizer, \"lr_scheduler\": {\"scheduler\": scheduler, \"interval\": \"epoch\"}}\n",
515
+ " \n",
516
+ "class EffNetb2(pl.LightningModule):\n",
517
+ " \"\"\"A PyTorch Lightning Module for fine-tuning EfficientNet-B2.\n",
518
+ "\n",
519
+ " This module encapsulates the EfficientNet-B2 model and provides a flexible\n",
520
+ " fine-tuning strategy. It can be configured for Stage 1 (training only the\n",
521
+ " classifier and later feature blocks) or Stage 2 (training the entire model).\n",
522
+ "\n",
523
+ " Args:\n",
524
+ " lr (float, optional): The learning rate. Defaults to 1e-3.\n",
525
+ " weight_decay (float, optional): Weight decay for the optimizer. Defaults to 1e-4.\n",
526
+ " num_classes (int, optional): The number of output classes. Defaults to 101.\n",
527
+ " class_names (list, optional): A list of class names for logging. Defaults to None.\n",
528
+ " freeze_features (bool, optional): If True, freezes the backbone and unfreezes\n",
529
+ " only the later blocks (Stage 1). If False, all features are trainable\n",
530
+ " (Stage 2). Defaults to True.\n",
531
+ " unfreeze_from_block (int, optional): Which feature block to start unfreezing\n",
532
+ " from. Used only if freeze_features is True. Defaults to -3 (last 3 blocks).\n",
533
+ " \"\"\"\n",
534
+ "\n",
535
+ " def __init__(\n",
536
+ " self,\n",
537
+ " lr: float = 1e-3,\n",
538
+ " weight_decay: float = 1e-4,\n",
539
+ " num_classes: int = 101,\n",
540
+ " class_names: list = None,\n",
541
+ " freeze_features: bool = True,\n",
542
+ " unfreeze_from_block: int = -3\n",
543
+ " ):\n",
544
+ " super().__init__()\n",
545
+ " self.save_hyperparameters()\n",
546
+ " self.class_names = class_names if class_names is not None else [str(i) for i in range(num_classes)]\n",
547
+ "\n",
548
+ " # Model setup\n",
549
+ " weights = torchvision.models.EfficientNet_B2_Weights.DEFAULT\n",
550
+ " self.model = torchvision.models.efficientnet_b2(weights=weights)\n",
551
+ " \n",
552
+ " # --- : Flexible Freezing Strategy ---\n",
553
+ " if self.hparams.freeze_features:\n",
554
+ " # Stage 1: Freeze all first\n",
555
+ " for param in self.model.parameters():\n",
556
+ " param.requires_grad = False\n",
557
+ " # Unfreeze from a specific block (default: last 3 blocks)\n",
558
+ " for param in self.model.features[self.hparams.unfreeze_from_block:].parameters():\n",
559
+ " param.requires_grad = True\n",
560
+ " else:\n",
561
+ " # Stage 2: unfreeze everything\n",
562
+ " for param in self.model.parameters():\n",
563
+ " param.requires_grad = True\n",
564
+ "\n",
565
+ " # Classifier head\n",
566
+ " self.model.classifier = nn.Sequential(\n",
567
+ " nn.Dropout(p=0.3, inplace=True),\n",
568
+ " nn.Linear(in_features=1408, out_features=self.hparams.num_classes)\n",
569
+ " )\n",
570
+ "\n",
571
+ " # Metrics\n",
572
+ " self.loss_fn = nn.CrossEntropyLoss(label_smoothing=0.1)\n",
573
+ " self.train_accuracy = Accuracy(task=\"multiclass\", num_classes=self.hparams.num_classes)\n",
574
+ " self.val_accuracy = Accuracy(task=\"multiclass\", num_classes=self.hparams.num_classes)\n",
575
+ " self.train_f1 = F1Score(task=\"multiclass\", num_classes=self.hparams.num_classes, average='macro')\n",
576
+ " self.val_f1 = F1Score(task=\"multiclass\", num_classes=self.hparams.num_classes, average='macro')\n",
577
+ " self.val_conf_matrix = ConfusionMatrix(task=\"multiclass\", num_classes=self.hparams.num_classes)\n",
578
+ " self.test_conf_matrix = ConfusionMatrix(task=\"multiclass\", num_classes=self.hparams.num_classes)\n",
579
+ "\n",
580
+ " def forward(self, x):\n",
581
+ " return self.model(x)\n",
582
+ "\n",
583
+ " def training_step(self, batch, batch_idx):\n",
584
+ " x, y = batch\n",
585
+ " logits = self(x)\n",
586
+ " loss = self.loss_fn(logits, y)\n",
587
+ " self.train_accuracy(logits, y)\n",
588
+ " self.train_f1(logits, y)\n",
589
+ " self.log('train_loss', loss, on_step=False, on_epoch=True, prog_bar=True)\n",
590
+ " self.log('train_acc', self.train_accuracy, on_step=False, on_epoch=True, prog_bar=True)\n",
591
+ " self.log('train_f1', self.train_f1, on_step=False, on_epoch=True, prog_bar=True)\n",
592
+ " return loss\n",
593
+ "\n",
594
+ " def validation_step(self, batch, batch_idx):\n",
595
+ " x, y = batch\n",
596
+ " logits = self(x)\n",
597
+ " loss = self.loss_fn(logits, y)\n",
598
+ " self.val_accuracy(logits, y)\n",
599
+ " self.val_f1(logits, y)\n",
600
+ " self.log('val_loss', loss, prog_bar=True)\n",
601
+ " self.log('val_acc', self.val_accuracy, prog_bar=True)\n",
602
+ " self.log('val_f1', self.val_f1, prog_bar=True)\n",
603
+ " self.val_conf_matrix.update(logits, y)\n",
604
+ "\n",
605
+ " def on_validation_epoch_end(self):\n",
606
+ " cm = self.val_conf_matrix.compute()\n",
607
+ "\n",
608
+ " # Add a small epsilon (1e-6) to the denominator for numerical stability.\n",
609
+ " per_class_acc = cm.diag() / (cm.sum(dim=1) + 1e-6)\n",
610
+ "\n",
611
+ " print(\"\\n--- Per-Class Validation Accuracy ---\")\n",
612
+ " for i, acc in enumerate(per_class_acc):\n",
613
+ " class_name = self.class_names[i]\n",
614
+ " self.log(f'val_acc/{class_name}', acc.item(), on_epoch=True)\n",
615
+ " print(f\"{class_name:<20}: {acc.item():.4f}\")\n",
616
+ " print(\"------------------------------------\")\n",
617
+ "\n",
618
+ " self.val_conf_matrix.reset()\n",
619
+ "\n",
620
+ " def test_step(self, batch, batch_idx):\n",
621
+ " x, y = batch\n",
622
+ " logits = self(x)\n",
623
+ " self.test_conf_matrix.update(logits, y)\n",
624
+ "\n",
625
+ " def on_test_end(self):\n",
626
+ " cm = self.test_conf_matrix.compute()\n",
627
+ " print(\"\\nGenerating final confusion matrix plot...\")\n",
628
+ " # Assuming plot_confusion_matrix is defined elsewhere\n",
629
+ " # plot_confusion_matrix(cm.cpu().numpy(), class_names=self.class_names)\n",
630
+ " self.test_conf_matrix.reset()\n",
631
+ "\n",
632
+ " def configure_optimizers(self):\n",
633
+ " optimizer = torch.optim.Adam(\n",
634
+ " self.parameters(),\n",
635
+ " lr=self.hparams.lr,\n",
636
+ " weight_decay=self.hparams.weight_decay\n",
637
+ " )\n",
638
+ " scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(\n",
639
+ " optimizer,\n",
640
+ " T_max=self.trainer.max_epochs,\n",
641
+ " eta_min=1e-6\n",
642
+ " )\n",
643
+ " return {\n",
644
+ " \"optimizer\": optimizer,\n",
645
+ " \"lr_scheduler\": {\n",
646
+ " \"scheduler\": scheduler,\n",
647
+ " \"interval\": \"epoch\",\n",
648
+ " },\n",
649
+ " }\n"
650
+ ]
651
+ },
652
+ {
653
+ "cell_type": "markdown",
654
+ "id": "3f5bf233",
655
+ "metadata": {},
656
+ "source": [
657
+ "## 5. Training and plotting the Confusion Matrix"
658
+ ]
659
+ },
660
+ {
661
+ "cell_type": "code",
662
+ "execution_count": null,
663
+ "id": "6b080afd",
664
+ "metadata": {},
665
+ "outputs": [],
666
+ "source": [
667
+ "import pytorch_lightning as pl\n",
668
+ "from pytorch_lightning import Trainer, LightningModule\n",
669
+ "from pytorch_lightning.loggers import CSVLogger\n",
670
+ "from pytorch_lightning.callbacks import EarlyStopping ,ModelCheckpoint\n",
671
+ "from typing import Optional\n",
672
+ "import matplotlib.pyplot as plt\n",
673
+ "import seaborn as sns\n",
674
+ "import numpy as np\n",
675
+ "import pandas as pd\n",
676
+ "from typing import List\n",
677
+ "\n",
678
+ "DATA_DIR = \"data\"\n",
679
+ "MODEL_NAME = \"EfficientNet_V2_S\"\n",
680
+ "BATCH_SIZE = 32\n",
681
+ "SUBSET_FRACTION = 0.2 # Useing a smaller subset for quick testing\n",
682
+ "CHECKPOINT_PATH = \"checkpoints/best-model-epoch=22-val_acc=0.8541.ckpt\" # Path to your trained model checkpoint\n",
683
+ "\n",
684
+ "def plot_confusion_matrix(cm: np.ndarray, class_names: List[str], figsize: tuple = (25, 25)):\n",
685
+ " \"\"\"\n",
686
+ " Creates and saves a multi-class confusion matrix plot.\n",
687
+ "\n",
688
+ " This function normalizes the confusion matrix to show prediction\n",
689
+ " percentages for each class, visualizes it as a heatmap, and saves\n",
690
+ " the resulting figure to a file.\n",
691
+ "\n",
692
+ " Args:\n",
693
+ " cm (np.ndarray): The confusion matrix from torchmetrics or scikit-learn.\n",
694
+ " class_names (List[str]): A list of class names for the labels.\n",
695
+ " figsize (tuple, optional): The size of the figure. Defaults to (25, 25).\n",
696
+ " \"\"\"\n",
697
+ " # 1. Normalize the confusion matrix to show percentages\n",
698
+ " # Add a small epsilon to prevent division by zero\n",
699
+ " cm_normalized = cm.astype('float') / (cm.sum(axis=1)[:, np.newaxis] + 1e-6)\n",
700
+ "\n",
701
+ " # 2. Create a DataFrame for a beautiful plot with labels\n",
702
+ " df_cm = pd.DataFrame(cm_normalized, index=class_names, columns=class_names)\n",
703
+ "\n",
704
+ " # 3. Create the plot\n",
705
+ " plt.figure(figsize=figsize)\n",
706
+ " heatmap = sns.heatmap(df_cm, annot=False, cmap='Blues') # Annotations off for 101 classes\n",
707
+ "\n",
708
+ " # 4. Format the plot\n",
709
+ " heatmap.yaxis.set_ticklabels(heatmap.yaxis.get_ticklabels(), rotation=0, ha='right', fontsize=8)\n",
710
+ " heatmap.xaxis.set_ticklabels(heatmap.xaxis.get_ticklabels(), rotation=45, ha='right', fontsize=8)\n",
711
+ "\n",
712
+ " plt.ylabel('True Label')\n",
713
+ " plt.xlabel('Predicted Label')\n",
714
+ " plt.title('Normalized Confusion Matrix')\n",
715
+ " plt.tight_layout()\n",
716
+ "\n",
717
+ " # 5. Save the figure and show the plot\n",
718
+ " plt.savefig('confusion_matrix.png', dpi=300)\n",
719
+ " print(\"Confusion matrix plot saved to confusion_matrix.png\")\n",
720
+ " plt.show()\n",
721
+ "\n",
722
+ "def run_training_session(\n",
723
+ " model_name: str = \"EfficientNet_V2_S\",\n",
724
+ " batch_size: int = 32,\n",
725
+ " data_dir: str = 'data',\n",
726
+ " subset_fraction: float = 1.0,\n",
727
+ " checkpoint_path: str = \"checkpoints/\",\n",
728
+ " lr: float = 1e-3,\n",
729
+ " weight_decay: float = 1e-4,\n",
730
+ " freeze_features: bool = True,\n",
731
+ " early_stopping_patience: int = 5,\n",
732
+ " max_epochs: int = 100,\n",
733
+ " accelerator: str = 'auto',\n",
734
+ " resume_from_checkpoint: Optional[str] = None\n",
735
+ ") -> Trainer:\n",
736
+ " \"\"\"\n",
737
+ " Sets up and runs a complete training session for a specified model.\n",
738
+ "\n",
739
+ " This function handles the entire pipeline: data preparation, model\n",
740
+ " instantiation, logger and callback setup, and trainer execution.\n",
741
+ "\n",
742
+ " Args:\n",
743
+ " model_name (str): The name of the model architecture to train.\n",
744
+ " batch_size (int): The number of samples per batch.\n",
745
+ " data_dir (str): The root directory for the dataset.\n",
746
+ " subset_fraction (float): The fraction of the dataset to use for training.\n",
747
+ " checkpoint_path (str): Directory to save model checkpoints.\n",
748
+ " lr (float): The learning rate for the optimizer.\n",
749
+ " weight_decay (float): The weight decay for the optimizer.\n",
750
+ " freeze_features (bool): Flag to control the fine-tuning strategy\n",
751
+ " (e.g., for two-stage training).\n",
752
+ " early_stopping_patience (int): Number of epochs with no improvement\n",
753
+ " after which training will be stopped.\n",
754
+ " max_epochs (int): The maximum number of epochs to train for.\n",
755
+ " accelerator (str): The hardware accelerator to use ('auto', 'cpu', 'gpu').\n",
756
+ " resume_from_checkpoint (Optional[str]): Path to a checkpoint file to\n",
757
+ " resume training from. Defaults to None.\n",
758
+ "\n",
759
+ " Returns:\n",
760
+ " Trainer: The PyTorch Lightning Trainer object after fitting is complete.\n",
761
+ " \"\"\"\n",
762
+ " # A registry to map model names to their actual classes\n",
763
+ " model_class_registry = {\n",
764
+ " \"EfficientNet_V2_S\": EffNetV2_S,\n",
765
+ " \"EfficientNet_B2\": EffNetb2,\n",
766
+ " }\n",
767
+ " if model_name not in model_class_registry:\n",
768
+ " raise ValueError(f\"Model '{model_name}' is not a recognized class.\")\n",
769
+ "\n",
770
+ " # Get model-specific transforms\n",
771
+ " components = get_model_components(model_name)\n",
772
+ " train_transforms = components[\"train_transforms\"]\n",
773
+ " val_transforms = components[\"val_transforms\"]\n",
774
+ "\n",
775
+ " # Set up the DataModule\n",
776
+ " food_datamodule = Food101DataModule(\n",
777
+ " data_dir=data_dir,\n",
778
+ " batch_size=batch_size,\n",
779
+ " train_transforms=train_transforms,\n",
780
+ " val_transforms=val_transforms,\n",
781
+ " subset_fraction=subset_fraction\n",
782
+ " )\n",
783
+ " food_datamodule.prepare_data()\n",
784
+ " food_datamodule.setup()\n",
785
+ "\n",
786
+ " # Instantiate the model dynamically\n",
787
+ " model_class = model_class_registry[model_name]\n",
788
+ " model = model_class(\n",
789
+ " num_classes=len(food_datamodule.classes),\n",
790
+ " class_names=food_datamodule.classes,\n",
791
+ " lr=lr,\n",
792
+ " weight_decay=weight_decay,\n",
793
+ " freeze_features=freeze_features\n",
794
+ " )\n",
795
+ "\n",
796
+ " # Set up logger and callbacks\n",
797
+ " logger = CSVLogger(save_dir=\"logs/\", name=model_name)\n",
798
+ " \n",
799
+ " early_stop_callback = EarlyStopping(\n",
800
+ " monitor=\"val_loss\",\n",
801
+ " patience=early_stopping_patience,\n",
802
+ " mode=\"min\"\n",
803
+ " )\n",
804
+ " best_model_checkpoint = ModelCheckpoint(\n",
805
+ " dirpath=checkpoint_path,\n",
806
+ " filename=\"best-model-{epoch:02d}-{val_acc:.4f}\",\n",
807
+ " save_top_k=1,\n",
808
+ " monitor=\"val_acc\",\n",
809
+ " mode=\"max\"\n",
810
+ " )\n",
811
+ " \n",
812
+ " callbacks = [early_stop_callback, best_model_checkpoint]\n",
813
+ "\n",
814
+ " # Instantiate the Trainer\n",
815
+ " trainer = Trainer(\n",
816
+ " max_epochs=max_epochs,\n",
817
+ " accelerator=accelerator,\n",
818
+ " callbacks=callbacks,\n",
819
+ " logger=logger,\n",
820
+ " )\n",
821
+ "\n",
822
+ " # Start training\n",
823
+ " trainer.fit(\n",
824
+ " model,\n",
825
+ " datamodule=food_datamodule,\n",
826
+ " ckpt_path=resume_from_checkpoint \n",
827
+ " )\n",
828
+ " \n",
829
+ " return trainer\n"
830
+ ]
831
+ },
832
+ {
833
+ "cell_type": "code",
834
+ "execution_count": null,
835
+ "id": "04c534dc",
836
+ "metadata": {},
837
+ "outputs": [],
838
+ "source": [
839
+ "# --- 1. DEFINE YOUR TRAINING CONFIGURATION HERE ---\n",
840
+ "config = {\n",
841
+ " \"model_name\": \"EfficientNet_V2_S\",\n",
842
+ " \"batch_size\": 32,\n",
843
+ " \"lr\": 1e-4,\n",
844
+ " \"epochs\": 50,\n",
845
+ " \"subset_fraction\": 1.0, # Use 1.0 for the full dataset\n",
846
+ " \"freeze_features\": True,\n",
847
+ " \"early_stopping_patience\": 10\n",
848
+ "}\n",
849
+ "\n",
850
+ "# --- 2. PRINT CONFIGURATION AND START TRAINING ---\n",
851
+ "print(\"--- Starting Training Session ---\")\n",
852
+ "for key, value in config.items():\n",
853
+ " print(f\" {key}: {value}\")\n",
854
+ "print(\"---------------------------------\")\n",
855
+ "\n",
856
+ "run_training_session(\n",
857
+ " model_name=config[\"model_name\"],\n",
858
+ " batch_size=config[\"batch_size\"],\n",
859
+ " lr=config[\"lr\"],\n",
860
+ " max_epochs=config[\"epochs\"],\n",
861
+ " subset_fraction=config[\"subset_fraction\"],\n",
862
+ " freeze_features=config[\"freeze_features\"],\n",
863
+ " early_stopping_patience=config[\"early_stopping_patience\"]\n",
864
+ ")\n",
865
+ "\n",
866
+ "print(\"\\n--- Training Session Complete ---\")\n",
867
+ "\n",
868
+ "print(\"\\n--- Starting Evaluation on Test Set ---\")\n",
869
+ "\n",
870
+ "print(f\"Loading model from checkpoint: {CHECKPOINT_PATH}\")\n",
871
+ "\n",
872
+ "# Step 1: Set up the DataModule for the test set\n",
873
+ "components = get_model_components(MODEL_NAME)\n",
874
+ "val_transforms = components[\"val_transforms\"]\n",
875
+ "\n",
876
+ "datamodule = Food101DataModule(\n",
877
+ " data_dir=DATA_DIR,\n",
878
+ " batch_size=BATCH_SIZE,\n",
879
+ " val_transforms=val_transforms\n",
880
+ ")\n",
881
+ "# This prepares the test dataloader specifically\n",
882
+ "datamodule.setup(stage='test')\n",
883
+ "\n",
884
+ "# Step 2: Load the trained model from the checkpoint file\n",
885
+ "model = EffNetV2_S.load_from_checkpoint(CHECKPOINT_PATH)\n",
886
+ "model.class_names = datamodule.classes\n",
887
+ "model.eval() # Set the model to evaluation mode\n",
888
+ "\n",
889
+ "# Step 3: Create a Trainer and run the test\n",
890
+ "trainer = pl.Trainer(accelerator='auto')\n",
891
+ "\n",
892
+ "# This call will run the test_step and automatically trigger the \n",
893
+ "# on_test_end hook in your model, which generates the plot.\n",
894
+ "trainer.test(model, datamodule=datamodule)\n",
895
+ "\n",
896
+ "print(\"\\nEvaluation complete. The confusion matrix plot has been saved.\")"
897
+ ]
898
+ },
899
+ {
900
+ "cell_type": "markdown",
901
+ "id": "2325adef",
902
+ "metadata": {},
903
+ "source": [
904
+ "## 6. Local Gradio Demo"
905
+ ]
906
+ },
907
+ {
908
+ "cell_type": "code",
909
+ "execution_count": null,
910
+ "id": "44decdea",
911
+ "metadata": {},
912
+ "outputs": [],
913
+ "source": [
914
+ "FOOD101_CLASSES = [\n",
915
+ " 'apple_pie', 'baby_back_ribs', 'baklava', 'beef_carpaccio', 'beef_tartare', \n",
916
+ " 'beet_salad', 'beignets', 'bibimbap', 'bread_pudding', 'breakfast_burrito', \n",
917
+ " 'bruschetta', 'caesar_salad', 'cannoli', 'caprese_salad', 'carrot_cake', \n",
918
+ " 'ceviche', 'cheesecake', 'cheese_plate', 'chicken_curry', 'chicken_quesadilla', \n",
919
+ " 'chicken_wings', 'chocolate_cake', 'chocolate_mousse', 'churros', \n",
920
+ " 'clam_chowder', 'club_sandwich', 'crab_cakes', 'creme_brulee', 'croque_madame', \n",
921
+ " 'cup_cakes', 'deviled_eggs', 'donuts', 'dumplings', 'edamame', 'eggs_benedict', \n",
922
+ " 'escargots', 'falafel', 'filet_mignon', 'fish_and_chips', 'foie_gras', \n",
923
+ " 'french_fries', 'french_onion_soup', 'french_toast', 'fried_calamari', \n",
924
+ " 'fried_rice', 'frozen_yogurt', 'garlic_bread', 'gnocchi', 'greek_salad', \n",
925
+ " 'grilled_cheese_sandwich', 'grilled_salmon', 'guacamole', 'gyoza', 'hamburger', \n",
926
+ " 'hot_and_sour_soup', 'hot_dog', 'huevos_rancheros', 'hummus', 'ice_cream', \n",
927
+ " 'lasagna', 'lobster_bisque', 'lobster_roll_sandwich', 'macaroni_and_cheese', \n",
928
+ " 'macarons', 'miso_soup', 'mussels', 'nachos', 'omelette', 'onion_rings', \n",
929
+ " 'oysters', 'pad_thai', 'paella', 'pancakes', 'panna_cotta', 'peking_duck', \n",
930
+ " 'pho', 'pizza', 'pork_chop', 'poutine', 'prime_rib', 'pulled_pork_sandwich', \n",
931
+ " 'ramen', 'ravioli', 'red_velvet_cake', 'risotto', 'samosa', 'sashimi', \n",
932
+ " 'scallops', 'seaweed_salad', 'shrimp_and_grits', 'spaghetti_bolognese', \n",
933
+ " 'spaghetti_carbonara', 'spring_rolls', 'steak', 'strawberry_shortcake', \n",
934
+ " 'sushi', 'tacos', 'takoyaki', 'tiramisu', 'tuna_tartare', 'waffles'\n",
935
+ "]"
936
+ ]
937
+ },
938
+ {
939
+ "cell_type": "code",
940
+ "execution_count": null,
941
+ "id": "10bdf9fd",
942
+ "metadata": {},
943
+ "outputs": [],
944
+ "source": [
945
+ "import gradio as gr\n",
946
+ "import torch\n",
947
+ "from gradio.themes.base import Base\n",
948
+ "from torchvision.datasets import Food101\n",
949
+ "\n",
950
+ "# --- 1. Configuration ---\n",
951
+ "MODEL_PATH = \"checkpoints/best-model-epoch=22-val_acc=0.8541.ckpt\" \n",
952
+ "MODEL_NAME = \"EfficientNet_V2_S\"\n",
953
+ "\n",
954
+ "theme = gr.themes.Soft(\n",
955
+ " primary_hue=\"orange\",\n",
956
+ " secondary_hue=\"blue\",\n",
957
+ ").set(\n",
958
+ "\n",
959
+ " body_background_fill=\"#f2f2f2\"\n",
960
+ ")\n",
961
+ "\n",
962
+ "# --- 2. Load Model and Assets ---\n",
963
+ "print(\"Loading model and assets...\")\n",
964
+ "model = EffNetV2_S.load_from_checkpoint(MODEL_PATH)\n",
965
+ "model.eval()\n",
966
+ "\n",
967
+ "components = get_model_components(MODEL_NAME)\n",
968
+ "transforms = components[\"val_transforms\"]\n",
969
+ "class_names = FOOD101_CLASSES \n",
970
+ "\n",
971
+ "print(\"Model and assets loaded successfully.\")\n",
972
+ "\n",
973
+ "# --- 3. Prediction Function ---\n",
974
+ "def predict(image):\n",
975
+ " \"\"\"\n",
976
+ " Takes a PIL image, preprocesses it, and returns the model's top 3 predictions.\n",
977
+ " \"\"\"\n",
978
+ " # 1. Preprocess the image and add a batch dimension\n",
979
+ " input_tensor = transforms(image).unsqueeze(0)\n",
980
+ " \n",
981
+ " # 2. Move the input tensor to the same device as the model\n",
982
+ " # This ensures both the model and the data are on the GPU.\n",
983
+ " device = next(model.parameters()).device\n",
984
+ " input_tensor = input_tensor.to(device)\n",
985
+ " \n",
986
+ " # 3. Make a prediction\n",
987
+ " with torch.no_grad():\n",
988
+ " output = model(input_tensor)\n",
989
+ " \n",
990
+ " # 4. Post-process the output\n",
991
+ " probabilities = torch.nn.functional.softmax(output[0], dim=0)\n",
992
+ " confidences = {class_names[i]: float(probabilities[i]) for i in range(len(class_names))}\n",
993
+ " \n",
994
+ " return confidences\n",
995
+ " \n",
996
+ "\n",
997
+ "demo = gr.Interface(\n",
998
+ " fn=predict,\n",
999
+ " inputs=gr.Image(type=\"pil\", label=\"Upload a Food Image\"),\n",
1000
+ " outputs=gr.Label(num_top_classes=3, label=\"Top Predictions\"),\n",
1001
+ " theme=theme,\n",
1002
+ " \n",
1003
+ " # UI Enhancements\n",
1004
+ " title=\"🍔 Food-101 Image Classifier 🍟\",\n",
1005
+ " description=(\n",
1006
+ " \"What's on your plate? Upload an image or try one of the examples below to classify it. \"\n",
1007
+ " \"This demo uses an EfficientNetV2-S model fine-tuned on the Food-101 dataset.\"\n",
1008
+ " ),\n",
1009
+ " article=(\n",
1010
+ " \"<p style='text-align: center;'>A project by Daniel Kiani. \"\n",
1011
+ " \"<a href='https://github.com/Deathshot78/Food101-Classification' target='_blank'>Check out the code on GitHub!</a></p>\"\n",
1012
+ " ),\n",
1013
+ " examples=[\n",
1014
+ " [\"assets/ramen.jpg\"],\n",
1015
+ " [\"assets/pizza.jpg\"],\n",
1016
+ " [\"assets/oysters.jpg\"],\n",
1017
+ " [\"assets/onion_rings.jpg\"]\n",
1018
+ " ]\n",
1019
+ ")"
1020
+ ]
1021
+ },
1022
+ {
1023
+ "cell_type": "code",
1024
+ "execution_count": null,
1025
+ "id": "b536610d",
1026
+ "metadata": {},
1027
+ "outputs": [],
1028
+ "source": [
1029
+ "# Launch the Gradio app locally\n",
1030
+ "demo.launch()"
1031
+ ]
1032
+ }
1033
+ ],
1034
+ "metadata": {
1035
+ "kernelspec": {
1036
+ "display_name": "Python 3",
1037
+ "language": "python",
1038
+ "name": "python3"
1039
+ },
1040
+ "language_info": {
1041
+ "codemirror_mode": {
1042
+ "name": "ipython",
1043
+ "version": 3
1044
+ },
1045
+ "file_extension": ".py",
1046
+ "mimetype": "text/x-python",
1047
+ "name": "python",
1048
+ "nbconvert_exporter": "python",
1049
+ "pygments_lexer": "ipython3",
1050
+ "version": "3.10.6"
1051
+ }
1052
+ },
1053
+ "nbformat": 4,
1054
+ "nbformat_minor": 5
1055
+ }
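The per-class accuracy in `on_validation_epoch_end` and the row normalization in `plot_confusion_matrix` can be sketched on a toy matrix. The 3-class matrix below is an illustrative stand-in for the real 101-class one; the epsilon trick is the same one the notebook uses to avoid division by zero:

```python
import numpy as np

# Toy 3-class confusion matrix (rows = true labels, cols = predictions).
cm = np.array([
    [8, 1, 1],
    [2, 6, 2],
    [0, 0, 10],
])

# Per-class accuracy: diagonal over row sums, with a small epsilon for
# numerical stability (mirrors on_validation_epoch_end).
per_class_acc = cm.diagonal() / (cm.sum(axis=1) + 1e-6)

# Row-normalized matrix, as computed by plot_confusion_matrix before
# handing the values to seaborn's heatmap.
cm_normalized = cm.astype('float') / (cm.sum(axis=1)[:, np.newaxis] + 1e-6)
```

Each row of `cm_normalized` sums to (almost exactly) 1, so the heatmap cells read directly as "fraction of true-class samples predicted as each class".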
requirements.txt ADDED
Binary file (320 Bytes).
 
scripts/app.py ADDED
@@ -0,0 +1,82 @@
1
+ import gradio as gr
2
+ import torch
3
+ from gradio.themes.base import Base
4
+ from torchvision.datasets import Food101
5
+ from models import EffNetV2_S
6
+ from prepare_data import get_model_components
7
+ from class_names import FOOD101_CLASSES
8
+
9
+ # --- 1. Configuration ---
10
+ MODEL_PATH = "checkpoints/best-model-epoch=22-val_acc=0.8541.ckpt"
11
+ MODEL_NAME = "EfficientNet_V2_S"
12
+
13
+ theme = gr.themes.Soft(
14
+ primary_hue="orange",
15
+ secondary_hue="blue",
16
+ ).set(
17
+
18
+ body_background_fill="#f2f2f2"
19
+ )
20
+
21
+ # --- 2. Load Model and Assets ---
22
+ print("Loading model and assets...")
23
+ model = EffNetV2_S.load_from_checkpoint(MODEL_PATH)
24
+ model.eval()
25
+
26
+ components = get_model_components(MODEL_NAME)
27
+ transforms = components["val_transforms"]
28
+ class_names = FOOD101_CLASSES
29
+
30
+ print("Model and assets loaded successfully.")
31
+
32
+ # --- 3. Prediction Function ---
33
+ def predict(image):
34
+ """
35
+ Takes a PIL image, preprocesses it, and returns the model's top 3 predictions.
36
+ """
37
+ # 1. Preprocess the image and add a batch dimension
38
+ input_tensor = transforms(image).unsqueeze(0)
39
+
40
+ # 2. Move the input tensor to the same device as the model
41
+ # This ensures both the model and the data are on the GPU.
42
+ device = next(model.parameters()).device
43
+ input_tensor = input_tensor.to(device)
44
+
45
+ # 3. Make a prediction
46
+ with torch.no_grad():
47
+ output = model(input_tensor)
48
+
49
+ # 4. Post-process the output
50
+ probabilities = torch.nn.functional.softmax(output[0], dim=0)
51
+ confidences = {class_names[i]: float(probabilities[i]) for i in range(len(class_names))}
52
+
53
+ return confidences
54
+
55
+
56
+ demo = gr.Interface(
57
+ fn=predict,
58
+ inputs=gr.Image(type="pil", label="Upload a Food Image"),
59
+ outputs=gr.Label(num_top_classes=3, label="Top Predictions"),
60
+ theme=theme,
61
+
62
+ # UI Enhancements
63
+ title="🍔 Food-101 Image Classifier 🍟",
64
+ description=(
65
+ "What's on your plate? Upload an image or try one of the examples below to classify it. "
66
+ "This demo uses an EfficientNetV2-S model fine-tuned on the Food-101 dataset."
67
+ ),
68
+ article=(
69
+ "<p style='text-align: center;'>A project by Daniel Kiani. "
70
+ "<a href='https://github.com/Deathshot78/Food101-Classification' target='_blank'>Check out the code on GitHub!</a></p>"
71
+ ),
72
+ examples=[
73
+ ["assets/ramen.jpg"],
74
+ ["assets/pizza.jpg"],
75
+ ["assets/oysters.jpg"],
76
+ ["assets/onion_rings.jpg"]
77
+ ]
78
+ )
79
+
80
+ # --- 5. Launch the App ---
81
+ if __name__ == "__main__":
82
+ demo.launch(debug=True)
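The post-processing step in `predict()` can be illustrated without a trained model. The NumPy sketch below mirrors the softmax-and-dict logic on made-up logits for three hypothetical class names (the real app uses `torch.nn.functional.softmax` over all 101 classes):

```python
import numpy as np

# Toy logits for three hypothetical classes, standing in for the
# model's real 101-dimensional output.
logits = np.array([2.0, 1.0, 0.0])
class_names = ['pizza', 'ramen', 'sushi']

# Numerically stable softmax, equivalent to
# torch.nn.functional.softmax(output[0], dim=0).
exp = np.exp(logits - logits.max())
probabilities = exp / exp.sum()

# The same {class_name: confidence} mapping predict() returns;
# gr.Label sorts it and shows the top-k entries.
confidences = {class_names[i]: float(probabilities[i]) for i in range(len(class_names))}
top_class = max(confidences, key=confidences.get)
```

Returning the full dictionary (rather than pre-sorting) lets `gr.Label(num_top_classes=3)` handle ranking and display on its own.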
scripts/class_names.py ADDED
@@ -0,0 +1,22 @@
1
+ FOOD101_CLASSES = [
2
+ 'apple_pie', 'baby_back_ribs', 'baklava', 'beef_carpaccio', 'beef_tartare',
3
+ 'beet_salad', 'beignets', 'bibimbap', 'bread_pudding', 'breakfast_burrito',
4
+ 'bruschetta', 'caesar_salad', 'cannoli', 'caprese_salad', 'carrot_cake',
5
+ 'ceviche', 'cheesecake', 'cheese_plate', 'chicken_curry', 'chicken_quesadilla',
6
+ 'chicken_wings', 'chocolate_cake', 'chocolate_mousse', 'churros',
7
+ 'clam_chowder', 'club_sandwich', 'crab_cakes', 'creme_brulee', 'croque_madame',
8
+ 'cup_cakes', 'deviled_eggs', 'donuts', 'dumplings', 'edamame', 'eggs_benedict',
9
+ 'escargots', 'falafel', 'filet_mignon', 'fish_and_chips', 'foie_gras',
10
+ 'french_fries', 'french_onion_soup', 'french_toast', 'fried_calamari',
11
+ 'fried_rice', 'frozen_yogurt', 'garlic_bread', 'gnocchi', 'greek_salad',
12
+ 'grilled_cheese_sandwich', 'grilled_salmon', 'guacamole', 'gyoza', 'hamburger',
13
+ 'hot_and_sour_soup', 'hot_dog', 'huevos_rancheros', 'hummus', 'ice_cream',
14
+ 'lasagna', 'lobster_bisque', 'lobster_roll_sandwich', 'macaroni_and_cheese',
15
+ 'macarons', 'miso_soup', 'mussels', 'nachos', 'omelette', 'onion_rings',
16
+ 'oysters', 'pad_thai', 'paella', 'pancakes', 'panna_cotta', 'peking_duck',
17
+ 'pho', 'pizza', 'pork_chop', 'poutine', 'prime_rib', 'pulled_pork_sandwich',
18
+ 'ramen', 'ravioli', 'red_velvet_cake', 'risotto', 'samosa', 'sashimi',
19
+ 'scallops', 'seaweed_salad', 'shrimp_and_grits', 'spaghetti_bolognese',
20
+ 'spaghetti_carbonara', 'spring_rolls', 'steak', 'strawberry_shortcake',
21
+ 'sushi', 'tacos', 'takoyaki', 'tiramisu', 'tuna_tartare', 'waffles'
22
+ ]
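Since the model's output index `i` is looked up directly in this list, its length and ordering matter. A minimal sanity check, using a three-entry stand-in for the full 101-class list, might look like:

```python
# Stand-in for FOOD101_CLASSES; the real list has 101 entries matching
# torchvision's Food101 label order.
classes = ['apple_pie', 'baby_back_ribs', 'baklava']

# Labels must be unique and sorted the same way the dataset sorts its
# class folders, or logits will map to the wrong names.
assert len(classes) == len(set(classes))
assert classes == sorted(classes)
```

Running the same two assertions against the real `FOOD101_CLASSES` (plus `len(FOOD101_CLASSES) == 101`) catches accidental edits to the list early.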
scripts/main.py ADDED
@@ -0,0 +1,229 @@
+ from prepare_data import Food101DataModule, CustomFood101, get_model_components
+ from models import EffNetV2_S, EffNetb2
+ import pytorch_lightning as pl
+ from pytorch_lightning import Trainer, LightningModule
+ from pytorch_lightning.loggers import CSVLogger
+ from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint
+ from typing import List, Optional
+ import matplotlib.pyplot as plt
+ import seaborn as sns
+ import numpy as np
+ import pandas as pd
+
+ DATA_DIR = "data"
+ MODEL_NAME = "EfficientNet_V2_S"
+ BATCH_SIZE = 32
+ SUBSET_FRACTION = 0.2  # Use a smaller subset for quick testing
+ CHECKPOINT_PATH = "checkpoints/best-model-epoch=22-val_acc=0.8541.ckpt"  # Path to your trained model checkpoint
+
+ def plot_confusion_matrix(cm: np.ndarray, class_names: List[str], figsize: tuple = (25, 25)):
+     """
+     Creates and saves a multi-class confusion matrix plot.
+
+     This function normalizes the confusion matrix to show prediction
+     percentages for each class, visualizes it as a heatmap, and saves
+     the resulting figure to a file.
+
+     Args:
+         cm (np.ndarray): The confusion matrix from torchmetrics or scikit-learn.
+         class_names (List[str]): A list of class names for the labels.
+         figsize (tuple, optional): The size of the figure. Defaults to (25, 25).
+     """
+     # 1. Normalize the confusion matrix to show percentages.
+     # Add a small epsilon to prevent division by zero.
+     cm_normalized = cm.astype('float') / (cm.sum(axis=1)[:, np.newaxis] + 1e-6)
+
+     # 2. Create a DataFrame so the heatmap carries class labels
+     df_cm = pd.DataFrame(cm_normalized, index=class_names, columns=class_names)
+
+     # 3. Create the plot
+     plt.figure(figsize=figsize)
+     heatmap = sns.heatmap(df_cm, annot=False, cmap='Blues')  # Annotations off for 101 classes
+
+     # 4. Format the plot
+     heatmap.yaxis.set_ticklabels(heatmap.yaxis.get_ticklabels(), rotation=0, ha='right', fontsize=8)
+     heatmap.xaxis.set_ticklabels(heatmap.xaxis.get_ticklabels(), rotation=45, ha='right', fontsize=8)
+
+     plt.ylabel('True Label')
+     plt.xlabel('Predicted Label')
+     plt.title('Normalized Confusion Matrix')
+     plt.tight_layout()
+
+     # 5. Save the figure and show the plot
+     plt.savefig('confusion_matrix.png', dpi=300)
+     print("Confusion matrix plot saved to confusion_matrix.png")
+     plt.show()
+
+ def run_training_session(
+     model_name: str = "EfficientNet_V2_S",
+     batch_size: int = 32,
+     data_dir: str = 'data',
+     subset_fraction: float = 1.0,
+     checkpoint_path: str = "checkpoints/",
+     lr: float = 1e-3,
+     weight_decay: float = 1e-4,
+     freeze_features: bool = True,
+     early_stopping_patience: int = 5,
+     max_epochs: int = 100,
+     accelerator: str = 'auto',
+     resume_from_checkpoint: Optional[str] = None
+ ) -> Trainer:
+     """
+     Sets up and runs a complete training session for a specified model.
+
+     This function handles the entire pipeline: data preparation, model
+     instantiation, logger and callback setup, and trainer execution.
+
+     Args:
+         model_name (str): The name of the model architecture to train.
+         batch_size (int): The number of samples per batch.
+         data_dir (str): The root directory for the dataset.
+         subset_fraction (float): The fraction of the dataset to use for training.
+         checkpoint_path (str): Directory to save model checkpoints.
+         lr (float): The learning rate for the optimizer.
+         weight_decay (float): The weight decay for the optimizer.
+         freeze_features (bool): Flag to control the fine-tuning strategy
+             (e.g., for two-stage training).
+         early_stopping_patience (int): Number of epochs with no improvement
+             after which training will be stopped.
+         max_epochs (int): The maximum number of epochs to train for.
+         accelerator (str): The hardware accelerator to use ('auto', 'cpu', 'gpu').
+         resume_from_checkpoint (Optional[str]): Path to a checkpoint file to
+             resume training from. Defaults to None.
+
+     Returns:
+         Trainer: The PyTorch Lightning Trainer object after fitting is complete.
+     """
+     # A registry to map model names to their actual classes
+     model_class_registry = {
+         "EfficientNet_V2_S": EffNetV2_S,
+         "EfficientNet_B2": EffNetb2,
+     }
+     if model_name not in model_class_registry:
+         raise ValueError(f"Model '{model_name}' is not a recognized class.")
+
+     # Get model-specific transforms
+     components = get_model_components(model_name)
+     train_transforms = components["train_transforms"]
+     val_transforms = components["val_transforms"]
+
+     # Set up the DataModule
+     food_datamodule = Food101DataModule(
+         data_dir=data_dir,
+         batch_size=batch_size,
+         train_transforms=train_transforms,
+         val_transforms=val_transforms,
+         subset_fraction=subset_fraction
+     )
+     food_datamodule.prepare_data()
+     food_datamodule.setup()
+
+     # Instantiate the model dynamically
+     model_class = model_class_registry[model_name]
+     model = model_class(
+         num_classes=len(food_datamodule.classes),
+         class_names=food_datamodule.classes,
+         lr=lr,
+         weight_decay=weight_decay,
+         freeze_features=freeze_features
+     )
+
+     # Set up logger and callbacks
+     logger = CSVLogger(save_dir="logs/", name=model_name)
+
+     early_stop_callback = EarlyStopping(
+         monitor="val_loss",
+         patience=early_stopping_patience,
+         mode="min"
+     )
+     best_model_checkpoint = ModelCheckpoint(
+         dirpath=checkpoint_path,
+         filename="best-model-{epoch:02d}-{val_acc:.4f}",
+         save_top_k=1,
+         monitor="val_acc",
+         mode="max"
+     )
+
+     callbacks = [early_stop_callback, best_model_checkpoint]
+
+     # Instantiate the Trainer
+     trainer = Trainer(
+         max_epochs=max_epochs,
+         accelerator=accelerator,
+         callbacks=callbacks,
+         logger=logger,
+     )
+
+     # Start training
+     trainer.fit(
+         model,
+         datamodule=food_datamodule,
+         ckpt_path=resume_from_checkpoint
+     )
+
+     return trainer
+
+ # ===================================================================
+ # Main Execution Block
+ # ===================================================================
+ if __name__ == "__main__":
+
+     # --- 1. DEFINE YOUR TRAINING CONFIGURATION HERE ---
+     config = {
+         "model_name": "EfficientNet_V2_S",
+         "batch_size": 32,
+         "lr": 1e-4,
+         "epochs": 50,
+         "subset_fraction": 1.0,  # Use 1.0 for the full dataset
+         "freeze_features": True,
+         "early_stopping_patience": 10
+     }
+
+     # --- 2. PRINT CONFIGURATION AND START TRAINING ---
+     print("--- Starting Training Session ---")
+     for key, value in config.items():
+         print(f"  {key}: {value}")
+     print("---------------------------------")
+
+     run_training_session(
+         model_name=config["model_name"],
+         batch_size=config["batch_size"],
+         lr=config["lr"],
+         max_epochs=config["epochs"],
+         subset_fraction=config["subset_fraction"],
+         freeze_features=config["freeze_features"],
+         early_stopping_patience=config["early_stopping_patience"]
+     )
+
+     print("\n--- Training Session Complete ---")
+
+     print("\n--- Starting Evaluation on Test Set ---")
+     print(f"Loading model from checkpoint: {CHECKPOINT_PATH}")
+
+     # Step 1: Set up the DataModule for the test set
+     components = get_model_components(MODEL_NAME)
+     val_transforms = components["val_transforms"]
+
+     datamodule = Food101DataModule(
+         data_dir=DATA_DIR,
+         batch_size=BATCH_SIZE,
+         val_transforms=val_transforms
+     )
+     # This prepares the test dataloader specifically
+     datamodule.setup(stage='test')
+
+     # Step 2: Load the trained model from the checkpoint file
+     model = EffNetV2_S.load_from_checkpoint(CHECKPOINT_PATH)
+     model.class_names = datamodule.classes
+     model.eval()  # Set the model to evaluation mode
+
+     # Step 3: Create a Trainer and run the test
+     trainer = pl.Trainer(accelerator='auto')
+
+     # This call runs the test_step for every batch and triggers the
+     # on_test_end hook in the model, which computes the confusion matrix.
+     trainer.test(model, datamodule=datamodule)
+
+     print("\nEvaluation complete.")
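The `plot_confusion_matrix` helper in `scripts/main.py` row-normalizes the raw counts before plotting, adding a small epsilon so a class with no test samples does not cause a division by zero. A self-contained sketch of just that normalization step on a toy 3-class matrix (the counts are illustrative):

```python
import numpy as np

# Toy 3-class confusion matrix: rows = true labels, columns = predictions.
cm = np.array([[8, 1, 1],
               [2, 6, 2],
               [0, 0, 0]])  # last class unseen -> row sum of zero

# Row-normalize with a small epsilon, exactly as in plot_confusion_matrix,
# so the all-zero row yields zeros instead of NaNs.
cm_normalized = cm.astype('float') / (cm.sum(axis=1)[:, np.newaxis] + 1e-6)

print(cm_normalized[0])  # roughly [0.8, 0.1, 0.1]
print(cm_normalized[2])  # all zeros, no division-by-zero warning
```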
scripts/models.py ADDED
@@ -0,0 +1,266 @@
+ import torch
+ import torchvision
+ import pytorch_lightning as pl
+ from torch import nn
+ from torchmetrics.classification import Accuracy, F1Score, ConfusionMatrix
+
+ class EffNetV2_S(pl.LightningModule):
+     """A PyTorch Lightning Module for fine-tuning EfficientNetV2-S.
+
+     This module encapsulates the EfficientNetV2-S model and provides a flexible
+     fine-tuning strategy. It can be configured for Stage 1 (training only the
+     classifier and later feature blocks) or Stage 2 (training the entire model).
+
+     Args:
+         lr (float, optional): The learning rate. Defaults to 1e-3.
+         weight_decay (float, optional): Weight decay for the optimizer. Defaults to 1e-4.
+         num_classes (int, optional): The number of output classes. Defaults to 101.
+         class_names (list, optional): A list of class names for logging. Defaults to None.
+         freeze_features (bool, optional): If True, freezes the backbone and unfreezes
+             only the later blocks (Stage 1). If False, all features are trainable
+             (Stage 2). Defaults to True.
+         unfreeze_from_block (int, optional): Which feature block to start unfreezing
+             from. Used only if freeze_features is True. Defaults to -3 (last 3 blocks).
+     """
+
+     def __init__(
+         self,
+         lr: float = 1e-3,
+         weight_decay: float = 1e-4,
+         num_classes: int = 101,
+         class_names: list = None,
+         freeze_features: bool = True,   # True = Stage 1, False = Stage 2
+         unfreeze_from_block: int = -3   # Only used if freeze_features=True
+     ):
+         super().__init__()
+         self.save_hyperparameters()
+         self.class_names = class_names if class_names else [str(i) for i in range(num_classes)]
+
+         # Load pretrained weights
+         weights = torchvision.models.EfficientNet_V2_S_Weights.DEFAULT
+         self.model = torchvision.models.efficientnet_v2_s(weights=weights)
+
+         # ---- Freezing strategy ----
+         if freeze_features:
+             # Stage 1: freeze everything first...
+             for param in self.model.parameters():
+                 param.requires_grad = False
+             # ...then unfreeze from a specific block (default: last 3 blocks)
+             for param in self.model.features[unfreeze_from_block:].parameters():
+                 param.requires_grad = True
+         else:
+             # Stage 2: unfreeze everything
+             for param in self.model.parameters():
+                 param.requires_grad = True
+
+         # Classifier head
+         self.model.classifier = nn.Sequential(
+             nn.Dropout(p=0.2, inplace=True),
+             nn.Linear(in_features=1280, out_features=self.hparams.num_classes, bias=True)
+         )
+
+         # Loss & metrics
+         self.loss_fn = nn.CrossEntropyLoss(label_smoothing=0.1)
+         self.train_accuracy = Accuracy(task="multiclass", num_classes=self.hparams.num_classes)
+         self.val_accuracy = Accuracy(task="multiclass", num_classes=self.hparams.num_classes)
+         self.train_f1 = F1Score(task="multiclass", num_classes=self.hparams.num_classes, average='macro')
+         self.val_f1 = F1Score(task="multiclass", num_classes=self.hparams.num_classes, average='macro')
+         self.val_conf_matrix = ConfusionMatrix(task="multiclass", num_classes=self.hparams.num_classes)
+         self.test_conf_matrix = ConfusionMatrix(task="multiclass", num_classes=self.hparams.num_classes)
+
+     def forward(self, x):
+         return self.model(x)
+
+     def training_step(self, batch, batch_idx):
+         x, y = batch
+         logits = self(x)
+         loss = self.loss_fn(logits, y)
+         self.train_accuracy(logits, y)
+         self.train_f1(logits, y)
+         self.log('train_loss', loss, on_step=False, on_epoch=True, prog_bar=True)
+         self.log('train_acc', self.train_accuracy, on_step=False, on_epoch=True, prog_bar=True)
+         self.log('train_f1', self.train_f1, on_step=False, on_epoch=True, prog_bar=True)
+         return loss
+
+     def validation_step(self, batch, batch_idx):
+         x, y = batch
+         logits = self(x)
+         loss = self.loss_fn(logits, y)
+         self.val_accuracy(logits, y)
+         self.val_f1(logits, y)
+         self.log('val_loss', loss, prog_bar=True)
+         self.log('val_acc', self.val_accuracy, prog_bar=True)
+         self.log('val_f1', self.val_f1, prog_bar=True)
+         self.val_conf_matrix.update(logits, y)
+
+     def on_validation_epoch_end(self):
+         cm = self.val_conf_matrix.compute()
+         # Add a small epsilon to the denominator for numerical stability.
+         per_class_acc = cm.diag() / (cm.sum(dim=1) + 1e-6)
+         print("\n--- Per-Class Validation Accuracy ---")
+         for i, acc in enumerate(per_class_acc):
+             self.log(f'val_acc/{self.class_names[i]}', acc.item(), on_epoch=True)
+             print(f"{self.class_names[i]:<20}: {acc.item():.4f}")
+         print("------------------------------------")
+         self.val_conf_matrix.reset()
+
+     def test_step(self, batch, batch_idx):
+         x, y = batch
+         logits = self(x)
+         self.test_conf_matrix.update(logits, y)
+
+     def on_test_end(self):
+         cm = self.test_conf_matrix.compute()
+         print("\nFinal test confusion matrix computed.")
+         # Plotting is delegated to the caller, e.g.:
+         # plot_confusion_matrix(cm.cpu().numpy(), class_names=self.class_names)
+         self.test_conf_matrix.reset()
+
+     def configure_optimizers(self):
+         optimizer = torch.optim.Adam(
+             self.parameters(),
+             lr=self.hparams.lr,
+             weight_decay=self.hparams.weight_decay
+         )
+         scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
+             optimizer,
+             T_max=self.trainer.max_epochs,
+             eta_min=1e-6
+         )
+         return {"optimizer": optimizer, "lr_scheduler": {"scheduler": scheduler, "interval": "epoch"}}
+
+ class EffNetb2(pl.LightningModule):
+     """A PyTorch Lightning Module for fine-tuning EfficientNet-B2.
+
+     This module encapsulates the EfficientNet-B2 model and provides a flexible
+     fine-tuning strategy. It can be configured for Stage 1 (training only the
+     classifier and later feature blocks) or Stage 2 (training the entire model).
+
+     Args:
+         lr (float, optional): The learning rate. Defaults to 1e-3.
+         weight_decay (float, optional): Weight decay for the optimizer. Defaults to 1e-4.
+         num_classes (int, optional): The number of output classes. Defaults to 101.
+         class_names (list, optional): A list of class names for logging. Defaults to None.
+         freeze_features (bool, optional): If True, freezes the backbone and unfreezes
+             only the later blocks (Stage 1). If False, all features are trainable
+             (Stage 2). Defaults to True.
+         unfreeze_from_block (int, optional): Which feature block to start unfreezing
+             from. Used only if freeze_features is True. Defaults to -3 (last 3 blocks).
+     """
+
+     def __init__(
+         self,
+         lr: float = 1e-3,
+         weight_decay: float = 1e-4,
+         num_classes: int = 101,
+         class_names: list = None,
+         freeze_features: bool = True,
+         unfreeze_from_block: int = -3
+     ):
+         super().__init__()
+         self.save_hyperparameters()
+         self.class_names = class_names if class_names is not None else [str(i) for i in range(num_classes)]
+
+         # Model setup
+         weights = torchvision.models.EfficientNet_B2_Weights.DEFAULT
+         self.model = torchvision.models.efficientnet_b2(weights=weights)
+
+         # --- Flexible Freezing Strategy ---
+         if self.hparams.freeze_features:
+             # Stage 1: freeze everything first...
+             for param in self.model.parameters():
+                 param.requires_grad = False
+             # ...then unfreeze from a specific block (default: last 3 blocks)
+             for param in self.model.features[self.hparams.unfreeze_from_block:].parameters():
+                 param.requires_grad = True
+         else:
+             # Stage 2: unfreeze everything
+             for param in self.model.parameters():
+                 param.requires_grad = True
+
+         # Classifier head
+         self.model.classifier = nn.Sequential(
+             nn.Dropout(p=0.3, inplace=True),
+             nn.Linear(in_features=1408, out_features=self.hparams.num_classes)
+         )
+
+         # Loss & metrics
+         self.loss_fn = nn.CrossEntropyLoss(label_smoothing=0.1)
+         self.train_accuracy = Accuracy(task="multiclass", num_classes=self.hparams.num_classes)
+         self.val_accuracy = Accuracy(task="multiclass", num_classes=self.hparams.num_classes)
+         self.train_f1 = F1Score(task="multiclass", num_classes=self.hparams.num_classes, average='macro')
+         self.val_f1 = F1Score(task="multiclass", num_classes=self.hparams.num_classes, average='macro')
+         self.val_conf_matrix = ConfusionMatrix(task="multiclass", num_classes=self.hparams.num_classes)
+         self.test_conf_matrix = ConfusionMatrix(task="multiclass", num_classes=self.hparams.num_classes)
+
+     def forward(self, x):
+         return self.model(x)
+
+     def training_step(self, batch, batch_idx):
+         x, y = batch
+         logits = self(x)
+         loss = self.loss_fn(logits, y)
+         self.train_accuracy(logits, y)
+         self.train_f1(logits, y)
+         self.log('train_loss', loss, on_step=False, on_epoch=True, prog_bar=True)
+         self.log('train_acc', self.train_accuracy, on_step=False, on_epoch=True, prog_bar=True)
+         self.log('train_f1', self.train_f1, on_step=False, on_epoch=True, prog_bar=True)
+         return loss
+
+     def validation_step(self, batch, batch_idx):
+         x, y = batch
+         logits = self(x)
+         loss = self.loss_fn(logits, y)
+         self.val_accuracy(logits, y)
+         self.val_f1(logits, y)
+         self.log('val_loss', loss, prog_bar=True)
+         self.log('val_acc', self.val_accuracy, prog_bar=True)
+         self.log('val_f1', self.val_f1, prog_bar=True)
+         self.val_conf_matrix.update(logits, y)
+
+     def on_validation_epoch_end(self):
+         cm = self.val_conf_matrix.compute()
+
+         # Add a small epsilon (1e-6) to the denominator for numerical stability.
+         per_class_acc = cm.diag() / (cm.sum(dim=1) + 1e-6)
+
+         print("\n--- Per-Class Validation Accuracy ---")
+         for i, acc in enumerate(per_class_acc):
+             class_name = self.class_names[i]
+             self.log(f'val_acc/{class_name}', acc.item(), on_epoch=True)
+             print(f"{class_name:<20}: {acc.item():.4f}")
+         print("------------------------------------")
+
+         self.val_conf_matrix.reset()
+
+     def test_step(self, batch, batch_idx):
+         x, y = batch
+         logits = self(x)
+         self.test_conf_matrix.update(logits, y)
+
+     def on_test_end(self):
+         cm = self.test_conf_matrix.compute()
+         print("\nFinal test confusion matrix computed.")
+         # Plotting is delegated to the caller, e.g.:
+         # plot_confusion_matrix(cm.cpu().numpy(), class_names=self.class_names)
+         self.test_conf_matrix.reset()
+
+     def configure_optimizers(self):
+         optimizer = torch.optim.Adam(
+             self.parameters(),
+             lr=self.hparams.lr,
+             weight_decay=self.hparams.weight_decay
+         )
+         scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
+             optimizer,
+             T_max=self.trainer.max_epochs,
+             eta_min=1e-6
+         )
+         return {
+             "optimizer": optimizer,
+             "lr_scheduler": {
+                 "scheduler": scheduler,
+                 "interval": "epoch",
+             },
+         }
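Both modules pair Adam with `CosineAnnealingLR`, which decays the learning rate from `lr` down to `eta_min=1e-6` over `max_epochs`. A pure-Python sketch of the schedule's closed form (the `t_max=50` horizon is illustrative; PyTorch's scheduler follows this same cosine curve):

```python
import math

def cosine_annealing_lr(epoch: int, base_lr: float = 1e-3,
                        eta_min: float = 1e-6, t_max: int = 50) -> float:
    """Closed form of cosine annealing: decays base_lr to eta_min over t_max epochs."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))

print(cosine_annealing_lr(0))   # starts at base_lr
print(cosine_annealing_lr(50))  # ends at eta_min
```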
scripts/prepare_data.py ADDED
@@ -0,0 +1,240 @@
+ import os
+ import random
+ from typing import Any, Dict
+
+ import pytorch_lightning as pl
+ import torchvision
+ from torch.utils.data import DataLoader, Dataset
+ from torchvision import transforms as T
+ from torchvision.datasets import Food101
+
+
+ def get_model_components(
+     model_name: str,
+     return_classifier: bool = False,
+     augmentation_level: str = "default"
+ ) -> Dict[str, Any]:
+     """
+     Retrieves pre-trained model components from torchvision.
+
+     This function fetches the appropriate weights and transforms for a given
+     model. It supports different levels of training data augmentation.
+
+     Args:
+         model_name (str): The name of the model to get components for.
+             Supported models include "EfficientNet_V2_S" and "EfficientNet_B2".
+         return_classifier (bool, optional): If True, the model's classifier
+             head is also returned. Defaults to False.
+         augmentation_level (str, optional): The level of data augmentation to use
+             for the training set. Can be "default" or "strong".
+             Defaults to "default".
+
+     Returns:
+         Dict[str, Any]: A dictionary containing the requested components.
+             Always includes 'train_transforms' and 'val_transforms'.
+             Includes 'classifier' if return_classifier is True.
+
+     Raises:
+         ValueError: If model_name or augmentation_level is not supported.
+     """
+     model_registry = {
+         "EfficientNet_V2_S": (
+             torchvision.models.efficientnet_v2_s,
+             torchvision.models.EfficientNet_V2_S_Weights.DEFAULT
+         ),
+         "EfficientNet_B2": (
+             torchvision.models.efficientnet_b2,
+             torchvision.models.EfficientNet_B2_Weights.DEFAULT
+         )
+     }
+
+     if model_name not in model_registry:
+         raise ValueError(f"Model '{model_name}' is not supported. "
+                          f"Supported models are: {list(model_registry.keys())}")
+
+     # 1. Look up the model constructor and its pretrained weights
+     model_class, weights = model_registry[model_name]
+     val_transforms = weights.transforms()
+
+     # 2. Create the training transforms based on the desired level
+     if augmentation_level == "default":
+         train_transforms = T.Compose([
+             T.TrivialAugmentWide(),
+             val_transforms  # val_transforms includes ToTensor and Normalize
+         ])
+     elif augmentation_level == "strong":
+         # Note: ToTensor() and Normalize() are not added here because they
+         # are already included inside the 'val_transforms' pipeline.
+         train_transforms = T.Compose([
+             T.RandomResizedCrop(size=val_transforms.crop_size, scale=(0.7, 1.0)),
+             T.RandomHorizontalFlip(p=0.5),
+             T.RandAugment(num_ops=2, magnitude=9),
+             # RandomErasing must be applied to a tensor, so it comes after
+             # val_transforms, which handles the PIL -> Tensor conversion.
+             val_transforms,
+             T.RandomErasing(p=0.25, scale=(0.02, 0.33), ratio=(0.3, 3.3), value='random')
+         ])
+     else:
+         raise ValueError(f"Augmentation level '{augmentation_level}' is not supported. "
+                          f"Choose from 'default' or 'strong'.")
+
+     # 3. Prepare the dictionary to be returned
+     components = {
+         "train_transforms": train_transforms,
+         "val_transforms": val_transforms
+     }
+
+     # 4. Optionally, instantiate the model to get the classifier
+     if return_classifier:
+         model = model_class(weights=weights)
+         components["classifier"] = model.classifier
+
+     return components
+
+ class CustomFood101(Dataset):
+     """A PyTorch Dataset for Food101 with conditional downloading and subset support.
+
+     This class wraps the torchvision Food101 dataset. It only downloads the data
+     if the specified directory doesn't already exist. It can also create a
+     reproducible, shuffled subset of the data for faster experimentation.
+
+     Args:
+         split (str): The dataset split, either "train" or "test".
+         transform (callable, optional): A function/transform to apply to the images.
+         data_dir (str, optional): The directory to store the data. Defaults to "data".
+         subset_fraction (float, optional): The fraction of the dataset to use.
+             Defaults to 0.1.
+     """
+
+     def __init__(self, split, transform=None, data_dir="data", subset_fraction: float = 0.1):
+         # Check if the dataset already exists before setting the download flag.
+         dataset_path = os.path.join(data_dir, "food-101")
+         should_download = not os.path.isdir(dataset_path)
+
+         # 1. Load the full dataset with the conditional download flag
+         self.full_dataset = Food101(root=data_dir, split=split, transform=transform, download=should_download)
+         self.classes = self.full_dataset.classes
+
+         # 2. Create a reproducible subset of indices
+         if subset_fraction < 1.0:
+             num_samples = int(len(self.full_dataset) * subset_fraction)
+             all_indices = list(range(len(self.full_dataset)))
+             # Shuffle with a fixed seed for reproducibility
+             random.Random(42).shuffle(all_indices)
+             self.indices = all_indices[:num_samples]
+         else:
+             self.indices = list(range(len(self.full_dataset)))
+
+     def __len__(self):
+         """Returns the total number of samples in the subset."""
+         return len(self.indices)
+
+     def __getitem__(self, idx):
+         """Fetches the sample for the given subset index and applies the transform."""
+         # Map the subset index to the actual index in the full dataset
+         original_idx = self.indices[idx]
+         image, label = self.full_dataset[original_idx]
+         return image, label
+
+ class Food101DataModule(pl.LightningDataModule):
+     """A PyTorch Lightning DataModule for the Food101 dataset.
+
+     This module encapsulates all data-related logic, including downloading,
+     processing, and creating DataLoaders for the training, validation, and
+     test sets. It uses the CustomFood101 dataset internally and allows for
+     controlling the fraction of data used in the training and validation splits.
+
+     Args:
+         data_dir (str, optional): Root directory for the data. Defaults to "data".
+         batch_size (int, optional): The batch size for DataLoaders. Defaults to 32.
+         num_workers (int, optional): Number of workers for data loading. Defaults to 2.
+         train_transforms (callable, optional): Transformations for the training set.
+         val_transforms (callable, optional): Transformations for the validation/test set.
+         subset_fraction (float, optional): The fraction of data to use for training
+             and validation. Defaults to 0.5.
+     """
+     def __init__(self, data_dir="data", batch_size=32, num_workers=2,
+                  train_transforms=None, val_transforms=None, subset_fraction: float = 0.5):
+         super().__init__()
+         self.data_dir = data_dir
+         self.batch_size = batch_size
+         self.num_workers = num_workers
+         self.train_transforms = train_transforms
+         self.val_transforms = val_transforms
+         self.subset_fraction = subset_fraction
+
+         self.classes = []
+
+     def prepare_data(self):
+         """Downloads data if needed."""
+         CustomFood101(split='train', data_dir=self.data_dir)
+         CustomFood101(split='test', data_dir=self.data_dir)
+
+     def setup(self, stage=None):
+         """Assigns datasets, passing the subset_fraction."""
+         if stage == 'fit' or stage is None:
+             self.train_dataset = CustomFood101(split='train', transform=self.train_transforms,
+                                                data_dir=self.data_dir, subset_fraction=self.subset_fraction)
+             self.val_dataset = CustomFood101(split='test', transform=self.val_transforms,
+                                              data_dir=self.data_dir, subset_fraction=self.subset_fraction)
+             self.classes = self.train_dataset.classes
+
+         if stage == 'test' or stage is None:
+             self.test_dataset = CustomFood101(split='test', transform=self.val_transforms,
+                                               data_dir=self.data_dir, subset_fraction=1.0)  # Use the full test set
+             if not self.classes:
+                 self.classes = self.test_dataset.classes
+
+     def train_dataloader(self):
+         return DataLoader(self.train_dataset, batch_size=self.batch_size, shuffle=True, num_workers=self.num_workers)
+
+     def val_dataloader(self):
+         return DataLoader(self.val_dataset, batch_size=self.batch_size, shuffle=False, num_workers=self.num_workers)
+
+     def test_dataloader(self):
+         return DataLoader(self.test_dataset, batch_size=self.batch_size, shuffle=False, num_workers=self.num_workers)
+
+
+ if __name__ == "__main__":
+     # Define configuration for the script
+     DATA_DIR = "data"
+     MODEL_NAME = "EfficientNet_V2_S"
+     BATCH_SIZE = 32
+
+     print(f"Running data preparation script for model: {MODEL_NAME}")
+
+     # 1. Get model-specific transforms
+     components = get_model_components(MODEL_NAME)
+     train_transforms = components["train_transforms"]
+     val_transforms = components["val_transforms"]
+
+     # 2. Instantiate the DataModule
+     datamodule = Food101DataModule(
+         data_dir=DATA_DIR,
+         batch_size=BATCH_SIZE,
+         train_transforms=train_transforms,
+         val_transforms=val_transforms,
+         subset_fraction=0.1  # Use a small subset for quick verification
+     )
+
+     # 3. Trigger download and setup
+     datamodule.prepare_data()
+     datamodule.setup(stage='fit')
+
+     # 4. (Optional) Verification step
+     print("\n--- Verifying Dataloader ---")
+     # Get one batch from the training dataloader
+     train_dl = datamodule.train_dataloader()
+     images, labels = next(iter(train_dl))
+
+     print(f"Number of classes: {len(datamodule.classes)}")
+     print(f"Image batch shape: {images.shape}")
+     print(f"Label batch shape: {labels.shape}")
+     print("--- Verification Complete ---")
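`CustomFood101` makes its subsets reproducible by shuffling all indices with a fixed-seed `random.Random(42)` before slicing. The core idea in isolation, as a self-contained sketch (the size below is the Food-101 train split, 101 classes x 750 images):

```python
import random

def reproducible_subset(dataset_size: int, fraction: float, seed: int = 42) -> list:
    """Shuffle all indices with a fixed seed and keep the first `fraction` of them."""
    indices = list(range(dataset_size))
    random.Random(seed).shuffle(indices)  # fixed seed -> same order on every run
    return indices[:int(dataset_size * fraction)]

subset_a = reproducible_subset(dataset_size=75750, fraction=0.1)
subset_b = reproducible_subset(dataset_size=75750, fraction=0.1)
print(len(subset_a))         # 7575
print(subset_a == subset_b)  # True: the subset is identical across runs
```

Using a seeded `random.Random` instance rather than the module-level `random.shuffle` keeps the subset stable even if other code seeds or consumes the global RNG.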