| --- |
| license: mit |
| tags: |
| - vision |
| - image-classification |
| - pytorch |
| - efficientnet |
| datasets: |
| - Shad0wKillar/pizza_steak_sushi |
| metrics: |
| - accuracy |
| --- |
| # EfficientNet-B7 Pizza/Steak/Sushi Classifier |
|
|
| I fine-tuned a pre-trained EfficientNet-B7 model to classify images into three categories: pizza, steak, and sushi. |
|
|
| ## Model Details |
| * **Architecture:** `torchvision.models.efficientnet_b7` |
| * **Weights:** `EfficientNet_B7_Weights.DEFAULT` |
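| * **Modifications:** I froze all the base feature layers and replaced the classifier head with a dropout layer (`p=0.2`) followed by a linear layer with 3 outputs, one per class. |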
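A minimal sketch of the freezing step, assuming the model exposes `features` and `classifier` attributes as torchvision's EfficientNet models do. The helper name `freeze_and_replace_head` is hypothetical and not part of the original training code; the head layout mirrors the snippet under "How to use".

```python
from torch import nn


def freeze_and_replace_head(model: nn.Module, in_features: int = 2560, num_classes: int = 3) -> nn.Module:
    """Freeze `model.features` and attach a fresh, trainable classifier head."""
    # Frozen parameters receive no gradient updates during training.
    for param in model.features.parameters():
        param.requires_grad = False

    # New head: dropout followed by a linear layer with one output per class.
    # Freshly created layers are trainable by default.
    model.classifier = nn.Sequential(
        nn.Dropout(p=0.2, inplace=True),
        nn.Linear(in_features=in_features, out_features=num_classes, bias=True),
    )
    return model
```

With torchvision installed, this could be applied to `torchvision.models.efficientnet_b7(weights=torchvision.models.EfficientNet_B7_Weights.DEFAULT)`, which downloads the pre-trained weights on first use.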
|
|
| ## Training Procedure |
| I trained the model for 10 epochs using the Adam optimizer. |
|
|
| * **Batch Size:** 32 |
| * **Learning Rate:** 0.001 |
| * **Loss Function:** CrossEntropyLoss |
| * **Transforms:** I used the automatic transforms provided by the default EfficientNet-B7 weights. |
| * **Hardware:** Trained on `cuda` when available, with the manual seed set to 37 for reproducibility. |
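The settings above could be wired together roughly as follows. This is a sketch under assumptions rather than the original training script: the function names `train_one_epoch` and `fit` are hypothetical, and the loop only tracks training loss and accuracy.

```python
import torch
from torch import nn

torch.manual_seed(37)  # seed value taken from the list above
device = "cuda" if torch.cuda.is_available() else "cpu"


def train_one_epoch(model, dataloader, loss_fn, optimizer, device):
    """Run one pass over `dataloader` and return (mean loss, accuracy)."""
    model.train()
    total_loss, correct, seen = 0.0, 0, 0
    for images, labels in dataloader:
        images, labels = images.to(device), labels.to(device)
        logits = model(images)
        loss = loss_fn(logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Weight the batch loss by batch size so the epoch mean is exact.
        total_loss += loss.item() * labels.size(0)
        correct += (logits.argmax(dim=1) == labels).sum().item()
        seen += labels.size(0)
    return total_loss / seen, correct / seen


def fit(model, train_loader, epochs=10, lr=0.001):
    """Train with Adam and CrossEntropyLoss; return per-epoch (loss, accuracy)."""
    model.to(device)
    loss_fn = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    history = []
    for _ in range(epochs):
        history.append(train_one_epoch(model, train_loader, loss_fn, optimizer, device))
    return history
```

Frozen parameters can stay in the optimizer's parameter list: they never receive gradients, so Adam skips them.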
|
|
| ## Dataset |
| I used a 20% subset of a pizza, steak, and sushi dataset. The data was split into `train` and `test` directories. |
|
|
| ## Evaluation Results |
|
|
| ### Accuracy and Loss Curves |
| Over the 10 epochs, the training and testing loss decreased steadily, with the testing loss ending near 0.15. The testing accuracy stayed above the training accuracy throughout, finishing near 99%. |
|
|
|  |
|
|
| ### Confusion Matrix |
| On the 150-image test set, the model made a single error across the three classes: |
| * **Pizza:** 46 correct, 0 misclassified. |
| * **Steak:** 57 correct, 0 misclassified as pizza, 1 misclassified as sushi. |
| * **Sushi:** 46 correct, 0 misclassified. |
|
|
|  |
|
|
| ### Most Confident Wrong Predictions |
| The model made only one mistake on the entire test set, which I plotted here: a steak dish predicted as sushi with a relatively low confidence of 0.56. |
|
|
|  |
|
|
| ## How to use |
| ```python |
| import torch |
| import torchvision |
| |
| # Load the architecture with the default pre-trained weights |
| weights = torchvision.models.EfficientNet_B7_Weights.DEFAULT |
| model = torchvision.models.efficientnet_b7(weights=weights) |
|
| # Replace the classifier head with a 3-class output layer |
| model.classifier = torch.nn.Sequential( |
|     torch.nn.Dropout(p=0.2, inplace=True), |
|     torch.nn.Linear(in_features=2560, out_features=3, bias=True), |
| ) |
|
| # Load the fine-tuned weights and switch to inference mode |
| model.load_state_dict(torch.load("EfficientNet_B7_20percent.pth", map_location="cpu")) |
| model.eval() |