Shad0wKillar/efficientnet-b1

Image Classification

Shad0wKillar committed about 1 month ago

Commit 67dc4b9 · verified · 1 parent: 6af4ed3

Update README.md

Files changed (1)

README.md +72 -3

@@ -1,3 +1,72 @@
----
-license: apache-2.0
----

+---
+license: mit
+tags:
+- vision
+- image-classification
+- pytorch
+- efficientnet
+datasets:
+- pizza-steak-sushi-20-percent
+metrics:
+- accuracy
+---
+# EfficientNet-B1 Pizza/Steak/Sushi Classifier
+I fine-tuned a pre-trained EfficientNet-B1 model to classify images into three categories: pizza, steak, and sushi.
+## Model Details
+* **Architecture:** `torchvision.models.efficientnet_b1`
+* **Weights:** `EfficientNet_B1_Weights.DEFAULT`
+* **Modifications:** I froze all base feature layers and replaced the classifier head with `Dropout(p=0.2)` followed by a `Linear` layer mapping 1280 input features to 3 output classes.
+## Training Procedure
+I trained the model for 10 epochs using the Adam optimizer.
+* **Batch Size:** 32
+* **Learning Rate:** 0.001
+* **Loss Function:** CrossEntropyLoss
+* **Transforms:** I used the automatic transforms provided by the default EfficientNet-B1 weights.
+* **Hardware:** Trained on `cuda` when available, with the manual seed set to 37 for reproducibility.
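One way to wire these settings together is sketched below. It is a hedged illustration rather than the author's script: `train_one_epoch` is a hypothetical helper name, and the small linear model and random tensors are stand-ins for EfficientNet-B1 and the image folders so that the example runs offline.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

SEED, BATCH_SIZE, LEARNING_RATE, EPOCHS = 37, 32, 0.001, 10

torch.manual_seed(SEED)
device = "cuda" if torch.cuda.is_available() else "cpu"


def train_one_epoch(model, loader, loss_fn, optimizer, device):
    """Run one pass over `loader` and return the mean batch loss."""
    model.train()
    total_loss = 0.0
    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)
        loss = loss_fn(model(inputs), targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    return total_loss / len(loader)


# Stand-in data and model (assumptions for illustration only).
features = torch.randn(64, 1280)
labels = torch.randint(0, 3, (64,))
loader = DataLoader(TensorDataset(features, labels), batch_size=BATCH_SIZE, shuffle=True)
model = torch.nn.Linear(1280, 3).to(device)

loss_fn = torch.nn.CrossEntropyLoss()
# Only parameters that still require gradients are handed to Adam.
optimizer = torch.optim.Adam(
    [p for p in model.parameters() if p.requires_grad], lr=LEARNING_RATE
)

losses = [train_one_epoch(model, loader, loss_fn, optimizer, device) for _ in range(EPOCHS)]
print(f"first epoch loss: {losses[0]:.3f}, last epoch loss: {losses[-1]:.3f}")
```

In a real run, the stand-in model would be the modified EfficientNet-B1 and the loader would read the `train` directory with the transforms returned by `weights.transforms()`.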
+## Dataset
+I used a 20% subset of a pizza, steak, and sushi dataset. The data was split into `train` and `test` directories.
+## Evaluation Results
+### Accuracy and Loss Curves
+Over the 10 epochs, both the training and testing loss decreased steadily, with the testing loss ending below 0.40. The testing accuracy exceeded the training accuracy early on and finished stable above 90%.
+![Loss and Accuracy Curves](plots/EfficientNet_B1_20percent.pth_curves.png)
+### Confusion Matrix
+On the test set, the model classifies 143 of 150 images correctly (about 95.3%), with few errors in any class:
+* **Pizza:** 45 correct, 0 misclassified as steak, 1 misclassified as sushi.
+* **Steak:** 56 correct, 0 misclassified as pizza, 2 misclassified as sushi.
+* **Sushi:** 42 correct, 3 misclassified as pizza, 1 misclassified as steak.
+![Confusion Matrix](plots/EfficientNet_B1_20percent.pth_confusion_matrix.png)
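The counts above are enough to recompute overall accuracy and per-class precision and recall. The sketch below does that arithmetic in plain Python. The variable names are illustrative, and it assumes each bullet describes one row of the confusion matrix (the true class), with columns ordered pizza, steak, sushi.

```python
classes = ["pizza", "steak", "sushi"]

# Rows are true classes, columns are predicted classes (pizza, steak, sushi).
confusion = [
    [45, 0, 1],   # true pizza
    [0, 56, 2],   # true steak
    [3, 1, 42],   # true sushi
]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(classes)))
accuracy = correct / total
print(f"accuracy: {correct}/{total} = {accuracy:.3f}")

recall = {}
precision = {}
for i, name in enumerate(classes):
    row_total = sum(confusion[i])                   # all images of this true class
    col_total = sum(row[i] for row in confusion)    # all predictions of this class
    recall[name] = confusion[i][i] / row_total
    precision[name] = confusion[i][i] / col_total
    print(f"{name}: precision={precision[name]:.3f}, recall={recall[name]:.3f}")
```

Under that reading, sushi has the lowest recall (42 of 46) and steak has the highest precision (56 of 57).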
+### Most Confident Wrong Predictions
+I plotted the instances where the model was highly confident but incorrect. The model occasionally struggled to distinguish close-up textures, for example predicting a steak dish as sushi with 0.82 confidence, or a sushi dish as pizza with 0.61 confidence.
+![Wrong Predictions](plots/EfficientNet_B1_20percent.pth_wrong_pred.png)
+## How to use
+```python
+import torch
+import torchvision
+# 1. Load the model architecture
+weights = torchvision.models.EfficientNet_B1_Weights.DEFAULT
+model = torchvision.models.efficientnet_b1(weights=weights)
+# 2. Modify the classifier to match the 3 classes
+model.classifier = torch.nn.Sequential(
+    torch.nn.Dropout(p=0.2, inplace=True),
+    torch.nn.Linear(in_features=1280, out_features=3, bias=True),
+)
+# 3. Load the fine-tuned weights (a state dict) onto the CPU
+model.load_state_dict(torch.load("EfficientNet_B1_20percent.pth", map_location="cpu"))
+# 4. Switch to inference mode (disables dropout)
+model.eval()