Update README.md
README.md
CHANGED
@@ -1,3 +1,72 @@
----
-license:
-
---
license: mit
tags:
- vision
- image-classification
- pytorch
- efficientnet
datasets:
- pizza-steak-sushi-20-percent
metrics:
- accuracy
---

# EfficientNet-B1 Pizza/Steak/Sushi Classifier

I fine-tuned a pre-trained EfficientNet-B1 model to classify images into three categories: pizza, steak, and sushi.

## Model Details
* **Architecture:** `torchvision.models.efficientnet_b1`
* **Weights:** `EfficientNet_B1_Weights.DEFAULT`
* **Modifications:** I froze all the base feature layers and replaced the classifier head with a `Dropout(p=0.2)` followed by a `Linear` layer mapping 1280 input features to 3 output features.

## Training Procedure
I trained the model for 10 epochs using the Adam optimizer.

* **Batch Size:** 32
* **Learning Rate:** 0.001
* **Loss Function:** `CrossEntropyLoss`
* **Transforms:** I used the automatic transforms provided by the default EfficientNet-B1 weights.
* **Hardware:** Trained on `cuda` when available, with the manual seed set to 37 for reproducibility.
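
A training setup with these hyperparameters might look like the sketch below. The `train_step` function, the random stand-in data, and the tiny `nn.Linear` model are all assumptions made so the example runs on its own; the actual run used the EfficientNet-B1 model and the image dataset.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train_step(model, loader, loss_fn, optimizer, device):
    """Run one training epoch and return (mean loss, accuracy)."""
    model.train()
    total_loss, correct, seen = 0.0, 0, 0
    for X, y in loader:
        X, y = X.to(device), y.to(device)
        logits = model(X)
        loss = loss_fn(logits, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * len(y)
        correct += (logits.argmax(dim=1) == y).sum().item()
        seen += len(y)
    return total_loss / seen, correct / seen


device = "cuda" if torch.cuda.is_available() else "cpu"
torch.manual_seed(37)

# Stand-in data and model (assumptions) so the sketch runs without the images.
features = torch.randn(64, 8)
labels = torch.randint(0, 3, (64,))
loader = DataLoader(TensorDataset(features, labels), batch_size=32, shuffle=True)
model = nn.Linear(8, 3).to(device)

loss_fn = nn.CrossEntropyLoss()
# Pass only trainable parameters, which matters when the backbone is frozen.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=0.001)

# 10 epochs, as in the training run described above.
history = [train_step(model, loader, loss_fn, optimizer, device) for _ in range(10)]
```

Each entry of `history` holds the mean loss and accuracy for one epoch, which is the kind of data the loss and accuracy curves are drawn from.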

## Dataset
I used a 20% subset of a pizza, steak, and sushi dataset. The data is split into `train` and `test` directories.

## Evaluation Results

### Accuracy and Loss Curves
Over the 10 epochs, both the training and testing loss decreased steadily, with the testing loss ending below 0.40. The testing accuracy exceeded the training accuracy early on and stabilized above 90% by the end of training.

![Loss and Accuracy Curves](loss_accuracy_curves.png)

### Confusion Matrix
The model classifies 143 of the 150 test images correctly (about 95.3%), with only a few errors in each class:
* **Pizza:** 45 correct, 0 misclassified as steak, 1 misclassified as sushi.
* **Steak:** 56 correct, 0 misclassified as pizza, 2 misclassified as sushi.
* **Sushi:** 42 correct, 3 misclassified as pizza, 1 misclassified as steak.

![Confusion Matrix](confusion_matrix.png)

### Most Confident Wrong Predictions
I plotted the predictions where the model was most confident but incorrect. The model occasionally struggled to distinguish close-up textures, for example predicting a steak dish as sushi with 0.82 confidence, or a sushi dish as pizza with 0.61 confidence.

![Most Confident Wrong](most_confident_wrong.png)

## How to use
```python
import torch
import torchvision

# 1. Load the model architecture
weights = torchvision.models.EfficientNet_B1_Weights.DEFAULT
model = torchvision.models.efficientnet_b1(weights=weights)

# 2. Modify the classifier to match the 3 classes
model.classifier = torch.nn.Sequential(
    torch.nn.Dropout(p=0.2, inplace=True),
    torch.nn.Linear(in_features=1280, out_features=3, bias=True),
)

# 3. Load the fine-tuned weights and switch to inference mode
model.load_state_dict(torch.load("EfficientNet_B1_20percent.pth", map_location="cpu"))
model.eval()
```