Create README.md
README.md
ADDED
@@ -0,0 +1,71 @@
---
license: mit
tags:
- vision
- image-classification
- pytorch
- efficientnet
datasets:
- Shad0wKillar/pizza_steak_sushi
metrics:
- accuracy
---
# EfficientNet-B7 Pizza/Steak/Sushi Classifier

I fine-tuned a pre-trained EfficientNet-B7 model to classify images into three categories: pizza, steak, and sushi.

## Model Details
* **Architecture:** `torchvision.models.efficientnet_b7`
* **Weights:** `EfficientNet_B7_Weights.DEFAULT`
* **Modifications:** I froze all the base feature layers and replaced the classifier head with a `Dropout(p=0.2)` layer followed by a `Linear(2560, 3)` layer, one output per class.

## Training Procedure
I trained the model for 10 epochs using the Adam optimizer.

* **Batch Size:** 32
* **Learning Rate:** 0.001
* **Loss Function:** CrossEntropyLoss
* **Transforms:** I used the preprocessing transforms bundled with the default EfficientNet-B7 weights (`weights.transforms()`).
* **Hardware:** Trained on `cuda` when available, with the manual seed set to 37 for reproducibility.

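
The training settings above could be wired together roughly like this. It is a sketch under stated assumptions: a tiny linear model and one batch of random tensors stand in for EfficientNet-B7 and the real images so the snippet runs anywhere, the loop runs 10 steps on that one batch instead of 10 full epochs, and `train_step` is a hypothetical helper name.

```python
import torch
from torch import nn

torch.manual_seed(37)  # seed from the list above
device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-ins (assumption): a tiny model and one fake batch of 32 images.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 3)).to(device)
images = torch.randn(32, 3, 8, 8)
labels = torch.randint(0, 3, (32,))

loss_fn = nn.CrossEntropyLoss()
# Hand Adam only the parameters that still require gradients.
trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable_params, lr=0.001)


def train_step(model, images, labels):
    """Run one optimisation step and return the batch loss."""
    model.train()
    images, labels = images.to(device), labels.to(device)
    logits = model(images)
    loss = loss_fn(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


losses = [train_step(model, images, labels) for _ in range(10)]
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

With a frozen backbone, filtering on `requires_grad` keeps the optimizer state limited to the classifier head.
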
## Dataset
I used a 20% subset of a pizza, steak, and sushi dataset. The data was split into `train` and `test` directories.

## Evaluation Results

### Accuracy and Loss Curves
Over the 10 epochs, both the training and test loss decreased steadily, with the test loss ending near 0.15. Test accuracy stayed above training accuracy throughout and finished near 99%.



### Confusion Matrix
The model classified almost every test image correctly across all three classes, with a single error (149 of 150 correct, about 99.3%):
* **Pizza:** 46 correct, 0 misclassified.
* **Steak:** 57 correct, 0 misclassified as pizza, 1 misclassified as sushi.
* **Sushi:** 46 correct, 0 misclassified.



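
As a worked check, the counts listed for the confusion matrix can be turned into accuracy, recall, and precision with plain Python. Only the counts come from this README; the variable names are illustrative.

```python
# Rows are true classes, columns are predicted classes (counts from the list above).
classes = ["pizza", "steak", "sushi"]
confusion = [
    [46, 0, 0],   # true pizza
    [0, 57, 1],   # true steak: one image predicted as sushi
    [0, 0, 46],   # true sushi
]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(classes)))
accuracy = correct / total

# Recall: correct predictions divided by all images of that true class.
recall = {c: confusion[i][i] / sum(confusion[i]) for i, c in enumerate(classes)}
# Precision: correct predictions divided by all predictions of that class.
precision = {
    c: confusion[i][i] / sum(row[i] for row in confusion)
    for i, c in enumerate(classes)
}

print(f"accuracy = {correct}/{total} = {accuracy:.4f}")  # accuracy = 149/150 = 0.9933
print(f"steak recall = {recall['steak']:.4f}")            # steak recall = 0.9828
print(f"sushi precision = {precision['sushi']:.4f}")      # sushi precision = 0.9787
```

The one steak image predicted as sushi lowers steak recall and sushi precision slightly; every other per-class value is 1.0.
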
### Most Confident Wrong Predictions
The model made only one mistake on the entire test set: it predicted sushi for a steak image, with a relatively low confidence of 0.56. That single instance is plotted below.



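
A ranking of "most confident wrong predictions" can be computed from the model's logits along these lines. This is a sketch, not the author's plotting code: `most_confident_wrong` is a hypothetical helper, the logits are made up for illustration, and the class order 0=pizza, 1=steak, 2=sushi is an assumption.

```python
import torch


def most_confident_wrong(logits, labels, top_k=5):
    """Return (index, predicted class, confidence) for the most confident errors."""
    probs = torch.softmax(logits, dim=1)
    confidence, preds = probs.max(dim=1)
    wrong = (preds != labels).nonzero(as_tuple=True)[0]
    # Sort the wrong predictions by confidence, highest first.
    order = confidence[wrong].argsort(descending=True)[:top_k]
    return [(int(i), int(preds[i]), float(confidence[i])) for i in wrong[order]]


# Made-up logits for four images (not outputs of the real model).
logits = torch.tensor([
    [4.0, 0.5, 0.1],   # confident pizza
    [0.2, 1.0, 1.4],   # leans sushi
    [0.1, 3.0, 0.3],   # confident steak
    [0.3, 0.2, 2.5],   # confident sushi
])
labels = torch.tensor([0, 1, 1, 2])  # the second image is really steak

errors = most_confident_wrong(logits, labels)
print(errors)
```

The confidence here is the softmax probability of the predicted class, so a value near 0.5 in a three-class problem indicates an uncertain prediction.
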
## How to use
```python
import torch
import torchvision

# Build the EfficientNet-B7 architecture with the default pretrained weights
weights = torchvision.models.EfficientNet_B7_Weights.DEFAULT
model = torchvision.models.efficientnet_b7(weights=weights)

# Replace the classifier head with a 3-class output layer
model.classifier = torch.nn.Sequential(
    torch.nn.Dropout(p=0.2, inplace=True),
    torch.nn.Linear(in_features=2560, out_features=3, bias=True),
)

# Load the fine-tuned weights on CPU and switch to inference mode
model.load_state_dict(torch.load("EfficientNet_B7_20percent.pth", map_location="cpu"))
model.eval()
```