---
license: apache-2.0
tags:
- image-classification
- cnn
- pytorch
- cat-vs-dog
- deep-learning
library_name: pytorch
datasets:
- custom
metrics:
- name: accuracy
  type: accuracy
  value: 90.56
model_name: Cat vs Dog Classifier
pipeline_tag: image-classification
---

---

# 🐾 Cat vs Dog Classifier

This model is a deep convolutional neural network (CNN) built with PyTorch that classifies images as either cats or dogs. It was trained on a labeled dataset of cat and dog images resized to 224×224 pixels, using data augmentation and regularization (batch normalization, dropout, and weight decay) to improve generalization. It reaches 90.56% accuracy on the test set.

## 🧠 Model Overview

The classifier consists of four convolutional blocks followed by three fully connected layers. Each convolutional block contains a convolutional layer, batch normalization, ReLU activation, and max pooling. The first two fully connected layers use batch normalization and dropout for regularization; the third produces the two class outputs.

### Convolutional Feature Extractor

- **Conv Block 1**:
  - Input channels: 3 (RGB)
  - Output channels: 64
  - Kernel size: 3 × 3
  - BatchNorm + ReLU + MaxPool (2 × 2)

- **Conv Block 2**:
  - Input channels: 64
  - Output channels: 128
  - Kernel size: 3 × 3
  - BatchNorm + ReLU + MaxPool (2 × 2)

- **Conv Block 3**:
  - Input channels: 128
  - Output channels: 256
  - Kernel size: 3 × 3
  - BatchNorm + ReLU + MaxPool (2 × 2)

- **Conv Block 4**:
  - Input channels: 256
  - Output channels: 512
  - Kernel size: 3 × 3
  - BatchNorm + ReLU + MaxPool (2 × 2)

After these blocks, the feature map is flattened to a 1D vector for classification.

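The length of the flattened vector depends on the convolution padding, which this card does not state. The following is a minimal sketch of the arithmetic, assuming stride-1 convolutions and a pooling stride of 2; the function names are hypothetical and exist only for illustration.

```python
def conv_block_output_size(size: int, kernel: int = 3, padding: int = 1, pool: int = 2) -> int:
    """Spatial size after one stride-1 conv followed by a max pool with stride `pool`."""
    after_conv = size + 2 * padding - kernel + 1  # standard stride-1 convolution formula
    return after_conv // pool                     # max pooling floors odd sizes


def flattened_features(input_size: int = 224, blocks: int = 4,
                       channels: int = 512, padding: int = 1) -> int:
    """Length of the flattened vector after `blocks` conv blocks."""
    size = input_size
    for _ in range(blocks):
        size = conv_block_output_size(size, padding=padding)
    return channels * size * size


# Assumption: padding=1 ("same" convolutions) -> 14 x 14 feature map
print(flattened_features(padding=1))  # 512 * 14 * 14 = 100352
# Assumption: no padding -> 12 x 12 feature map
print(flattened_features(padding=0))  # 512 * 12 * 12 = 73728
```

Either value could be the input width of the first fully connected layer; which one applies depends on the padding used in training.
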
### Fully Connected Classifier

- **FC Layer 1**:
  - Input: Flattened feature vector
  - Output: 512 units
  - BatchNorm + ReLU + Dropout (p=0.5)

- **FC Layer 2**:
  - Input: 512 units
  - Output: 256 units
  - BatchNorm + ReLU + Dropout (p=0.5)

- **FC Layer 3**:
  - Input: 256 units
  - Output: 2 units (cat or dog)

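The architecture described above could be written in PyTorch roughly as follows. This is a sketch, not the released training code: the class name `CatDogCNN`, the helper `conv_block`, and the choice of `padding=1` (which gives a 512 × 14 × 14 feature map for 224 × 224 inputs) are assumptions, since the card does not specify them.

```python
import torch
from torch import nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Conv (3x3) -> BatchNorm -> ReLU -> MaxPool (2x2), as listed in the card."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),  # padding=1 is an assumption
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )


class CatDogCNN(nn.Module):  # hypothetical name
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            conv_block(3, 64),
            conv_block(64, 128),
            conv_block(128, 256),
            conv_block(256, 512),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(512 * 14 * 14, 512),  # 14x14 follows from the padding assumption
            nn.BatchNorm1d(512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(512, 256),
            nn.BatchNorm1d(256),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes),  # raw logits; CrossEntropyLoss applies softmax
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))


model = CatDogCNN().eval()
with torch.no_grad():
    logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 2])
```

The final layer returns logits rather than probabilities, which matches the use of `CrossEntropyLoss` during training.
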
## 🧪 Training Details

- **Framework**: PyTorch
- **Input size**: 224 × 224 RGB images
- **Optimizer**: Adam with learning rate 0.0001 and weight decay 1e-4
- **Loss Function**: CrossEntropyLoss
- **Scheduler**: ReduceLROnPlateau (monitors validation accuracy)
- **Epochs**: 40
- **Batch Size**: 32
- **Validation Split**: 20% of training data
- **Best Validation Accuracy**: 90.26%
- **Final Test Accuracy**: 90.56%

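The settings listed above might be wired together as in the sketch below. The tiny `nn.Linear` model, the random batches, and the fixed `val_accuracy` are placeholders so the snippet runs on its own; the scheduler's `factor` and `patience` are left at PyTorch defaults because the card does not report them. `mode="max"` is used because the monitored quantity is an accuracy.

```python
import torch
from torch import nn

model = nn.Linear(8, 2)  # placeholder standing in for the CNN

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
# mode="max": reduce the learning rate when validation accuracy stops increasing
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="max")

for epoch in range(3):  # the card reports 40 epochs; 3 keeps the sketch fast
    model.train()
    inputs = torch.randn(32, 8)            # placeholder batch of 32 samples
    targets = torch.randint(0, 2, (32,))   # placeholder labels (0 or 1)

    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()

    val_accuracy = 0.5  # placeholder; compute this on the 20% validation split
    scheduler.step(val_accuracy)
```

Passing the validation accuracy to `scheduler.step` is what ties the learning-rate schedule to the monitored metric; with `mode="min"` the scheduler would instead expect a loss.
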
## 🎨 Data Augmentation

To improve generalization, the training data was augmented with:

- Random horizontal flips
- Random rotations (±10 degrees)
- Color jitter (brightness and contrast)
- Normalization using ImageNet mean and standard deviation

## License

This model is licensed under the Apache 2.0 License. You are free to use, modify, and distribute it with proper attribution.

## Author

Created by Sathvik as part of a deep learning exploration project focused on image classification and CNN architecture optimization.