---
license: apache-2.0
tags:
  - image-classification
  - cnn
  - pytorch
  - cat-vs-dog
  - deep-learning
library_name: pytorch
datasets:
  - custom
metrics:
  - name: accuracy
    type: accuracy
    value: 90.56
model_name: Cat vs Dog Classifier
pipeline_tag: image-classification
---
# 🐾 Cat vs Dog Classifier

This model is a deep convolutional neural network (CNN) built with PyTorch that classifies images as either cats or dogs. It was trained on a labeled dataset of cat and dog images resized to 224×224 pixels, using data augmentation and regularization (batch normalization, dropout, and weight decay) to improve generalization. The model achieves 90.56% accuracy on the test set.

## 🧠 Model Overview

The classifier consists of four convolutional blocks followed by three fully connected layers. Each convolutional block applies a 3 × 3 convolution, batch normalization, ReLU activation, and 2 × 2 max pooling. The first two fully connected layers add batch normalization and dropout for regularization; the third produces the two class outputs.

### Convolutional Feature Extractor

- **Conv Block 1**:  
  - Input channels: 3 (RGB)  
  - Output channels: 64  
  - Kernel size: 3 × 3  
  - BatchNorm + ReLU + MaxPool (2 × 2)

- **Conv Block 2**:  
  - Input channels: 64  
  - Output channels: 128  
  - Kernel size: 3 × 3  
  - BatchNorm + ReLU + MaxPool (2 × 2)

- **Conv Block 3**:  
  - Input channels: 128  
  - Output channels: 256  
  - Kernel size: 3 × 3  
  - BatchNorm + ReLU + MaxPool (2 × 2)

- **Conv Block 4**:  
  - Input channels: 256  
  - Output channels: 512  
  - Kernel size: 3 × 3  
  - BatchNorm + ReLU + MaxPool (2 × 2)

After these blocks, the feature map is flattened to a 1D vector for classification.
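
The four blocks above can be sketched roughly as follows. This is a minimal illustration, not the released implementation: the `conv_block` helper and the `features` name are hypothetical, and `padding=1` is an assumption, since the padding is not stated here. Under that assumption each block halves the spatial size (224 → 112 → 56 → 28 → 14), so the flattened vector has 512 × 14 × 14 = 100,352 elements.

```python
import torch
from torch import nn


def conv_block(in_channels: int, out_channels: int) -> nn.Sequential:
    """Conv (3x3) -> BatchNorm -> ReLU -> MaxPool (2x2)."""
    return nn.Sequential(
        # padding=1 is an assumption; it keeps height and width unchanged
        # before pooling.
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=2),
    )


# Channel progression taken from the block list: 3 -> 64 -> 128 -> 256 -> 512.
features = nn.Sequential(
    conv_block(3, 64),
    conv_block(64, 128),
    conv_block(128, 256),
    conv_block(256, 512),
    nn.Flatten(),
)

# Push one dummy 224x224 RGB image through to check the flattened size.
features.eval()
with torch.no_grad():
    flat = features(torch.zeros(1, 3, 224, 224))
print(flat.shape)  # torch.Size([1, 100352])
```

With a different padding choice the flattened size changes, so the input width of the first fully connected layer would need to match whatever the actual model uses.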

### Fully Connected Classifier

- **FC Layer 1**:  
  - Input: Flattened feature vector  
  - Output: 512 units  
  - BatchNorm + ReLU + Dropout (p=0.5)

- **FC Layer 2**:  
  - Input: 512 units  
  - Output: 256 units  
  - BatchNorm + ReLU + Dropout (p=0.5)

- **FC Layer 3**:  
  - Input: 256 units  
  - Output: 2 units (cat or dog)
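
A rough sketch of this classifier head is shown below. The layer sizes and dropout rates come from the list above; the input width `FLAT_FEATURES` is an assumption (512 × 14 × 14, which holds only if the convolutions preserve spatial size before pooling), and the variable names are illustrative.

```python
import torch
from torch import nn

# Assumed flattened size: 512 channels x 14 x 14 after four 2x2 poolings
# of a 224x224 input. Adjust to the real feature extractor's output.
FLAT_FEATURES = 512 * 14 * 14

classifier = nn.Sequential(
    # FC Layer 1
    nn.Linear(FLAT_FEATURES, 512),
    nn.BatchNorm1d(512),
    nn.ReLU(inplace=True),
    nn.Dropout(p=0.5),
    # FC Layer 2
    nn.Linear(512, 256),
    nn.BatchNorm1d(256),
    nn.ReLU(inplace=True),
    nn.Dropout(p=0.5),
    # FC Layer 3: raw logits for the two classes
    nn.Linear(256, 2),
)

# eval() disables dropout and uses BatchNorm running statistics.
classifier.eval()
with torch.no_grad():
    logits = classifier(torch.zeros(4, FLAT_FEATURES))
    probs = torch.softmax(logits, dim=1)  # per-class probabilities
print(logits.shape)  # torch.Size([4, 2])
```

The final layer returns unnormalized logits, which is what `CrossEntropyLoss` expects during training; softmax is applied only when probabilities are needed at inference time.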

## 🧪 Training Details

- **Framework**: PyTorch  
- **Input size**: 224 × 224 RGB images  
- **Optimizer**: Adam with learning rate 0.0001 and weight decay 1e-4  
- **Loss Function**: CrossEntropyLoss  
- **Scheduler**: ReduceLROnPlateau (monitors validation accuracy)  
- **Epochs**: 40  
- **Batch Size**: 32  
- **Validation Split**: 20% of training data  
- **Best Validation Accuracy**: 90.26%  
- **Final Test Accuracy**: 90.56%
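
The configuration above might be wired together roughly like this. It is a sketch under stated assumptions: the tiny `model` and the all-zero `full_train` dataset are stand-ins so the snippet runs on its own, the scheduler's `factor` and `patience` are left at PyTorch defaults because they are not given here, and `mode="max"` is inferred from the scheduler monitoring validation accuracy.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split

# Stand-ins so the sketch is self-contained; not the author's model or data.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 2))
full_train = TensorDataset(
    torch.zeros(10, 3, 224, 224), torch.zeros(10, dtype=torch.long)
)

# Hold out 20% of the training data for validation.
n_val = int(0.2 * len(full_train))
train_set, val_set = random_split(full_train, [len(full_train) - n_val, n_val])
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
# mode="max" because a higher validation accuracy is better.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="max")

# One training step, then one scheduler update, to show the call order.
inputs, targets = next(iter(train_loader))
optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
optimizer.step()

val_accuracy = 0.0  # placeholder; computed on val_set after each epoch
scheduler.step(val_accuracy)
```

In a full run, the training step would loop over every batch for 40 epochs, with `scheduler.step(val_accuracy)` called once per epoch after validation.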

## 🎨 Data Augmentation

To improve generalization, the training data was augmented with:

- Random horizontal flips  
- Random rotations (±10 degrees)  
- Color jitter (brightness and contrast)  

Images were also normalized using the ImageNet mean and standard deviation.

## 📄 License

This model is licensed under the Apache 2.0 License. You are free to use, modify, and distribute it, provided you include a copy of the license and retain the existing attribution notices.

## 🙋 Author

Created by Sathvik as part of a deep learning exploration project focused on image classification and CNN architecture optimization.