metadata
license: apache-2.0
tags:
  - image-classification
  - cnn
  - pytorch
  - cat-vs-dog
  - deep-learning
library_name: pytorch
datasets:
  - custom
metrics:
  - name: accuracy
    type: accuracy
    value: 90.56
model_name: Cat vs Dog Classifier
pipeline_tag: image-classification

🐾 Cat vs Dog Classifier

This model is a deep convolutional neural network (CNN) built with PyTorch to classify images as either cats or dogs. It was trained on a labeled dataset of cat and dog images resized to 224×224 pixels, with data augmentation and regularization (batch normalization, dropout, and weight decay) to improve generalization. The model achieves 90.56% accuracy on the test set.

🧠 Model Overview

The classifier consists of four convolutional blocks followed by three fully connected layers. Each convolutional block includes a convolutional layer, batch normalization, ReLU activation, and max pooling. The first two fully connected layers also use batch normalization and dropout for regularization; the third is the output layer.

Convolutional Feature Extractor

  • Conv Block 1:

    • Input channels: 3 (RGB)
    • Output channels: 64
    • Kernel size: 3 × 3
    • BatchNorm + ReLU + MaxPool (2 × 2)
  • Conv Block 2:

    • Input channels: 64
    • Output channels: 128
    • Kernel size: 3 × 3
    • BatchNorm + ReLU + MaxPool (2 × 2)
  • Conv Block 3:

    • Input channels: 128
    • Output channels: 256
    • Kernel size: 3 × 3
    • BatchNorm + ReLU + MaxPool (2 × 2)
  • Conv Block 4:

    • Input channels: 256
    • Output channels: 512
    • Kernel size: 3 × 3
    • BatchNorm + ReLU + MaxPool (2 × 2)

After these blocks, the feature map is flattened to a 1D vector for classification.
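The length of that flattened vector follows from the block layout above. The model card does not state the convolution padding or stride, so the sketch below assumes stride 1 and a `padding` argument that defaults to 1 (which preserves spatial size for a 3 × 3 kernel). The function names are hypothetical.

```python
def conv_block_output_size(size: int, kernel: int = 3, padding: int = 1, pool: int = 2) -> int:
    """Spatial size after one stride-1 convolution followed by one max pool."""
    after_conv = size + 2 * padding - kernel + 1  # stride-1 convolution
    return after_conv // pool                     # non-overlapping max pool rounds down


def flattened_features(input_size: int = 224, out_channels: int = 512,
                       num_blocks: int = 4, padding: int = 1) -> int:
    """Length of the flattened vector after `num_blocks` conv blocks."""
    size = input_size
    for _ in range(num_blocks):
        size = conv_block_output_size(size, padding=padding)
    return out_channels * size * size


# Assumed padding=1: 224 -> 112 -> 56 -> 28 -> 14, so 512 * 14 * 14
print(flattened_features())           # 100352
# Without padding the map shrinks to 12 x 12 instead
print(flattened_features(padding=0))  # 73728
```

Under the padding assumption, the first fully connected layer would take 100,352 inputs; if the actual model uses no padding, the figure would be 73,728 instead.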

Fully Connected Classifier

  • FC Layer 1:

    • Input: Flattened feature vector
    • Output: 512 units
    • BatchNorm + ReLU + Dropout (p=0.5)
  • FC Layer 2:

    • Input: 512 units
    • Output: 256 units
    • BatchNorm + ReLU + Dropout (p=0.5)
  • FC Layer 3:

    • Input: 256 units
    • Output: 2 units (cat or dog)
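The architecture described above could be expressed in PyTorch roughly as follows. This is a minimal sketch, not the author's code: the class name `CatDogCNN`, the helper `conv_block`, the `input_size` parameter, and the use of `padding=1` are all assumptions, since the model card lists only channel counts, kernel sizes, and layer order.

```python
import torch
from torch import nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Conv (3x3) -> BatchNorm -> ReLU -> MaxPool (2x2)."""
    return nn.Sequential(
        # padding=1 is an assumption; the model card does not state it
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=2),
    )


class CatDogCNN(nn.Module):  # hypothetical name
    def __init__(self, input_size: int = 224, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            conv_block(3, 64),
            conv_block(64, 128),
            conv_block(128, 256),
            conv_block(256, 512),
        )
        # Four 2x2 pools halve each side four times (224 -> 14 with padding=1)
        spatial = input_size // 16
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(512 * spatial * spatial, 512),
            nn.BatchNorm1d(512),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(512, 256),
            nn.BatchNorm1d(256),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(256, num_classes),  # raw logits for cat / dog
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))
```

Calling `CatDogCNN()` on a batch of shape `(N, 3, 224, 224)` returns logits of shape `(N, 2)`. The final layer emits raw logits because `CrossEntropyLoss` applies the softmax internally.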

🧪 Training Details

  • Framework: PyTorch
  • Input size: 224 × 224 RGB images
  • Optimizer: Adam with learning rate 0.0001 and weight decay 1e-4
  • Loss Function: CrossEntropyLoss
  • Scheduler: ReduceLROnPlateau (monitors validation accuracy)
  • Epochs: 40
  • Batch Size: 32
  • Validation Split: 20% of training data
  • Best Validation Accuracy: 90.26%
  • Final Test Accuracy: 90.56%
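The optimizer, loss, and scheduler settings listed above map onto PyTorch roughly as shown here. This is a sketch under assumptions: the `nn.Linear` model is only a placeholder for the CNN, `step_scheduler` is a hypothetical helper, and the scheduler's `factor` and `patience` are left at PyTorch defaults because the model card does not give them. `mode="max"` is used because the monitored quantity is validation accuracy, which should increase.

```python
import torch
from torch import nn

# Placeholder model so the snippet runs on its own; substitute the real CNN
model = nn.Linear(8, 2)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)

# mode="max": reduce the learning rate when validation accuracy stops rising.
# factor and patience are PyTorch defaults here (not stated in the model card).
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="max")


def step_scheduler(val_accuracy: float) -> float:
    """Report one epoch's validation accuracy and return the current learning rate."""
    scheduler.step(val_accuracy)
    return optimizer.param_groups[0]["lr"]
```

In a training loop, `step_scheduler` would be called once per epoch, after the validation pass.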

🎨 Data Augmentation

To improve generalization, the training data was augmented with:

  • Random horizontal flips
  • Random rotations (±10 degrees)
  • Color jitter (brightness and contrast)
  • Normalization using ImageNet mean and standard deviation

📄 License

This model is licensed under the Apache 2.0 License. You are free to use, modify, and distribute it with proper attribution.

🙋 Author

Created by Sathvik as part of a deep learning exploration project focused on image classification and CNN architecture optimization.