🐾 Cat vs Dog Classifier

This model is a deep convolutional neural network (CNN) built using PyTorch to classify images as either cats or dogs. It was trained on a labeled dataset of cat and dog images resized to 224×224 pixels, with extensive data augmentation and regularization techniques to improve generalization. The model achieves over 90% accuracy on the test set.

🧠 Model Overview

The classifier consists of four convolutional blocks followed by three fully connected layers. Each convolutional block includes a convolutional layer, batch normalization, ReLU activation, and max pooling. The fully connected layers include batch normalization and dropout for regularization.

Convolutional Feature Extractor

  • Conv Block 1:

    • Input channels: 3 (RGB)
    • Output channels: 64
    • Kernel size: 3 × 3
    • BatchNorm + ReLU + MaxPool (2 × 2)
  • Conv Block 2:

    • Input channels: 64
    • Output channels: 128
    • Kernel size: 3 × 3
    • BatchNorm + ReLU + MaxPool (2 × 2)
  • Conv Block 3:

    • Input channels: 128
    • Output channels: 256
    • Kernel size: 3 × 3
    • BatchNorm + ReLU + MaxPool (2 × 2)
  • Conv Block 4:

    • Input channels: 256
    • Output channels: 512
    • Kernel size: 3 × 3
    • BatchNorm + ReLU + MaxPool (2 × 2)

After these blocks, the feature map is flattened to a 1D vector for classification.
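The four blocks above can be sketched in PyTorch roughly as follows. This is a minimal sketch, not the author's code: the helper name `conv_block` is hypothetical, and `padding=1` is an assumption, since the padding of the convolutions is not stated here. Under that assumption each block halves the spatial size (224 → 112 → 56 → 28 → 14), so the flattened vector has 512 × 14 × 14 = 100352 elements.

```python
import torch
from torch import nn


def conv_block(in_channels: int, out_channels: int) -> nn.Sequential:
    """One block: 3x3 convolution, batch norm, ReLU, 2x2 max pooling."""
    return nn.Sequential(
        # padding=1 is an assumption; the model card does not specify padding.
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=2),
    )


# Channel progression taken from the block list: 3 -> 64 -> 128 -> 256 -> 512.
features = nn.Sequential(
    conv_block(3, 64),
    conv_block(64, 128),
    conv_block(128, 256),
    conv_block(256, 512),
    nn.Flatten(),  # flatten (N, 512, H, W) to (N, 512 * H * W)
)

# Push one dummy 224x224 RGB image through to check the flattened size.
features.eval()
with torch.no_grad():
    flat = features(torch.zeros(1, 3, 224, 224))
print(flat.shape)  # torch.Size([1, 100352]) with padding=1
```

With unpadded convolutions the spatial size would shrink faster and the flattened length would differ, so the first fully connected layer's input size depends on this choice.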

Fully Connected Classifier

  • FC Layer 1:

    • Input: Flattened feature vector
    • Output: 512 units
    • BatchNorm + ReLU + Dropout (p=0.5)
  • FC Layer 2:

    • Input: 512 units
    • Output: 256 units
    • BatchNorm + ReLU + Dropout (p=0.5)
  • FC Layer 3:

    • Input: 256 units
    • Output: 2 units (cat or dog)
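The classifier head described above might look like the following sketch. The function name `make_classifier` is hypothetical, and `in_features` is left as a parameter because the flattened size depends on convolution padding, which is not stated here (it would be 512 × 14 × 14 = 100352 if the convolutions preserve spatial size). The example call uses a small placeholder value only to keep the demonstration cheap.

```python
import torch
from torch import nn


def make_classifier(in_features: int, num_classes: int = 2) -> nn.Sequential:
    """Three fully connected layers with batch norm and dropout on the first two."""
    return nn.Sequential(
        nn.Linear(in_features, 512),
        nn.BatchNorm1d(512),
        nn.ReLU(inplace=True),
        nn.Dropout(p=0.5),
        nn.Linear(512, 256),
        nn.BatchNorm1d(256),
        nn.ReLU(inplace=True),
        nn.Dropout(p=0.5),
        # Final layer outputs raw logits, one per class (cat, dog).
        nn.Linear(256, num_classes),
    )


# 1024 is a placeholder input size for illustration, not the real model's value.
classifier = make_classifier(in_features=1024)
classifier.eval()  # disables dropout and uses running batch norm statistics
with torch.no_grad():
    logits = classifier(torch.zeros(4, 1024))
print(logits.shape)  # torch.Size([4, 2])
```

The head returns logits rather than probabilities, which matches training with a cross-entropy loss that applies the softmax internally.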

🧪 Training Details

  • Framework: PyTorch
  • Input size: 224 × 224 RGB images
  • Optimizer: Adam with learning rate 0.0001 and weight decay 1e-4
  • Loss Function: CrossEntropyLoss
  • Scheduler: ReduceLROnPlateau (monitors validation accuracy)
  • Epochs: 40
  • Batch Size: 32
  • Validation Split: 20% of training data
  • Best Validation Accuracy: 90.26%
  • Final Test Accuracy: 90.56%
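The settings above could be wired together roughly as below. This is a sketch under stated assumptions: `make_training_objects` is a hypothetical helper, the scheduler's `factor` and `patience` are left at PyTorch defaults because they are not given here, and the model and data are small random stand-ins rather than the real network and dataset. `mode="max"` is used because the scheduler monitors validation accuracy, where higher is better.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split


def make_training_objects(model: nn.Module):
    """Build loss, optimizer, and scheduler from the listed hyperparameters."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
    # mode="max": reduce the learning rate when validation accuracy stops rising.
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="max")
    return criterion, optimizer, scheduler


# Stand-in data and model for illustration only.
dataset = TensorDataset(torch.randn(100, 8), torch.randint(0, 2, (100,)))
val_size = int(0.2 * len(dataset))  # 20% validation split
train_set, val_set = random_split(dataset, [len(dataset) - val_size, val_size])
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

model = nn.Linear(8, 2)
criterion, optimizer, scheduler = make_training_objects(model)

# One training step on the first batch.
inputs, targets = next(iter(train_loader))
optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
optimizer.step()

# After each epoch, pass the measured validation accuracy to the scheduler.
val_accuracy = 0.5  # placeholder value
scheduler.step(val_accuracy)
```

In a full run, the step inside the loop would repeat over every batch for 40 epochs, with validation accuracy computed on `val_set` at the end of each epoch.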

🎨 Data Augmentation

To improve generalization, the training data was augmented with:

  • Random horizontal flips
  • Random rotations (±10 degrees)
  • Color jitter (brightness and contrast)
  • Normalization using ImageNet mean and standard deviation

📄 License

This model is licensed under the Apache 2.0 License. You are free to use, modify, and distribute it with proper attribution.

🙋 Author

Created by Sathvik as part of a deep learning exploration project focused on image classification and CNN architecture optimization.
