metadata

license: apache-2.0
tags:
  - image-classification
  - cnn
  - pytorch
  - cat-vs-dog
  - deep-learning
library_name: pytorch
datasets:
  - custom
metrics:
  - name: accuracy
    type: accuracy
    value: 90.56
model_name: Cat vs Dog Classifier
pipeline_tag: image-classification

🐾 Cat vs Dog Classifier

This model is a convolutional neural network (CNN) built with PyTorch that classifies images as either cat or dog. It was trained on a labeled dataset of cat and dog images resized to 224×224 pixels, using data augmentation and regularization (batch normalization, dropout, and weight decay) to improve generalization. It reaches 90.56% accuracy on the test set.

🧠 Model Overview

The classifier consists of four convolutional blocks followed by three fully connected layers. Each convolutional block contains a convolutional layer, batch normalization, ReLU activation, and max pooling. The first two fully connected layers use batch normalization and dropout for regularization; the third outputs the two class scores.

Convolutional Feature Extractor

Conv Block 1:

- Input channels: 3 (RGB)
- Output channels: 64
- Kernel size: 3 × 3
- BatchNorm + ReLU + MaxPool (2 × 2)

Conv Block 2:

- Input channels: 64
- Output channels: 128
- Kernel size: 3 × 3
- BatchNorm + ReLU + MaxPool (2 × 2)

Conv Block 3:

- Input channels: 128
- Output channels: 256
- Kernel size: 3 × 3
- BatchNorm + ReLU + MaxPool (2 × 2)

Conv Block 4:

- Input channels: 256
- Output channels: 512
- Kernel size: 3 × 3
- BatchNorm + ReLU + MaxPool (2 × 2)

After these blocks, the feature map is flattened to a 1D vector for classification.
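The four blocks above can be sketched in PyTorch as follows. This is a minimal illustration, not the author's code: the helper name `conv_block` is hypothetical, and `padding=1` is an assumption (the README does not state the padding). Under that assumption, each block halves the spatial size, so a 224 × 224 input ends up as a 512 × 14 × 14 feature map.

```python
import torch
from torch import nn


def conv_block(in_channels: int, out_channels: int) -> nn.Sequential:
    """Conv -> BatchNorm -> ReLU -> MaxPool, matching the per-block listing."""
    return nn.Sequential(
        # padding=1 is an assumption; it keeps height and width unchanged
        # before pooling, so only the 2 x 2 max pool reduces the size.
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=2),
    )


features = nn.Sequential(
    conv_block(3, 64),
    conv_block(64, 128),
    conv_block(128, 256),
    conv_block(256, 512),
)

features.eval()
with torch.no_grad():
    dummy = torch.zeros(1, 3, 224, 224)  # one RGB image, 224 x 224
    feature_map = features(dummy)
    flat = torch.flatten(feature_map, start_dim=1)

print(feature_map.shape)  # torch.Size([1, 512, 14, 14])
print(flat.shape)         # torch.Size([1, 100352])
```

With a different padding choice the flattened size would change, so the input width of the first fully connected layer depends on this detail.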

Fully Connected Classifier

FC Layer 1:

- Input: Flattened feature vector
- Output: 512 units
- BatchNorm + ReLU + Dropout (p=0.5)

FC Layer 2:

- Input: 512 units
- Output: 256 units
- BatchNorm + ReLU + Dropout (p=0.5)

FC Layer 3:

- Input: 256 units
- Output: 2 units (cat or dog)
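A possible PyTorch rendering of the three fully connected layers is shown below. It is a sketch under one stated assumption: the flattened input size `FLAT_FEATURES` is taken to be 512 × 14 × 14, which holds only if the convolutions preserve spatial size on a 224 × 224 input. The README does not give this number, so treat it as illustrative.

```python
import torch
from torch import nn

# Assumed flattened size (512 channels x 14 x 14); not stated in the README.
FLAT_FEATURES = 512 * 14 * 14

classifier = nn.Sequential(
    # FC Layer 1
    nn.Linear(FLAT_FEATURES, 512),
    nn.BatchNorm1d(512),
    nn.ReLU(inplace=True),
    nn.Dropout(p=0.5),
    # FC Layer 2
    nn.Linear(512, 256),
    nn.BatchNorm1d(256),
    nn.ReLU(inplace=True),
    nn.Dropout(p=0.5),
    # FC Layer 3: two raw scores (logits), one per class
    nn.Linear(256, 2),
)

classifier.eval()  # turns off dropout and uses BatchNorm running statistics
with torch.no_grad():
    logits = classifier(torch.zeros(4, FLAT_FEATURES))
    probs = torch.softmax(logits, dim=1)  # per-image class probabilities

print(logits.shape)  # torch.Size([4, 2])
```

The final layer has no activation because `CrossEntropyLoss` expects raw logits; softmax is applied here only to read the outputs as probabilities.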

🧪 Training Details

- Framework: PyTorch
- Input size: 224 × 224 RGB images
- Optimizer: Adam with learning rate 1e-4 and weight decay 1e-4
- Loss function: CrossEntropyLoss
- Scheduler: ReduceLROnPlateau (monitors validation accuracy)
- Epochs: 40
- Batch size: 32
- Validation split: 20% of training data
- Best validation accuracy: 90.26%
- Final test accuracy: 90.56%
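The optimizer, loss, and scheduler settings above might be wired together as in this sketch. The small `model` is a stand-in so the snippet runs on its own, and the scheduler's `factor` and `patience` are assumed values, since the README does not list them. Because the scheduler monitors validation accuracy, `mode="max"` is needed (the PyTorch default, `"min"`, is meant for a loss).

```python
import torch
from torch import nn

# Tiny stand-in network so the snippet is self-contained;
# it is not the classifier described in this README.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 2))

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
# mode="max": reduce the learning rate when accuracy stops increasing.
# factor and patience are assumptions, not values from the README.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="max", factor=0.5, patience=3
)

# One training step on random data with the stated batch size of 32.
images = torch.randn(32, 3, 8, 8)
labels = torch.randint(0, 2, (32,))
model.train()
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()

# At the end of each epoch, pass the validation accuracy to the scheduler.
val_accuracy = 0.5  # placeholder for illustration, not a reported result
scheduler.step(val_accuracy)
```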

🎨 Data Augmentation

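To improve generalization, the following transforms were applied to the training data. The first three are random augmentations; normalization is a fixed preprocessing step.

- Random horizontal flips
- Random rotations (±10 degrees)
- Color jitter (brightness and contrast)
- Normalization using ImageNet mean and standard deviation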

📄 License

This model is licensed under the Apache 2.0 License. You may use, modify, and distribute it, provided you retain the license and attribution notices.

🙋 Author

Created by Sathvik as part of a deep learning exploration project focused on image classification and CNN architecture optimization.