---
license: apache-2.0
tags:
- image-classification
- cnn
- pytorch
- cat-vs-dog
- deep-learning
library_name: pytorch
datasets:
- custom
metrics:
- name: accuracy
  type: accuracy
  value: 90.56
model_name: Cat vs Dog Classifier
pipeline_tag: image-classification
---

# 🐾 Cat vs Dog Classifier

This model is a deep convolutional neural network (CNN) built with PyTorch to classify images as either cats or dogs. It was trained on a labeled dataset of cat and dog images resized to 224 × 224 pixels, with data augmentation and regularization to improve generalization. The model reaches 90.56% accuracy on the test set.

## 🧠 Model Overview

The classifier consists of four convolutional blocks followed by three fully connected layers. Each convolutional block includes a convolutional layer, batch normalization, ReLU activation, and max pooling. The first two fully connected layers include batch normalization and dropout for regularization; the third produces the class outputs.

### Convolutional Feature Extractor

- **Conv Block 1**:
  - Input channels: 3 (RGB)
  - Output channels: 64
  - Kernel size: 3 × 3
  - BatchNorm + ReLU + MaxPool (2 × 2)
- **Conv Block 2**:
  - Input channels: 64
  - Output channels: 128
  - Kernel size: 3 × 3
  - BatchNorm + ReLU + MaxPool (2 × 2)
- **Conv Block 3**:
  - Input channels: 128
  - Output channels: 256
  - Kernel size: 3 × 3
  - BatchNorm + ReLU + MaxPool (2 × 2)
- **Conv Block 4**:
  - Input channels: 256
  - Output channels: 512
  - Kernel size: 3 × 3
  - BatchNorm + ReLU + MaxPool (2 × 2)

After these blocks, the feature map is flattened to a 1D vector for classification.

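
The length of that flattened vector depends on the convolution padding and stride, which this card does not specify. As a rough sketch, the arithmetic can be traced in plain Python, assuming stride 1 and either `padding=1` or no padding. Both settings are assumptions, and the helper names `trace_feature_map` and `flattened_features` are hypothetical.

```python
# Channel widths and kernel/pool sizes are taken from the block list above.
CHANNELS = [3, 64, 128, 256, 512]
KERNEL = 3  # 3 x 3 convolutions
POOL = 2    # 2 x 2 max pooling


def trace_feature_map(size=224, padding=1, stride=1):
    """Return the spatial size (height == width) after each conv block.

    padding and stride are assumptions; the model card does not state them.
    """
    sizes = []
    for _ in range(len(CHANNELS) - 1):
        # Standard convolution output-size formula.
        size = (size + 2 * padding - KERNEL) // stride + 1
        # Max pooling with a 2 x 2 window and stride 2 (floor division).
        size = size // POOL
        sizes.append(size)
    return sizes


def flattened_features(size=224, padding=1):
    """Length of the 1D vector that would feed the first fully connected layer."""
    final = trace_feature_map(size, padding)[-1]
    return CHANNELS[-1] * final * final


if __name__ == "__main__":
    for pad in (1, 0):
        print(pad, trace_feature_map(padding=pad), flattened_features(padding=pad))
```

Under the `padding=1` assumption, the 224 × 224 input shrinks to 14 × 14 after four pooling steps, so FC Layer 1 would receive 512 × 14 × 14 = 100,352 inputs. Without padding, the final map is 12 × 12, giving 73,728 inputs.
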
### Fully Connected Classifier

- **FC Layer 1**:
  - Input: Flattened feature vector
  - Output: 512 units
  - BatchNorm + ReLU + Dropout (p=0.5)
- **FC Layer 2**:
  - Input: 512 units
  - Output: 256 units
  - BatchNorm + ReLU + Dropout (p=0.5)
- **FC Layer 3**:
  - Input: 256 units
  - Output: 2 units (cat or dog)

## 🧪 Training Details

- **Framework**: PyTorch
- **Input size**: 224 × 224 RGB images
- **Optimizer**: Adam with learning rate 0.0001 and weight decay 1e-4
- **Loss Function**: CrossEntropyLoss
- **Scheduler**: ReduceLROnPlateau (monitors validation accuracy)
- **Epochs**: 40
- **Batch Size**: 32
- **Validation Split**: 20% of training data
- **Best Validation Accuracy**: 90.26%
- **Final Test Accuracy**: 90.56%

## 🎨 Data Augmentation

To improve generalization, the training data was augmented with:

- Random horizontal flips
- Random rotations (±10 degrees)
- Color jitter (brightness and contrast)
- Normalization using ImageNet mean and standard deviation

## 📄 License

This model is licensed under the Apache 2.0 License. You are free to use, modify, and distribute it with proper attribution.

## 🙋 Author

Created by Sathvik as part of a deep learning exploration project focused on image classification and CNN architecture optimization.
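
## Appendix: Plateau Scheduler Sketch

The training details list a ReduceLROnPlateau scheduler that monitors validation accuracy, which means the learning rate is lowered when accuracy stops rising (in PyTorch this corresponds to `mode="max"`). The toy class below is a plain-Python stand-in that illustrates the rule. It is not PyTorch's implementation and not necessarily the configuration used for this model. The class name `PlateauLR` and the `factor` and `patience` values are hypothetical, since the card lists neither.

```python
class PlateauLR:
    """Toy reduce-on-plateau schedule that tracks a metric to be maximized.

    factor and patience are illustrative assumptions, not values from the card.
    """

    def __init__(self, lr=1e-4, factor=0.1, patience=3):
        self.lr = lr                # starting learning rate (0.0001 per the card)
        self.factor = factor        # multiplier applied when a plateau is detected
        self.patience = patience    # epochs without improvement that are tolerated
        self.best = float("-inf")   # best validation accuracy seen so far
        self.bad_epochs = 0         # consecutive epochs without improvement

    def step(self, val_accuracy):
        """Record one epoch's validation accuracy and return the current lr."""
        if val_accuracy > self.best:
            self.best = val_accuracy
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            # Reduce only once the tolerated number of bad epochs is exceeded.
            if self.bad_epochs > self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr


# Usage: made-up accuracies that improve and then stall.
scheduler = PlateauLR(patience=2)
for acc in [0.80, 0.85, 0.85, 0.84, 0.83]:
    lr = scheduler.step(acc)
```

With these made-up accuracies and `patience=2`, the rate stays at 1e-4 for the first four epochs and is cut by the factor on the fifth, after three epochs in a row fail to beat the best accuracy of 0.85.
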