---
license: mit
tags:
  - image-classification
  - cats-vs-dogs
  - tensorflow
  - keras
  - efficientnet
datasets:
  - microsoft/cats_vs_dogs
metrics:
  - accuracy
  - auc
  - loss
---

# Cat vs. Dog Image Classification

This Keras image classification model distinguishes images of cats from images of dogs. It is built on the `EfficientNetB1` architecture and was trained on the `microsoft/cats_vs_dogs` dataset.

## Model Architecture

The model uses `EfficientNetB1` pre-trained on ImageNet as its base. The architecture is as follows:

1.  **Input Layer**: Accepts images of size `(240, 240, 3)`.
2.  **Data Augmentation**: Applies random transformations to the input images to improve generalization:
    *   `RandomFlip("horizontal")`
    *   `RandomRotation(0.1)`
    *   `RandomZoom(0.1)`
    *   `RandomContrast(0.1)`
    *   `RandomBrightness(0.1)`
3.  **Base Model**: `EfficientNetB1` (with weights frozen during the initial training phase).
4.  **Classification Head**:
    *   `GlobalAveragePooling2D`
    *   `Dropout(0.2)`
    *   `Dense(1, activation="sigmoid")` for binary classification.
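
The layer stack above can be sketched in Keras roughly as follows. This is a minimal reconstruction from the list, not the original training script: the function name `build_model`, the model name, the `weights` parameter, and the choice to call the base with `training=False` are all assumptions made for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers

IMG_SIZE = (240, 240)


def build_model(weights="imagenet"):
    """Assemble augmentation, a frozen EfficientNetB1 base, and a sigmoid head.

    `weights` is exposed only so the sketch can be built without downloading
    ImageNet weights (pass weights=None); this is an assumption, not part of
    the original model card.
    """
    # Random augmentations; Keras applies them only in training mode.
    augmentation = tf.keras.Sequential(
        [
            layers.RandomFlip("horizontal"),
            layers.RandomRotation(0.1),
            layers.RandomZoom(0.1),
            layers.RandomContrast(0.1),
            layers.RandomBrightness(0.1),
        ],
        name="augmentation",
    )

    # Pre-trained convolutional base without its ImageNet classifier.
    base = tf.keras.applications.EfficientNetB1(
        include_top=False, weights=weights, input_shape=IMG_SIZE + (3,)
    )
    base.trainable = False  # frozen for the first training stage

    inputs = tf.keras.Input(shape=IMG_SIZE + (3,))
    x = augmentation(inputs)
    # Assumption: run the base in inference mode so BatchNorm statistics stay fixed.
    x = base(x, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.2)(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)
    return tf.keras.Model(inputs, outputs, name="cat_vs_dog")
```

Calling `build_model()` downloads the ImageNet weights on first use; `build_model(weights=None)` builds the same graph with random weights, which is enough to inspect shapes with `model.summary()`.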

## Training Procedure

The model was trained in two stages:

1.  **Transfer Learning**: The `EfficientNetB1` base was frozen, and only the classification head was trained for 50 epochs. This allows the model to learn to classify cats and dogs using the features learned from ImageNet.
2.  **Fine-Tuning**: The top 20 layers of the `EfficientNetB1` base were unfrozen, and the model (classification head plus the unfrozen layers) was trained for an additional 50 epochs at a lower learning rate. This adapts the pre-trained features to the specific task of cat vs. dog classification.
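
Switching from the first stage to the second amounts to changing which layers are trainable. A minimal sketch of that step is shown below, assuming "top" means the last layers in `base.layers`; the helper name `unfreeze_top_layers` is hypothetical, and the original script may have treated some layers (for example batch normalization) differently.

```python
import tensorflow as tf


def unfreeze_top_layers(base: tf.keras.Model, n_top: int = 20) -> None:
    """Make only the last `n_top` layers of `base` trainable."""
    base.trainable = True  # marks every sub-layer trainable first
    cut = max(len(base.layers) - n_top, 0)
    for layer in base.layers[:cut]:
        layer.trainable = False  # re-freeze everything below the cut
```

After changing `trainable` flags, the model has to be compiled again before calling `fit`, otherwise the change does not take effect.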

Key training parameters:
- **Optimizer**: `AdamW`
- **Loss Function**: `binary_crossentropy`
- **Learning Rate Schedule**: `CosineDecayRestarts`
- **Metrics**: `accuracy`, `AUC`
- **Batch Size**: 16
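
These settings could be wired together along the following lines. The model card does not state the learning rates, restart period, or weight decay, so every numeric value here is a placeholder assumption, and the helper names `make_schedule` and `compile_for_stage` are hypothetical.

```python
import tensorflow as tf


def make_schedule(initial_lr: float, first_decay_steps: int):
    """Cosine decay with warm restarts, starting at `initial_lr`."""
    return tf.keras.optimizers.schedules.CosineDecayRestarts(
        initial_learning_rate=initial_lr,
        first_decay_steps=first_decay_steps,
    )


def compile_for_stage(model, initial_lr, first_decay_steps, weight_decay=1e-4):
    """Compile `model` with AdamW, binary cross-entropy, accuracy and AUC."""
    optimizer = tf.keras.optimizers.AdamW(
        learning_rate=make_schedule(initial_lr, first_decay_steps),
        weight_decay=weight_decay,  # placeholder value, not from the model card
    )
    model.compile(
        optimizer=optimizer,
        loss="binary_crossentropy",
        metrics=["accuracy", tf.keras.metrics.AUC(name="auc")],
    )
    return model
```

Under these assumptions, the first stage might call `compile_for_stage(model, 1e-3, steps)` and the fine-tuning stage `compile_for_stage(model, 1e-5, steps)`, with the training data batched in groups of 16.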

## Evaluation Results

The model was evaluated on a test set of 3,512 images, achieving the following performance:

| Metric   | Value  |
|----------|--------|
| Loss     | 0.0338 |
| Accuracy | 99.54% |
| AUC      | 0.9994 |
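
To put the accuracy figure in absolute terms, the short calculation below converts it into an approximate count of misclassified test images. It assumes the reported accuracy was computed over all 3,512 test images and rounded to two decimal places, so the result is an estimate.

```python
test_images = 3512
accuracy = 0.9954  # reported value, rounded

# Estimate how many images were classified correctly and incorrectly.
correct = round(test_images * accuracy)
misclassified = test_images - correct

print(f"about {correct} correct, {misclassified} misclassified")
# prints "about 3496 correct, 16 misclassified"
```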

## How to Use

You can use this model for inference with TensorFlow and Keras.

First, make sure you have TensorFlow installed:
```bash
pip install tensorflow
```

Then, you can load the model and use it to predict on a new image:

```python
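import numpy as np
import tensorflow as tf

# Load the saved Keras model (replace with the path to your copy).
model = tf.keras.models.load_model("path/to/your/model.keras")

# Load the image and resize it to the model's expected input size.
img_path = "path/to/your/image.jpg"
img = tf.keras.utils.load_img(img_path, target_size=(240, 240))
img_array = tf.keras.utils.img_to_array(img)
img_array = np.expand_dims(img_array, axis=0)  # add a batch dimension

# For Keras EfficientNet this call is a pass-through (rescaling happens inside
# the model); it is kept for consistency with other Keras applications.
preprocessed_img = tf.keras.applications.efficientnet.preprocess_input(img_array)

# The sigmoid head returns one score in [0, 1] per image.
prediction = model.predict(preprocessed_img)
score = float(prediction[0][0])

print(f"This image is {100 * (1 - score):.2f}% cat and {100 * score:.2f}% dog.")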
```
**Note**: The model outputs a single sigmoid value between 0 and 1. The example above assumes the labels were encoded as cat=0 and dog=1 during training, so a value closer to 0 indicates a cat and a value closer to 1 indicates a dog. If the labels were encoded the other way around, swap the two interpretations.
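
If a hard label is needed rather than a percentage, the score can be thresholded. The sketch below assumes the cat=0, dog=1 encoding and a decision threshold of 0.5; the helper `score_to_label` is hypothetical and not part of the released model.

```python
def score_to_label(score: float, threshold: float = 0.5) -> tuple[str, float]:
    """Map a sigmoid score to ("cat" | "dog", confidence in that label)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    # Assumed encoding: scores at or above the threshold mean "dog".
    label = "dog" if score >= threshold else "cat"
    confidence = score if label == "dog" else 1.0 - score
    return label, confidence
```

For example, `score_to_label(0.9)` returns `("dog", 0.9)`. A different threshold may be preferable if one kind of error is more costly than the other.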

## Dataset Credits

The training data is the publicly available
[microsoft/cats_vs_dogs](https://huggingface.co/datasets/microsoft/cats_vs_dogs)
dataset (originally the Asirra CAPTCHA dataset). **Huge thanks** to Microsoft
Research and Petfinder.com for releasing the images!

```bibtex
@misc{microsoftcatsdogs,
  title  = {Cats vs. Dogs Image Dataset},
  author = {{Microsoft Research} and {Petfinder.com}},
  howpublished = {Hugging Face Hub},
  url    = {https://huggingface.co/datasets/microsoft/cats_vs_dogs}
}
```

## Acknowledgements

* TensorFlow/Keras team for the excellent deep-learning framework.
* Mingxing Tan & Quoc V. Le for EfficientNet.
* The Hugging Face community for the awesome Model & Dataset hubs.