---
license: mit
tags:
- image-classification
- cats-vs-dogs
- tensorflow
- keras
- efficientnet
datasets:
- microsoft/cats_vs_dogs
metrics:
- accuracy
- auc
- loss
---

# Cat vs. Dog Image Classification

This is a Keras image classification model trained to distinguish between images of cats and dogs. The model is based on the `EfficientNetB1` architecture and was trained on the [microsoft/cats_vs_dogs](https://huggingface.co/datasets/microsoft/cats_vs_dogs) dataset.

## Model Architecture

The model uses `EfficientNetB1` pre-trained on ImageNet as its base. The architecture is as follows:

1. **Input Layer**: Accepts images of size `(240, 240, 3)`.
2. **Data Augmentation**: Applies random transformations to the input images to improve generalization:
    * `RandomFlip("horizontal")`
    * `RandomRotation(0.1)`
    * `RandomZoom(0.1)`
    * `RandomContrast(0.1)`
    * `RandomBrightness(0.1)`
3. **Base Model**: `EfficientNetB1` (with weights frozen during the initial training phase).
4. **Classification Head**:
    * `GlobalAveragePooling2D`
    * `Dropout(0.2)`
    * `Dense(1, activation="sigmoid")` for binary classification.
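
The layer stack above can be sketched with the Keras functional API. This is a minimal reconstruction from the description, not the original training script: the `build_model` name, its `weights` argument, and calling the base with `training=False` are assumptions made for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

IMG_SHAPE = (240, 240, 3)


def build_model(weights="imagenet"):
    """Assemble the augmentation layers, frozen base, and sigmoid head."""
    inputs = keras.Input(shape=IMG_SHAPE)

    # Data augmentation; these layers are only active during training.
    x = layers.RandomFlip("horizontal")(inputs)
    x = layers.RandomRotation(0.1)(x)
    x = layers.RandomZoom(0.1)(x)
    x = layers.RandomContrast(0.1)(x)
    x = layers.RandomBrightness(0.1)(x)

    # Pre-trained base without its ImageNet classifier, frozen for stage one.
    base = keras.applications.EfficientNetB1(
        include_top=False, weights=weights, input_shape=IMG_SHAPE
    )
    base.trainable = False
    # Assumption: run the base in inference mode so its BatchNormalization
    # statistics are not updated while it is frozen.
    x = base(x, training=False)

    # Classification head.
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.2)(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)
    return keras.Model(inputs, outputs)
```

Calling `build_model()` downloads the ImageNet weights on first use; passing `weights=None` gives a randomly initialized base, which is enough to inspect the layer structure with `model.summary()`.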

## Training Procedure

The model was trained in two stages:

1. **Transfer Learning**: The `EfficientNetB1` base was frozen, and only the classification head was trained for 50 epochs. This allows the model to learn to classify cats and dogs using the features learned from ImageNet.
2. **Fine-Tuning**: The top 20 layers of the `EfficientNetB1` base were unfrozen and the entire model was trained for an additional 50 epochs with a lower learning rate. This fine-tunes the pre-trained features for the specific task of cat vs. dog classification.
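
One way to set up the fine-tuning stage is to mark only the last 20 layers of the base as trainable. The helper below is a hypothetical sketch: `unfreeze_top_layers` is not part of Keras or of the original training code, and it relies only on the `layers`, `name`, and `trainable` attributes that Keras models and layers expose.

```python
def unfreeze_top_layers(base_model, num_layers=20):
    """Make only the last `num_layers` layers of `base_model` trainable.

    Returns the names of the layers that were unfrozen.
    """
    # Enable the base as a whole, then set each layer's flag explicitly so
    # that everything below the cut-off stays frozen.
    base_model.trainable = True
    first_trainable = max(len(base_model.layers) - num_layers, 0)

    unfrozen = []
    for index, layer in enumerate(base_model.layers):
        layer.trainable = index >= first_trainable
        if layer.trainable:
            unfrozen.append(layer.name)
    return unfrozen
```

In Keras, a change to `trainable` only takes effect after the model is compiled again, so a call such as `unfreeze_top_layers(base)` would be followed by a fresh `model.compile(...)` with the lower learning rate.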

Key training parameters:

- **Optimizer**: `AdamW`
- **Loss Function**: `binary_crossentropy`
- **Learning Rate Schedule**: `CosineDecayRestarts`
- **Metrics**: `accuracy`, `AUC`
- **Batch Size**: 16
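
To give a feel for the learning rate schedule, here is a pure-Python sketch of cosine decay with warm restarts, written to mirror the formula documented for `tf.keras.optimizers.schedules.CosineDecayRestarts`. The schedule's hyperparameters are not stated in this card, so the initial learning rate and cycle length used in the demo are placeholders, not the values used in training.

```python
import math


def cosine_decay_restarts(step, initial_lr, first_decay_steps,
                          t_mul=2.0, m_mul=1.0, alpha=0.0):
    """Learning rate at `step` under cosine decay with warm restarts."""
    # Find the restart cycle that contains `step`. Each cycle is `t_mul`
    # times longer than the previous one.
    cycle_start, cycle_len, cycle_index = 0.0, float(first_decay_steps), 0
    while step >= cycle_start + cycle_len:
        cycle_start += cycle_len
        cycle_len *= t_mul
        cycle_index += 1

    # Cosine curve from 1 down to 0 over the current cycle.
    progress = (step - cycle_start) / cycle_len
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))

    # Each restart scales the peak by `m_mul`; `alpha` sets the floor as a
    # fraction of the initial learning rate.
    peak_factor = m_mul ** cycle_index
    return initial_lr * ((1.0 - alpha) * peak_factor * cosine + alpha)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for step in (0, 50, 99, 100, 200):
        lr = cosine_decay_restarts(step, initial_lr=1e-3, first_decay_steps=100)
        print(f"step {step:>3}: lr = {lr:.6f}")
```

In an actual Keras training script the built-in schedule object would be passed as the optimizer's `learning_rate` argument instead of reimplementing the formula.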

## Evaluation Results

The model was evaluated on a test set of 3,512 images, achieving the following performance:

| Metric   | Value  |
|----------|--------|
| Loss     | 0.0338 |
| Accuracy | 99.54% |
| AUC      | 0.9994 |
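
To put the accuracy figure in concrete terms, the short calculation below works out how many test images it corresponds to. It assumes the reported accuracy is the exact ratio rounded to two decimal places; the per-image predictions themselves are not part of this card.

```python
TEST_SET_SIZE = 3512           # from the evaluation description
REPORTED_ACCURACY_PCT = 99.54  # from the results table

# Every count of correct predictions whose accuracy rounds to the reported value.
consistent_counts = [
    correct
    for correct in range(TEST_SET_SIZE + 1)
    if round(100 * correct / TEST_SET_SIZE, 2) == REPORTED_ACCURACY_PCT
]

for correct in consistent_counts:
    print(f"{correct} correct, {TEST_SET_SIZE - correct} misclassified")
```

Under that rounding assumption the only consistent count is 3,496 correct predictions, which leaves 16 misclassified images out of 3,512.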

## How to Use

You can use this model for inference with TensorFlow and Keras.

First, make sure you have TensorFlow installed:

```bash
pip install tensorflow
```

Then, you can load the model and use it to predict on a new image:

```python
import numpy as np
import tensorflow as tf

# Load the trained model (replace with the path to your copy of the file).
model = tf.keras.models.load_model('path/to/your/model.keras')

# Load the image and resize it to the model's 240x240 input size.
img_path = 'path/to/your/image.jpg'
img = tf.keras.utils.load_img(img_path, target_size=(240, 240))
img_array = tf.keras.utils.img_to_array(img)
img_array = np.expand_dims(img_array, axis=0)  # add a batch dimension

# For EfficientNet this call is a pass-through: the Keras EfficientNet models
# normalize inputs internally and expect raw pixel values in [0, 255].
preprocessed_img = tf.keras.applications.efficientnet.preprocess_input(img_array)

# The sigmoid head returns a single score in [0, 1] per image.
prediction = model.predict(preprocessed_img)
score = float(prediction[0][0])

print(
    f"This image is {100 * (1 - score):.2f}% cat and {100 * score:.2f}% dog."
)
```

**Note**: The model outputs a single value between 0 and 1. A value closer to 0 indicates a cat, and a value closer to 1 indicates a dog. This interpretation depends on how the labels were encoded during training and assumes cat=0, dog=1.
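
If a hard label is more convenient than two percentages, the score can be thresholded. The helper below is a hypothetical sketch: the `label_from_score` name and the 0.5 default threshold are illustrative choices, and it assumes the cat=0, dog=1 encoding described in the note.

```python
def label_from_score(score, threshold=0.5):
    """Map a sigmoid score to a (label, confidence) pair.

    Assumes the encoding cat = 0, dog = 1.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score >= threshold:
        return "dog", score
    # For the negative class, confidence is the distance from 1.
    return "cat", 1.0 - score
```

For example, `label_from_score(score)` can be called on the `score` value from the inference snippet to get a label together with the model's confidence in it.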

## Dataset Credits

The training data is the publicly available
[microsoft/cats_vs_dogs](https://huggingface.co/datasets/microsoft/cats_vs_dogs)
dataset (originally the Asirra CAPTCHA dataset). **Huge thanks** to Microsoft
Research and Petfinder.com for releasing the images!

```
@misc{microsoftcatsdogs,
  title  = {Cats vs. Dogs Image Dataset},
  author = {Microsoft Research & Petfinder.com},
  howpublished = {HuggingFace Hub},
  url    = {https://huggingface.co/datasets/microsoft/cats_vs_dogs}
}
```

## Acknowledgements

* TensorFlow/Keras team for the excellent deep-learning framework.
* Mingxing Tan & Quoc V. Le for EfficientNet.
* The Hugging Face community for the awesome Model & Dataset hubs.