---
license: mit
tags:
- image-classification
- cats-vs-dogs
- tensorflow
- keras
- efficientnet
datasets:
- microsoft/cats_vs_dogs
metrics:
- accuracy
- auc
- loss
---

# Cat vs. Dog Image Classification

This is a Keras image classification model trained to distinguish between images of cats and dogs. It is based on the `EfficientNetB1` architecture and was trained on the `microsoft/cats_vs_dogs` dataset.

## Model Architecture

The model uses `EfficientNetB1` pre-trained on ImageNet as its base. The architecture is as follows:

1. **Input Layer**: Accepts images of size `(240, 240, 3)`.
2. **Data Augmentation**: Applies random transformations to the input images to improve generalization:
   * `RandomFlip("horizontal")`
   * `RandomRotation(0.1)`
   * `RandomZoom(0.1)`
   * `RandomContrast(0.1)`
   * `RandomBrightness(0.1)`
3. **Base Model**: `EfficientNetB1` (with weights frozen during the initial training phase).
4. **Classification Head**:
   * `GlobalAveragePooling2D`
   * `Dropout(0.2)`
   * `Dense(1, activation="sigmoid")` for binary classification.

## Training Procedure

The model was trained in two stages:

1. **Transfer Learning**: The `EfficientNetB1` base was frozen, and only the classification head was trained for 50 epochs. This lets the model learn to classify cats and dogs using the features already learned from ImageNet.
2. **Fine-Tuning**: The top 20 layers of the `EfficientNetB1` base were unfrozen, and training continued for an additional 50 epochs with a lower learning rate. This adapts the pre-trained features to the specific task of cat vs. dog classification.

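
The architecture and the two training stages described above can be sketched roughly as follows. This is a minimal illustration, not the original training script: the helper names `build_model` and `unfreeze_top_layers` are hypothetical, and calling the base with `training=False` (to keep batch normalization in inference mode) is a common fine-tuning choice assumed here rather than something this card states.

```python
import tensorflow as tf
from tensorflow.keras import layers

IMG_SIZE = (240, 240)


def build_model(weights="imagenet"):
    """Assemble augmentation, a frozen EfficientNetB1 base, and a sigmoid head."""
    base = tf.keras.applications.EfficientNetB1(
        include_top=False, weights=weights, input_shape=IMG_SIZE + (3,)
    )
    base.trainable = False  # Stage 1: only the classification head is trained.

    augmentation = tf.keras.Sequential(
        [
            layers.RandomFlip("horizontal"),
            layers.RandomRotation(0.1),
            layers.RandomZoom(0.1),
            layers.RandomContrast(0.1),
            layers.RandomBrightness(0.1),
        ],
        name="augmentation",
    )

    inputs = tf.keras.Input(shape=IMG_SIZE + (3,))
    x = augmentation(inputs)
    # Assumption: keep the base's BatchNormalization layers in inference mode.
    x = base(x, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.2)(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)
    return tf.keras.Model(inputs, outputs), base


def unfreeze_top_layers(base, n=20):
    """Stage 2: make only the last `n` layers of the base trainable."""
    base.trainable = True
    for layer in base.layers[:-n]:
        layer.trainable = False
```

After calling `unfreeze_top_layers`, the model has to be compiled again before training so that the change in trainable weights takes effect. The learning rates used in each stage are not given in this card, so they are left out of the sketch.
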
Key training parameters:

- **Optimizer**: `AdamW`
- **Loss Function**: `binary_crossentropy`
- **Learning Rate Schedule**: `CosineDecayRestarts`
- **Metrics**: `accuracy`, `AUC`
- **Batch Size**: 16

## Evaluation Results

The model was evaluated on a test set of 3,512 images, achieving the following performance:

| Metric   | Value  |
|----------|--------|
| Loss     | 0.0338 |
| Accuracy | 99.54% |
| AUC      | 0.9994 |

## How to Use

You can use this model for inference with TensorFlow and Keras.

First, make sure you have TensorFlow installed:

```bash
pip install tensorflow
```

Then load the model and use it to predict on a new image:

```python
import numpy as np
import tensorflow as tf

# Load the trained model from a local .keras file.
model = tf.keras.models.load_model('path/to/your/model.keras')

# Load the image and resize it to the model's 240x240 input size.
img_path = 'path/to/your/image.jpg'
img = tf.keras.utils.load_img(img_path, target_size=(240, 240))
img_array = tf.keras.utils.img_to_array(img)  # float32 pixels in [0, 255]
img_array = np.expand_dims(img_array, axis=0)  # add a batch dimension

# For EfficientNet this call is a pass-through (the base model rescales
# its own inputs); it is kept for parity with other Keras applications.
preprocessed_img = tf.keras.applications.efficientnet.preprocess_input(img_array)

# The single sigmoid output is the score for the positive class.
prediction = model.predict(preprocessed_img)
score = float(prediction[0][0])

print(
    f"This image is {100 * (1 - score):.2f}% cat and {100 * score:.2f}% dog."
)
```

**Note**: The model outputs a single value between 0 and 1. A value closer to 0 indicates a 'cat', and a value closer to 1 indicates a 'dog'. This interpretation assumes the labels were encoded as cat=0 and dog=1 during training; if the encoding was reversed, swap the two.

## Dataset Credits

The training data is the publicly available [microsoft/cats_vs_dogs](https://huggingface.co/datasets/microsoft/cats_vs_dogs) dataset (originally the Asirra CAPTCHA dataset).
**Huge thanks** to Microsoft Research and Petfinder.com for releasing the images!

```
@misc{microsoftcatsdogs,
  title = {Cats vs. Dogs Image Dataset},
  author = {Microsoft Research & Petfinder.com},
  howpublished = {HuggingFace Hub},
  url = {https://huggingface.co/datasets/microsoft/cats_vs_dogs}
}
```

## Acknowledgements

* TensorFlow/Keras team for the excellent deep-learning framework.
* Mingxing Tan & Quoc V. Le for EfficientNet.
* The Hugging Face community for the awesome Model & Dataset hubs.