+---
+language: en
+license: mit
+library_name: tensorflow
+pipeline_tag: image-classification
+tags:
+  - tensorflow
+  - keras
+  - tensorflow-hub
+  - mobilenetv2
+  - transfer-learning
+  - computer-vision
+  - image-classification
+  - multi-class-classification
+  - dog-breed-classification
+  - kaggle-competition
+metrics:
+  - accuracy
+  - log_loss
+model-index:
+  - name: Dog Breed Classification (TF Hub MobileNetV2 + Dense)
+    results:
+      - task:
+          type: image-classification
+          name: Image Classification
+        dataset:
+          name: Kaggle Dog Breed Identification (labels.csv + train images)
+          type: image
+        metrics:
+          - name: Validation Accuracy (subset experiment)
+            type: accuracy
+            value: 0.7750
+          - name: Validation Loss (subset experiment)
+            type: loss
+            value: 0.8411
+---
+# 🐶 Dog Breed Classification (TensorFlow Hub MobileNetV2)
+This model predicts the **dog breed (120 classes)** from an input image using **transfer learning** with a pretrained **MobileNetV2** model from **TensorFlow Hub**, plus a custom dense softmax classifier head.
+It is built as an end-to-end computer vision pipeline: data loading → preprocessing → batching with `tf.data` → training with callbacks → evaluation/visualization → saving/loading → Kaggle-style probabilistic submission generation.
+## Model Details
+- Developed by: brej-29
+- Model type: TensorFlow / Keras `Sequential`
+  - Base: TF Hub MobileNetV2 ImageNet classifier
+  - Head: `Dense(120, activation="softmax")`
+- Task: Multi-class image classification (120 dog breeds)
+- Output: Probability distribution over 120 breeds (softmax)
+- Input: RGB image resized to 224×224, normalized to [0, 1]
+- Training notebook: `DogBreedClassification.ipynb`
+- Source repo: https://github.com/brej-29/Logicmojo-AIML-Assignments-DogBreedClassificationTensorFlow
+- License: MIT
+## Intended Use
+- Educational / portfolio demonstration of transfer learning + end-to-end deep learning workflow
+- Baseline experiments for multi-class dog breed recognition
+- Generating probabilistic predictions for Kaggle-style submissions
+### Out-of-scope / Not suitable for
+- Safety-critical or production use without further validation, monitoring, and retraining
+- Use on non-dog images or heavily out-of-distribution images (e.g., cartoons, low-light, extreme blur) without robustness testing
+## Training Data
+- Dataset: Kaggle “Dog Breed Identification”
+  - Training images: 10,222
+  - Classes: 120 dog breeds
+  - Labels file: `labels.csv` (maps `id` → `breed`)
+Note: Kaggle’s official competition metric is **multi-class log loss**, which scores the probability assigned to the true breed and heavily penalizes confident wrong predictions, so well-calibrated probabilities matter. This project produces probabilistic outputs suitable for that metric, but the notebook does not report an offline log loss.
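Because the notebook does not report an offline log loss, one way to estimate it on a held-out split is sketched below. This is a minimal NumPy sketch, not code from the notebook: the function name `multiclass_log_loss` and the clipping constant `eps` are assumptions made for illustration.

```python
import numpy as np


def multiclass_log_loss(y_true_onehot, y_prob, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true_onehot: (n_samples, n_classes) one-hot labels
    y_prob:        (n_samples, n_classes) predicted probabilities
    """
    y_true_onehot = np.asarray(y_true_onehot, dtype=np.float64)
    y_prob = np.asarray(y_prob, dtype=np.float64)
    # Clip to avoid log(0), then renormalize so each row sums to 1 again
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    y_prob = y_prob / y_prob.sum(axis=1, keepdims=True)
    # Only the true-class column contributes for one-hot labels
    per_sample = -np.sum(y_true_onehot * np.log(y_prob), axis=1)
    return float(per_sample.mean())
```

A uniform guess over 120 breeds scores ln(120) ≈ 4.79, which gives a rough baseline for interpreting the value.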
+## Preprocessing
+Image preprocessing applied during training/inference:
+- Read JPG from filepath
+- Decode to RGB tensor
+- Convert dtype to float32 and normalize to [0, 1]
+- Resize to **224×224**
+Efficient input pipeline:
+- Training batches use shuffling and `tf.data` batching
+- Validation batches avoid shuffling
+- Test batches contain filepaths only (no labels)
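The preprocessing steps and batching behavior listed above can be sketched as a small `tf.data` helper. This is an illustrative sketch rather than the notebook's exact code: the names `process_image` and `make_batches`, the batch size of 32, and the full-dataset shuffle buffer are assumptions.

```python
import tensorflow as tf

IMG_SIZE = 224
BATCH_SIZE = 32  # assumption: the batch size is not stated in this card


def process_image(path):
    """Read a JPG, decode to RGB, scale to [0, 1], resize to 224x224."""
    image = tf.io.read_file(path)
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)
    return tf.image.resize(image, [IMG_SIZE, IMG_SIZE])


def make_batches(paths, labels=None, training=False, batch_size=BATCH_SIZE):
    """Build batches: shuffled for training, ordered otherwise."""
    if labels is None:
        # Test data: filepaths only, no labels
        ds = tf.data.Dataset.from_tensor_slices(tf.constant(paths))
        return ds.map(process_image).batch(batch_size)

    ds = tf.data.Dataset.from_tensor_slices(
        (tf.constant(paths), tf.constant(labels))
    )
    if training:
        # Shuffle filepaths before decoding so the buffer stays cheap
        ds = ds.shuffle(buffer_size=len(paths))
    ds = ds.map(lambda path, label: (process_image(path), label))
    return ds.batch(batch_size)
```

Calling `make_batches(paths, labels, training=True)` yields `(image, label)` batches, while `make_batches(test_paths)` yields image batches only.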
+## Label Encoding / Class Order (Important)
+- Labels are one-hot encoded based on:
+  - `unique_breeds = np.unique(labels)` (`np.unique` returns the sorted unique values, so breed names end up in alphabetical order)
+- The model’s output index `i` corresponds to `unique_breeds[i]`
+To ensure correct decoding of predictions on the Hub, you should provide the class list (e.g., `class_names.json` or `unique_breeds.txt`) in the model repository.
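The class-order convention described above can be illustrated with a short sketch that builds the class list, one-hot encodes labels against it, and saves it for later decoding. The helper names and the sample labels are hypothetical; only `np.unique` and the `class_names.json` filename come from this card.

```python
import json

import numpy as np


def build_class_list(labels):
    """Sorted unique breed names; index i is the model's output index i."""
    return np.unique(np.asarray(labels))


def one_hot_encode(labels, class_names):
    """Boolean matrix with exactly one True per row, at the breed's index."""
    labels = np.asarray(labels)
    return labels[:, None] == class_names[None, :]


def save_class_names(class_names, path="class_names.json"):
    """Persist the class order so predictions can be decoded after upload."""
    with open(path, "w") as f:
        json.dump(class_names.tolist(), f)


# Hypothetical sample of the `breed` column from labels.csv
sample_labels = ["pug", "beagle", "pug", "boxer", "beagle"]
unique_breeds = build_class_list(sample_labels)
encoded = one_hot_encode(sample_labels, unique_breeds)
```

Decoding then reduces to `unique_breeds[probs.argmax()]`, provided the saved list is the one used at training time.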
+## Training Procedure
+- Framework: TensorFlow 2.x / Keras
+- Base model URL (TF Hub):
+  - `https://tfhub.dev/google/imagenet/mobilenet_v2_130_224/classification/4`
+- Loss: `CategoricalCrossentropy`
+- Optimizer: `Adam`
+- Metrics: `accuracy`
+- Callbacks:
+  - TensorBoard logging
+  - EarlyStopping
+    - Subset training monitors `val_accuracy` (patience=3)
+    - Full training (no validation set) monitors `accuracy` (patience=3)
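The callback setup above could look roughly like the following. This is a sketch under assumptions: the helper name `make_callbacks`, the `logs` directory, and the timestamped subfolder are illustrative and not taken from the notebook.

```python
import datetime
import os

import tensorflow as tf


def make_callbacks(has_validation, log_root="logs", patience=3):
    """TensorBoard logging plus EarlyStopping on the appropriate metric."""
    # With a validation set, watch val_accuracy; otherwise fall back to
    # training accuracy, as in the full-dataset run
    monitor = "val_accuracy" if has_validation else "accuracy"
    log_dir = os.path.join(
        log_root, datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    )
    return [
        tf.keras.callbacks.TensorBoard(log_dir=log_dir),
        tf.keras.callbacks.EarlyStopping(monitor=monitor, patience=patience),
    ]
```

The returned list would be passed as `callbacks=` to `model.fit(...)`.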
+### Subset Experiment (for fast iteration)
+- Subset size: 2,000 images
+- Split: 80% train / 20% validation (`random_state=42`)
+- Epochs configured: 100 (with EarlyStopping)
+### Full Training
+- The notebook also trains on the full dataset to generate Kaggle-style predictions.
+- Since the full run does not use a dedicated validation set, validation metrics are not reported for that phase.
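Turning full-dataset predictions into a Kaggle-style submission can be sketched as follows, assuming the expected layout is an `id` column followed by one probability column per breed in class order. The function name `write_submission` and the default filename are hypothetical.

```python
import csv

import numpy as np


def write_submission(ids, probs, class_names, path="submission.csv"):
    """Write one row per test image: id, then one probability per breed."""
    probs = np.asarray(probs)
    if probs.shape != (len(ids), len(class_names)):
        raise ValueError("probs must have shape (n_images, n_classes)")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        # Header columns must follow the same class order used in training
        writer.writerow(["id", *class_names])
        for image_id, row in zip(ids, probs):
            writer.writerow([image_id, *row.tolist()])
```

Here `ids` would typically be the test image filenames without the `.jpg` extension, and `probs` the output of `model.predict()` on the test batches.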
+## Evaluation
+Reported evaluation (subset experiment; 20% validation split of the first 2,000 images, i.e., 400 images):
+- Validation Accuracy: **0.7750**
+- Validation Loss: **0.8411**
+Important: This is a quick experiment metric and may not represent final performance on the full dataset or on real-world dog images.
+## How to Use
+The recommended approach is:
+1) Download the saved model artifact from the Hub
+2) Apply the same preprocessing (resize to 224×224, scale pixel values to [0, 1])
+3) Run `model.predict()`
+4) Decode the top-k indices using the stored class list (same order as training)
+Example (update filenames to match your uploaded artifacts):
+    import json
+    import numpy as np
+    import tensorflow as tf
+    import tensorflow_hub as hub
+    from huggingface_hub import hf_hub_download
+    repo_id = "YOUR_USERNAME/YOUR_MODEL_REPO"
+    # 1) Download model (example: H5)
+    model_path = hf_hub_download(repo_id=repo_id, filename="dog_breed_mobilenetv2.h5")
+    model = tf.keras.models.load_model(
+        model_path,
+        custom_objects={"KerasLayer": hub.KerasLayer},
+        compile=False
+    )
+    # 2) Download class names (recommended to upload alongside the model)
+    classes_path = hf_hub_download(repo_id=repo_id, filename="class_names.json")
+    with open(classes_path, "r") as f:  # context manager closes the file
+        class_names = json.load(f)
+    # 3) Preprocess a single image
+    def preprocess_image(path, img_size=224):
+        img = tf.io.read_file(path)
+        img = tf.image.decode_jpeg(img, channels=3)
+        img = tf.image.convert_image_dtype(img, tf.float32)
+        img = tf.image.resize(img, [img_size, img_size])
+        return tf.expand_dims(img, axis=0)  # add batch dim
+    x = preprocess_image("your_dog.jpg")
+    probs = model.predict(x)[0]
+    # 4) Top-5 predictions
+    top5 = probs.argsort()[-5:][::-1]
+    for idx in top5:
+        print(class_names[idx], float(probs[idx]))
+If you uploaded a TensorFlow SavedModel folder instead of an `.h5` file, download the whole folder (for example with `huggingface_hub.snapshot_download`) and pass the local directory path to `tf.keras.models.load_model(...)`.
+## Input Requirements
+- Input type: RGB images (JPG/PNG supported if decoded to RGB)
+- Image size: **224×224**
+- Value range: float32 normalized to **[0, 1]**
+- Output decoding must use the same class order used during training (`np.unique(labels)` order)
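A lightweight guard can catch inputs that violate these requirements before they reach `model.predict()`. The sketch below is illustrative only; `check_model_input` is a hypothetical helper, not part of the notebook.

```python
import numpy as np


def check_model_input(batch, img_size=224):
    """Validate shape, dtype, and value range of a batch of images."""
    batch = np.asarray(batch)
    if batch.ndim != 4 or batch.shape[1:] != (img_size, img_size, 3):
        raise ValueError(
            f"expected shape (N, {img_size}, {img_size}, 3), got {batch.shape}"
        )
    if batch.dtype != np.float32:
        raise ValueError(f"expected float32, got {batch.dtype}")
    if batch.size and (batch.min() < 0.0 or batch.max() > 1.0):
        raise ValueError("pixel values must lie in [0, 1]")
    return batch
```

A common failure this catches is passing raw `uint8` pixels in the 0 to 255 range, which the model would accept silently while producing poor predictions.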
+## Bias, Risks, and Limitations
+- Dataset bias: model is trained on a specific Kaggle dataset; results may not generalize to all real-world photos
+- Class ambiguity: many dog breeds look visually similar; mistakes are expected
+- Out-of-distribution risk: performance may drop significantly on unusual lighting, occlusions, non-dog animals, mixed breeds, or stylized images
+- Label-order dependency: wrong class mapping will produce incorrect breed names even if probabilities are correct
+## Environmental Impact
+Transfer learning with MobileNetV2 is relatively compute-efficient compared to training a CNN from scratch. Training can be done on GPU for speed, but overall footprint is modest for a model of this size.
+## Technical Specifications
+- Framework: TensorFlow 2.x / Keras
+- Base model: TF Hub MobileNetV2 (ImageNet pretrained)
+- Head: Dense softmax classifier (120 units)
+- Task: image-classification
+- Recommended runtime: CPU (inference) / GPU (training)
+## Model Card Authors
+- BrejBala
+## Contact
+For questions/feedback, please open an issue on the GitHub repository:
+https://github.com/brej-29/Logicmojo-AIML-Assignments-DogBreedClassificationTensorFlow