0xgr3y/Arch-Building-Image-Classification

@@ -2,10 +2,197 @@
 license: apache-2.0
 pipeline_tag: image-classification
 tags:
-- TensorFlow,
-- feature-extraction,
-- densenet121,
-- architectural,
-- building,
-- CNN,
----

+  - tensorflow
+  - keras
+  - image-classification
+  - densenet121
+  - architecture
+  - building
+  - cnn
+  - fgvc
+  - transfer-learning
+  - gem-pooling
+  - swa
+library_name: keras
+language: en
+datasets:
+  - Saugani/arch-building-dataset
+widget:
+  - structure:
+      src: https://cdn-uploads.huggingface.co/production/uploads/66cdac913f233bf2c7b4f590/HzXxNze2jmCkV5KPY_fpQ.png
+      example_title: Bridge Classification
+---
+# Arch Building Image Classifier
+<table>
+  <tr>
+    <td><strong>Architecture</strong></td>
+    <td>DenseNet121 + GeM Pooling (p=3.0) + SWA</td>
+  </tr>
+  <tr>
+    <td><strong>Task</strong></td>
+    <td>Fine-Grained Visual Categorization (FGVC)</td>
+  </tr>
+  <tr>
+    <td><strong>Test Accuracy</strong></td>
+    <td>96.23% (970/1,008)</td>
+  </tr>
+  <tr>
+    <td><strong>Classes</strong></td>
+    <td>6 (Bridge, Castle, Mosque, Skyscraper, Stadium, Temple)</td>
+  </tr>
+  <tr>
+    <td><strong>Input Size</strong></td>
+    <td>320 x 320 pixels</td>
+  </tr>
+  <tr>
+    <td><strong>Framework</strong></td>
+    <td>TensorFlow / Keras 3</td>
+  </tr>
+  <tr>
+    <td><strong>License</strong></td>
+    <td><a href="https://www.apache.org/licenses/LICENSE-2.0">Apache-2.0</a></td>
+  </tr>
+</table>
+## Model Description
+A fine-grained image classification model for six categories of buildings from around the world. It uses a DenseNet121 backbone pretrained on ImageNet with GeM pooling (generalized mean pooling with a learnable exponent), and is trained with Focal Loss and Stochastic Weight Averaging (SWA).
+**Key components:**
+- **GeM Pooling (p=3.0):** replaces global average pooling with a generalized mean whose exponent p is learnable (initialized to 3.0); it weights strong activations more heavily, which is intended to help fine-grained discrimination
+- **Focal Loss (gamma=2.0):** down-weights easy examples so that training focuses on hard-to-classify building pairs
+- **DiscriminativeAdamW:** a custom optimizer that applies per-layer learning rate multipliers to the backbone layers
+- **SWA with BN re-estimation:** averages weights across epochs for improved generalization, then re-estimates the BatchNormalization statistics for the averaged weights
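+
+For reference, the formulas behind the GeM pooling and Focal Loss items can be written out as follows. This is a sketch in standard notation: the GeM expression mirrors the `GeMPooling` layer shown in the Usage section, while the focal loss is the common unweighted form and is an assumption, since the training-time `FocalLoss` definition (including any class weighting) is not shown in this card.
+
+$$
+\mathrm{GeM}_c(X) = \left( \frac{1}{HW} \sum_{i=1}^{H} \sum_{j=1}^{W} \max(x_{ijc}, \epsilon)^{p} \right)^{1/p}, \qquad p = 3.0 \text{ (initial value)}, \quad \epsilon = 10^{-6}
+$$
+
+$$
+\mathrm{FL}(p_t) = -(1 - p_t)^{\gamma} \log(p_t), \qquad \gamma = 2.0
+$$
+
+Here X is an H x W x C feature map, c indexes channels, and p_t is the predicted probability of the true class. With p = 1 GeM reduces to global average pooling, and as p grows it approaches global max pooling. With gamma = 0 the focal loss is ordinary cross-entropy.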
+## Architecture
+```
+Input (320, 320, 3)
+  |
+  DenseNet121 (ImageNet, 8M params)
+  |
+  Conv2D(256, 3x3, ReLU, padding=same)
+  BatchNormalization
+  MaxPooling2D(2x2)
+  |
+  GeM Pooling(p=3.0, eps=1e-6, learnable)
+  |
+  Dense(256, ReLU)
+  BatchNormalization
+  Dropout(0.4)
+  |
+  Dense(6, Softmax)
+  |
+Output (6 classes)
+```
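+The spatial sizes implied by the diagram can be traced by hand. The sketch below does that arithmetic in plain Python. It assumes DenseNet121 is used without its classification head (total stride 32, 1024 output channels) and that `MaxPooling2D` uses the Keras default stride of 2 with `valid` padding; neither is stated explicitly in this card. The `trace_shapes` helper is hypothetical.
+```python
+def trace_shapes(input_size=320, num_classes=6):
+    """Trace the per-image tensor shape through the stack in the diagram."""
+    shapes = [("Input", (input_size, input_size, 3))]
+    # Assumption: headless DenseNet121 downsamples by 32 and emits 1024 channels.
+    side = input_size // 32
+    shapes.append(("DenseNet121", (side, side, 1024)))
+    # padding=same with stride 1 keeps the spatial size; only channels change.
+    shapes.append(("Conv2D(256, 3x3)", (side, side, 256)))
+    # Assumption: MaxPooling2D(2x2) uses stride 2 and valid padding.
+    side //= 2
+    shapes.append(("MaxPooling2D(2x2)", (side, side, 256)))
+    # GeM pooling collapses both spatial axes into one value per channel.
+    shapes.append(("GeM Pooling", (256,)))
+    shapes.append(("Dense(256)", (256,)))
+    shapes.append((f"Dense({num_classes})", (num_classes,)))
+    return shapes
+
+for name, shape in trace_shapes():
+    print(f"{name:<20} {shape}")
+```
+Under these assumptions, GeM pooling would operate on a 5 x 5 x 256 feature map.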
+## Performance
+| Metric | Value |
+|--------|-------|
+| Test Accuracy | **96.23%** |
+| Validation Accuracy (SWA) | **95.93%** |
+| Test-Time Augmentation | **96.33%** |
+| Overfitting Gap | 3.22% |
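+
+The card does not say which augmentations produce the TTA figure. As an illustration only, the sketch below averages predictions over each image and its horizontal mirror, which is one common choice and an assumption here. `predict_with_tta` and `predict_fn` are hypothetical names; `predict_fn` stands for any callable that maps an NHWC batch to class probabilities.
+```python
+import numpy as np
+
+def predict_with_tta(predict_fn, batch):
+    """Average class probabilities over the original and mirrored images."""
+    batch = np.asarray(batch)
+    # NHWC layout: axis 2 is the width, so reversing it mirrors each image.
+    views = [batch, batch[:, :, ::-1, :]]
+    probs = [np.asarray(predict_fn(view)) for view in views]
+    return np.mean(probs, axis=0)
+```
+With the objects from the Usage section, a call could look like `predict_with_tta(lambda b: model.predict(b, verbose=0), arr)[0]`.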
+### Per-Class Results
+| Class | F1 Score | Recall |
+|-------|----------|--------|
+| Bridge | 95.29% | 96.43% |
+| Castle | 97.92% | 98.21% |
+| Mosque | 95.93% | 98.21% |
+| Skyscraper | 97.95% | 99.40% |
+| Stadium | 94.12% | 90.48% |
+| Temple | 96.07% | 94.64% |
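+
+The per-class table can be cross-checked against the overall accuracy with a little arithmetic. The sketch below assumes the 1,008-image test split is class-balanced (168 images per class), which the card implies but does not state. It recovers the number of correct predictions per class from recall and derives precision from F1 and recall. The helper names are hypothetical.
+```python
+# (F1, recall) in percent, copied from the per-class table above.
+METRICS = {
+    "Bridge": (95.29, 96.43),
+    "Castle": (97.92, 98.21),
+    "Mosque": (95.93, 98.21),
+    "Skyscraper": (97.95, 99.40),
+    "Stadium": (94.12, 90.48),
+    "Temple": (96.07, 94.64),
+}
+TEST_IMAGES = 1008
+# Assumption: the test split is class-balanced (1008 / 6 = 168 per class).
+PER_CLASS = TEST_IMAGES // len(METRICS)
+
+def correct_count(recall_pct, support=PER_CLASS):
+    """Number of correctly classified images implied by a recall value."""
+    return round(recall_pct / 100 * support)
+
+def precision_pct(f1_pct, recall_pct):
+    """Solve F1 = 2PR / (P + R) for precision P."""
+    return f1_pct * recall_pct / (2 * recall_pct - f1_pct)
+
+total_correct = sum(correct_count(recall) for _, recall in METRICS.values())
+print(total_correct, f"{total_correct / TEST_IMAGES:.2%}")  # 970 96.23%
+
+for name, (f1, recall) in METRICS.items():
+    print(f"{name:<11} correct={correct_count(recall):>3} "
+          f"precision~{precision_pct(f1, recall):.2f}%")
+```
+The recovered counts add up to the 970 correct predictions reported in the summary table, which is consistent with the balanced-split assumption. The derived precision values are approximate because the table rounds F1 and recall to two decimals.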
+## Training
+- **Dataset:** 10,080 images (1,680 per class, balanced) from Pexels
+- **Split:** 80/10/10 train/val/test (8,064 / 1,008 / 1,008 images), seed=42
+- **Phase 1:** Feature extraction with AdamW (LR=0.001) and CutMix + Mixup augmentation
+- **Phase 2:** Selective fine-tuning of the conv4 and conv5 blocks with DiscriminativeAdamW
+- **Post-training:** SWA over 5 epochs, followed by BN re-estimation
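+
+The card does not include the SWA code. A minimal sketch of the averaging step, assuming an equal-weight mean over one weight snapshot per SWA epoch (for example, lists returned by `model.get_weights()`), might look like this. `swa_average` is a hypothetical helper, not the author's implementation.
+```python
+import numpy as np
+
+def swa_average(snapshots):
+    """Equal-weight average of per-layer weights over a list of snapshots."""
+    # Start from a float64 copy of the first snapshot.
+    averaged = [np.array(w, dtype=np.float64) for w in snapshots[0]]
+    for n, snapshot in enumerate(snapshots[1:], start=1):
+        for avg, w in zip(averaged, snapshot):
+            # Running mean update: avg_new = avg + (w - avg) / (n + 1)
+            avg += (np.asarray(w, dtype=np.float64) - avg) / (n + 1)
+    return averaged
+
+# Toy example: five snapshots of a single two-element weight tensor.
+toy = [[np.full((2,), float(k))] for k in (1, 2, 3, 4, 5)]
+print(swa_average(toy)[0])
+```
+After the averaged weights are loaded into the model, the BatchNormalization moving statistics no longer match them, which is why a BN re-estimation step follows: the usual approach is to pass training data through the model in training mode without updating the weights.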
+## Files
+| File | Description |
+|------|-------------|
+| `best_phase2_swa.keras` | Best model: SWA-averaged weights (val_acc=95.93%) |
+| `best_phase2.keras` | Phase 2 checkpoint (val_acc=93.35%) |
+| `config.json` | Full model configuration and evaluation metrics |
+| `label_mapping.json` | Class name <-> ID mapping |
+| `preprocessor_config.json` | Input preprocessing specification |
+## Usage
+### Gradio Demo
+Try the live demo: [arch-building-classifier Space](https://huggingface.co/spaces/0xgr3y/arch-building-classifier)
+### Python
+```python
+from huggingface_hub import hf_hub_download
+import tensorflow as tf
+from tensorflow.keras.applications.densenet import preprocess_input
+from tensorflow.keras.layers import Layer
+from PIL import Image
+import numpy as np
+
+# Class names in index order. Assumption: alphabetical order, as listed in the
+# summary table; verify against label_mapping.json before relying on it.
+LABELS = ["Bridge", "Castle", "Mosque", "Skyscraper", "Stadium", "Temple"]
+
+# --- Custom layers (must match training definition) ---
+class GeMPooling(Layer):
+    def __init__(self, p=3.0, eps=1e-6, **kwargs):
+        super().__init__(**kwargs)
+        self.p_init = p
+        self.eps = eps
+    def build(self, input_shape):
+        # Learnable exponent, initialized to p_init.
+        self.p = self.add_weight(name="gem_p", shape=(), dtype=tf.float32,
+            initializer=tf.keras.initializers.Constant(self.p_init), trainable=True)
+        super().build(input_shape)
+    def call(self, x):
+        # Clamp to at least eps so the power is taken of positive values only.
+        x = tf.clip_by_value(x, self.eps, tf.reduce_max(x))
+        x = tf.pow(x, self.p)
+        x = tf.reduce_mean(x, axis=[1, 2], keepdims=False)
+        return tf.pow(x, 1.0 / self.p)
+    def get_config(self):
+        return {**super().get_config(), "p": self.p_init, "eps": self.eps}
+
+# FocalLoss and DiscriminativeAdamW belong to the compile step, so they should
+# not be needed for inference when the model is loaded with compile=False.
+custom_objects = {"GeMPooling": GeMPooling}
+model_path = hf_hub_download("0xgr3y/Arch-Building-Image-Classification", "best_phase2_swa.keras")
+model = tf.keras.models.load_model(model_path, custom_objects=custom_objects, compile=False)
+
+# Preprocess: RGB, 320x320, DenseNet normalization, then add a batch dimension.
+img = Image.open("building.jpg").convert("RGB").resize((320, 320))
+arr = np.expand_dims(preprocess_input(np.array(img, dtype=np.float32)), axis=0)
+preds = model.predict(arr, verbose=0)[0]
+print(f"Predicted: {LABELS[np.argmax(preds)]} ({np.max(preds)*100:.1f}%)")
+```
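+Because some classes are visually close, it can help to look at more than the top prediction. The following sketch ranks the softmax output and returns the k most likely classes. `top_k` is a hypothetical helper, and the class order in `CLASS_NAMES` is an assumption (alphabetical, as in the summary table) that should be checked against `label_mapping.json`.
+```python
+import numpy as np
+
+# Assumption: index order matches the alphabetical class list in this card.
+CLASS_NAMES = ["Bridge", "Castle", "Mosque", "Skyscraper", "Stadium", "Temple"]
+
+def top_k(probs, k=3, class_names=CLASS_NAMES):
+    """Return the k most probable (class name, probability) pairs."""
+    probs = np.asarray(probs)
+    # argsort is ascending, so reverse it and keep the first k indices.
+    order = np.argsort(probs)[::-1][:k]
+    return [(class_names[i], float(probs[i])) for i in order]
+
+# Example with a made-up probability vector.
+for name, prob in top_k([0.05, 0.10, 0.60, 0.05, 0.05, 0.15], k=2):
+    print(f"{name}: {prob:.1%}")
+```
+With the objects from the example above, `top_k(preds)` would list the three most likely classes for `building.jpg`.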
+## Intended Use
+- Building type classification (Bridge, Castle, Mosque, Skyscraper, Stadium, Temple) from photographs
+- Educational tool for architecture recognition
+- Research baseline for fine-grained visual categorization
+## Limitations
+- Trained on Pexels stock photography, so it may perform differently on user-generated photos
+- Limited to 6 building classes
+- Stadium has the lowest recall (90.48%) in the per-class results; Temple images are often misclassified as Mosque
+## Citation
+```bibtex
+@misc{saugani2024_arch_building,
+  title={Architecture Building Image Classifier: FGVC with DenseNet121 + GeM Pooling + SWA},
+  author={Saugani},
+  year={2024},
+  publisher={Hugging Face},
+  url={https://huggingface.co/0xgr3y/Arch-Building-Image-Classification}
+}
+```