0xgr3y committed on
Commit
d11a9c0
·
verified ·
1 Parent(s): 5236f1b

Update README.md

Files changed (1)
  1. README.md +14 -12
README.md CHANGED
@@ -46,15 +46,17 @@ model-index:
46
  name: TTA Accuracy
47
  ---
48
 
49
- # Fine-Grained World Architecture Image Classification: A DenseNet121 Transfer Learning Approach with Layered Regularization
 
 
50
 
51
  ### Architectural Building Image Classifier
52
 
53
- Fine-Grained Visual Classification (FGVC) of world architectural buildings using CNN transfer learning with DenseNet121, enhanced with GeM Pooling, Focal Loss, Discriminative AdamW (LR), Stochastic Weight Averaging (SWA), Grad-CAM explainability, and calibration analysis.
54
 
55
  <table>
56
  <tr><td><strong>Architecture</strong></td><td>DenseNet121 + GeM Pooling + Focal Loss + SWA</td></tr>
57
- <tr><td><strong>Task</strong></td><td>Fine-Grained Visual Classification (FGVC)</td></tr>
58
  <tr><td><strong>Test Accuracy</strong></td><td>96.88%</td></tr>
59
  <tr><td><strong>Classes</strong></td><td>8 (Barn, Bridge, Castle, Mosque, Skyscraper, Stadium, Temple, Windmill)</td></tr>
60
  <tr><td><strong>Input Size</strong></td><td>320 × 320 pixels</td></tr>
@@ -69,7 +71,7 @@ A fine-grained image classification model for world architectural buildings. Bui
69
 
70
  **Key architectural contributions:**
71
 
72
- - **GeM Pooling** (Radenovic et al., CVPR 2018) — replaces global average pooling with a learnable power parameter (p=3.0) that emphasizes high-activation features, yielding stronger discriminative representations for FGVC tasks
73
  - **Focal Loss** (Lin et al., ICCV 2017, gamma=2.0) — down-weights well-classified examples to focus gradient updates on hard-to-classify building pairs
74
  - **DiscriminativeAdamW** — extends AdamW with per-layer learning rate multipliers: conv4_block receives LR × 0.1 (pretrained features require smaller updates), while conv5_block and the custom head receive LR × 1.0
75
  - **SWA with BN re-estimation** (Izmailov et al., UAI 2018) — 10-epoch post-training weight averaging with constant LR 1e-4, followed by 100-step batch normalization statistics re-estimation
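The per-layer learning rates and the weight averaging described in the last two bullets can be sketched in plain Python. This is a minimal illustration, not the repository's implementation: `layer_lr`, `swa_average`, the example layer names, and the base learning rate are all hypothetical, and batch normalization re-estimation is not shown.

```python
def layer_lr(layer_name, base_lr):
    """Effective learning rate for a layer, chosen by name prefix.

    Assumption: layers are identified by a name prefix such as
    "conv4_block"; everything else (conv5_block, custom head) gets 1.0.
    """
    multiplier = 0.1 if layer_name.startswith("conv4_block") else 1.0
    return base_lr * multiplier


def swa_average(snapshots):
    """Element-wise mean of equally weighted weight snapshots.

    Each snapshot is a flat list of floats standing in for the model
    weights captured at the end of one SWA epoch.
    """
    n = len(snapshots)
    return [sum(ws) / n for ws in zip(*snapshots)]


base_lr = 1e-3  # hypothetical value, used only for the demonstration
for name in ("conv4_block1_1_conv", "conv5_block1_1_conv", "head_dense"):
    print(name, layer_lr(name, base_lr))

print(swa_average([[1.0, 2.0], [3.0, 4.0]]))
```

In a real run the averaged weights would be loaded back into the model before the batch normalization statistics are re-estimated on training batches.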
@@ -192,9 +194,9 @@ Four candidate models were evaluated on the validation set:
192
 
193
  ## Training Details
194
 
195
- ### Training Strategy
196
 
197
- Two-phase progressive training with SWA post-processing:
198
 
199
  | Phase | Description | Backbone | Optimizer | LR | Max Epochs | Actual Epochs | CutMix+Mixup | FocalLoss LS |
200
  |-------|-------------|----------|-----------|-----|-----------|---------------|---------------|-------------|
@@ -249,7 +251,7 @@ Two-phase progressive training with SWA post-processing:
249
 
250
  ### Dataset
251
 
252
- [0xgr3y/arch-building-dataset](https://huggingface.co/datasets/0xgr3y/arch-building-dataset) — 13,440 images (8 classes × 1,680, balanced) sourced from Pexels with perceptual (pHash) and exact (SHA256) deduplication.
253
 
254
  | Split | Images | Percentage |
255
  |-------|--------|------------|
@@ -298,7 +300,7 @@ Two-phase progressive training with SWA post-processing:
298
 
299
  ### Gradio Space
300
 
301
- Try the live inference: [arch-building-classifier Space](https://huggingface.co/spaces/0xgr3y/arch-building-classifier)
302
 
303
  ### Python — Keras
304
 
@@ -486,7 +488,7 @@ print(f"Predicted: {LABELS[np.argmax(preds)]} ({np.max(preds)*100:.1f}%)")
486
 
487
  - Architectural style classification from building photographs
488
  - Educational tool for architecture recognition
489
- - Research baseline for fine-grained visual classification (FGVC)
490
  - Transfer learning experiments on architectural imagery
491
 
492
  ## Limitations
@@ -510,8 +512,8 @@ print(f"Predicted: {LABELS[np.argmax(preds)]} ({np.max(preds)*100:.1f}%)")
510
  ## Links
511
 
512
  - **Gradio Space (Live):** [arch-building-classifier Space](https://huggingface.co/spaces/0xgr3y/arch-building-classifier)
513
- - **Dataset:** [0xgr3y/arch-building-dataset](https://huggingface.co/datasets/0xgr3y/arch-building-dataset)
514
- - **GitHub:** [arcxteam/arch-building-classifier](https://github.com/arcxteam/arch-building-classifier)
515
 
516
  ## References
517
 
@@ -542,7 +544,7 @@ print(f"Predicted: {LABELS[np.argmax(preds)]} ({np.max(preds)*100:.1f}%)")
542
 
543
  ```bibtex
544
  @misc{saugani2026_arch_building,
545
- title={Fine-Grained World Architecture Image Classification:
546
  A DenseNet121 Transfer Learning Approach with Layered Regularization},
547
  author={Saugani},
548
  year={2026},
 
46
  name: TTA Accuracy
47
  ---
48
 
49
+ ![Arch-Building-Image-Classification](results/greyscope-labs-arch-building-classification-hunggingface-cnn-densenet121.png)
50
+
51
+ # Fine-Grained Image Classification of World Architecture: A DenseNet121 Transfer Learning Approach with Layered Regularization
52
 
53
  ### Architectural Building Image Classifier
54
 
55
+ Fine-Grained Image Classification (FGIC) of world architectural buildings using CNN transfer learning with DenseNet121, enhanced with GeM Pooling, Focal Loss, Discriminative AdamW (LR), Stochastic Weight Averaging (SWA), Grad-CAM explainability, and calibration analysis.
56
 
57
  <table>
58
  <tr><td><strong>Architecture</strong></td><td>DenseNet121 + GeM Pooling + Focal Loss + SWA</td></tr>
59
+ <tr><td><strong>Task</strong></td><td>Fine-Grained Image Classification (FGIC)</td></tr>
60
  <tr><td><strong>Test Accuracy</strong></td><td>96.88%</td></tr>
61
  <tr><td><strong>Classes</strong></td><td>8 (Barn, Bridge, Castle, Mosque, Skyscraper, Stadium, Temple, Windmill)</td></tr>
62
  <tr><td><strong>Input Size</strong></td><td>320 × 320 pixels</td></tr>
 
71
 
72
  **Key architectural contributions:**
73
 
74
+ - **GeM Pooling** (Radenovic et al., CVPR 2018) — replaces global average pooling with a learnable power parameter (p=3.0) that emphasizes high-activation features, yielding stronger discriminative representations for FGIC tasks
75
  - **Focal Loss** (Lin et al., ICCV 2017, gamma=2.0) — down-weights well-classified examples to focus gradient updates on hard-to-classify building pairs
76
  - **DiscriminativeAdamW** — extends AdamW with per-layer learning rate multipliers: conv4_block receives LR × 0.1 (pretrained features require smaller updates), while conv5_block and the custom head receive LR × 1.0
77
  - **SWA with BN re-estimation** (Izmailov et al., UAI 2018) — 10-epoch post-training weight averaging with constant LR 1e-4, followed by 100-step batch normalization statistics re-estimation
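To make the first two bullets concrete, here is a minimal pure-Python sketch of generalized-mean (GeM) pooling over one channel and of the focal loss for a single example. The function names and the sample activations are illustrative assumptions; the model itself applies these operations to tensors inside Keras, and its p is learnable rather than fixed.

```python
import math


def gem_pool(values, p=3.0, eps=1e-6):
    """Generalized-mean pooling over one channel's spatial activations.

    p = 1 reduces to average pooling; larger p moves the result
    toward the maximum activation.
    """
    clamped = [max(v, eps) for v in values]  # keep the power well defined
    return (sum(v ** p for v in clamped) / len(clamped)) ** (1.0 / p)


def focal_loss(p_true, gamma=2.0):
    """Focal loss for one example, given the predicted probability
    of its true class: -(1 - p_t)^gamma * log(p_t)."""
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


channel = [0.1, 0.2, 0.1, 2.0]  # made-up activations with one strong peak
print("average:", sum(channel) / len(channel))
print("GeM p=3:", gem_pool(channel, p=3.0))
print("max:    ", max(channel))

print("easy example (p_t=0.9):", focal_loss(0.9))
print("hard example (p_t=0.3):", focal_loss(0.3))
```

With p=3.0 the pooled value sits between the average and the maximum, and with gamma=2.0 the confidently classified example contributes far less loss than the hard one.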
 
194
 
195
  ## Training Details
196
 
197
+ ### Progressive Training
198
 
199
+ Two-phase training strategy with SWA post-processing:
200
 
201
  | Phase | Description | Backbone | Optimizer | LR | Max Epochs | Actual Epochs | CutMix+Mixup | FocalLoss LS |
202
  |-------|-------------|----------|-----------|-----|-----------|---------------|---------------|-------------|
 
251
 
252
  ### Dataset
253
 
254
+ See the curated dataset, [World Architectural Buildings Dataset for Multi‑Class Image Classification](https://huggingface.co/datasets/0xgr3y/arch-building-dataset), which contains 13,440 images (8 classes × 1,680, balanced) sourced from Pexels with perceptual (pHash) and exact (SHA256) deduplication.
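As an illustration of the exact-duplicate step, the sketch below drops byte-identical images by comparing SHA256 digests. It is an assumption about how such a filter could look, not the dataset's actual pipeline: `dedupe_exact` is a hypothetical helper, and perceptual (pHash) deduplication, which needs an image-hashing library, is not covered.

```python
import hashlib


def dedupe_exact(blobs):
    """Return the inputs with byte-identical duplicates removed.

    The first occurrence of each SHA256 digest is kept, so the
    original ordering of the unique items is preserved.
    """
    seen = set()
    unique = []
    for data in blobs:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(data)
    return unique


# Stand-in byte strings; real input would be the raw image file contents.
images = [b"barn-001", b"castle-001", b"barn-001"]
print(len(dedupe_exact(images)), "unique of", len(images))

# Sanity check on the stated class balance.
print(8 * 1680)
```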
255
 
256
  | Split | Images | Percentage |
257
  |-------|--------|------------|
 
300
 
301
  ### Gradio Space
302
 
303
+ Try the live building classifier: [Architecture Building Image Classifier Space](https://huggingface.co/spaces/0xgr3y/arch-building-classifier)
304
 
305
  ### Python — Keras
306
 
 
488
 
489
  - Architectural style classification from building photographs
490
  - Educational tool for architecture recognition
491
+ - Research baseline for fine-grained image classification (FGIC)
492
  - Transfer learning experiments on architectural imagery
493
 
494
  ## Limitations
 
512
  ## Links
513
 
514
  - **Gradio Space (Live):** [arch-building-classifier Space](https://huggingface.co/spaces/0xgr3y/arch-building-classifier)
515
+ - **Dataset Studio:** [0xgr3y/arch-building-dataset](https://huggingface.co/datasets/0xgr3y/arch-building-dataset)
516
+ - **GitHub Repo:** [arcxteam/arch-building-classifier](https://github.com/arcxteam/arch-building-classifier)
517
 
518
  ## References
519
 
 
544
 
545
  ```bibtex
546
  @misc{saugani2026_arch_building,
547
+ title={Fine-Grained Image Classification of World Architecture:
548
  A DenseNet121 Transfer Learning Approach with Layered Regularization},
549
  author={Saugani},
550
  year={2026},