Add files using upload-large-folder tool

Files changed (4)

README.md +203 -3
label_encoder.joblib +3 -0
linearsvc.joblib +3 -0
platt_calibrator.joblib +3 -0

README.md CHANGED

@@ -1,3 +1,203 @@
----
-license: mit
----

+---
+license: mit
+pipeline_tag: image-classification
+base_model: openai/clip-vit-base-patch32
+tags:
+  - architecture
+  - buildings
+  - art-history
+  - clip
+  - sklearn
+  - image-classification
+datasets:
+  - axel-riben/arcdataset-brutalism-extension
+metrics:
+  - accuracy
+  - f1
+model-index:
+  - name: clip-arch-classifier
+    results:
+      - task:
+          type: image-classification
+        dataset:
+          name: Architectural Styles Dataset (Curated and Extended)
+          type: axel-riben/arcdataset-brutalism-extension
+        metrics:
+          - type: accuracy
+            value: 0.7616
+            name: Top-1 Accuracy
+          - type: accuracy
+            value: 0.9261
+            name: Top-3 Accuracy
+          - type: f1
+            value: 0.7577
+            name: Macro F1
+---
+# clip-arch-classifier
+Architectural style image classifier built on frozen [CLIP ViT-B/32](https://huggingface.co/openai/clip-vit-base-patch32) embeddings.
+Classifies exterior building photographs into **26 architectural styles**. The classifier is a LinearSVC fitted on 512-dim L2-normalised CLIP image embeddings, with a Platt calibrator (logistic regression) on top to produce interpretable probabilities.
+---
+## Model description
+| Component | Detail |
+|---|---|
+| Feature extractor | CLIP ViT-B/32 (`openai/clip-vit-base-patch32`) — frozen |
+| Embedding dim | 512, L2-normalised |
+| Classifier | `sklearn.svm.LinearSVC` (C=1, balanced class weights) |
+| Calibration | Platt scaling — `sklearn.linear_model.LogisticRegression` fitted on val-set decision scores |
+| Training date | 2026-05-08 |
+| Random seed | 42 |
+### Files
+| File | Description |
+|---|---|
+| `linearsvc.joblib` | Fitted LinearSVC |
+| `label_encoder.joblib` | sklearn LabelEncoder (integer ↔ class name) |
+| `platt_calibrator.joblib` | Platt calibrator — use this for `predict_proba` |
+---
+## Training data
+Trained on the [Architectural Styles Dataset (Curated and Extended)](https://huggingface.co/datasets/axel-riben/arcdataset-brutalism-extension): 9,767 images across 26 classes, split 70/15/15 train/val/test (stratified, seed 42).
+The 26 classes are: Achaemenid, American Craftsman, American Foursquare, Ancient Egyptian, Art Deco, Art Nouveau, Baroque, Bauhaus, Beaux-Arts, Brutalism, Byzantine, Chicago school, Colonial, Deconstructivism, Edwardian, Georgian, Gothic, Greek Revival, International style, Novelty, Palladian, Postmodern, Queen Anne, Romanesque, Russian Revival, Tudor Revival.
+---
+## Evaluation
+**Test set: 1,489 images (held-out, never seen during training or calibration)**
+| Metric | Value |
+|---|---|
+| Top-1 accuracy | 0.7616 |
+| Top-3 accuracy | 0.9261 |
+| Top-5 accuracy | 0.9664 |
+| Macro F1 | 0.7577 |
+| Weighted F1 | 0.7582 |
+### Per-class F1 (test set)
+| Class | F1 | Support |
+|---|---|---|
+| Ancient Egyptian architecture | 0.952 | 53 |
+| Achaemenid architecture | 0.938 | 55 |
+| Novelty architecture | 0.920 | 54 |
+| Gothic architecture | 0.915 | 47 |
+| Deconstructivism | 0.872 | 44 |
+| Brutalism architecture | 0.867 | 44 |
+| Russian Revival architecture | 0.844 | 49 |
+| Chicago school architecture | 0.824 | 39 |
+| Art Nouveau architecture | 0.813 | 90 |
+| Romanesque architecture | 0.805 | 44 |
+| Byzantine architecture | 0.795 | 45 |
+| Queen Anne architecture | 0.793 | 107 |
+| Greek Revival architecture | 0.776 | 76 |
+| Tudor Revival architecture | 0.776 | 65 |
+| Art Deco architecture | 0.764 | 83 |
+| Baroque architecture | 0.740 | 66 |
+| American Foursquare architecture | 0.732 | 53 |
+| American craftsman style | 0.698 | 52 |
+| Postmodern architecture | 0.674 | 47 |
+| Bauhaus architecture | 0.674 | 45 |
+| Beaux-Arts architecture | 0.650 | 61 |
+| Georgian architecture | 0.634 | 53 |
+| Colonial architecture | 0.610 | 68 |
+| International style | 0.561 | 59 |
+| Palladian architecture | 0.547 | 49 |
+| Edwardian architecture | 0.526 | 41 |
+### Most-confused pairs
+| True class | Predicted as | Confusion rate |
+|---|---|---|
+| International style | Bauhaus architecture | 27.1 % |
+| Postmodern architecture | International style | 17.0 % |
+| American craftsman style | American Foursquare architecture | 15.4 % |
+| Palladian architecture | Greek Revival architecture | 14.3 % |
+| Byzantine architecture | Russian Revival architecture | 13.3 % |
+---
+## Intended use
+- Classifying exterior building photographs by architectural style
+- Educational and research use in architectural history and computer vision
+- Input to downstream retrieval or recommendation systems
+**Not intended for:**
+- Interior photographs, architectural renders, or drawings
+- Styles not in the 26-class vocabulary
+- High-stakes decisions without human review
+---
+## Limitations
+- **Weak classes:** Edwardian (F1 = 0.53), Palladian (0.55), and International style (0.56) are the least reliable; treat their predictions as soft signals
+- **Style overlap:** International ↔ Bauhaus and Postmodern ↔ International confusions reflect genuine art-historical ambiguity, not purely model error
+- **Geographic bias:** training data is heavily Western/European
+- **Modality:** trained exclusively on exterior photographs; performance on interiors and non-photographic images has not been evaluated
+- **Leakage caveat:** Ancient Egyptian and Novelty classes contain multiple photographs of the same landmark buildings; their F1 scores are likely slightly optimistic
+---
+## Usage
+```python
+import joblib
+import torch
+import torch.nn.functional as F
+from PIL import Image
+from transformers import CLIPModel, CLIPProcessor
+from huggingface_hub import hf_hub_download
+REPO_ID = "axel-riben/clip-arch-classifier"
+# Load CLIP
+processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
+clip      = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
+# Load classifier and calibrator
+svc   = joblib.load(hf_hub_download(REPO_ID, "linearsvc.joblib"))
+platt = joblib.load(hf_hub_download(REPO_ID, "platt_calibrator.joblib"))
+# Predict
+image  = Image.open("building.jpg").convert("RGB")
+inputs = processor(images=image, return_tensors="pt")
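+with torch.no_grad():
+    feats = clip.get_image_features(**inputs)
+# Guard for the case where get_image_features returns an output object instead of a tensor
+if not isinstance(feats, torch.Tensor):
+    feats = feats.pooler_output
+emb = F.normalize(feats, dim=-1).numpy()     # (1, 512), L2-normalised
+scores = svc.decision_function(emb)          # (1, 26)
+probs  = platt.predict_proba(scores)[0]      # (26,)
+# platt.classes_ holds the labels the calibrator was fitted on; if these turn out to be
+# integer ids, map them to class names with label_encoder.joblib (inverse_transform)
+top5 = sorted(zip(platt.classes_, probs), key=lambda x: -x[1])[:5]
+for label, prob in top5:
+    print(f"{prob:.3f}  {label}")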
+```
+---
+## Citation
+If you use this model, please also cite the original dataset:
+```
+Danci, Marian Dumitru/dumitrux. (n.d.). Architectural Styles Dataset [Data set].
+Kaggle. https://www.kaggle.com/datasets/dumitrux/architectural-styles-dataset
+```
+## License
+Code and model weights: MIT.
+Training data licences: see the [dataset card](https://huggingface.co/datasets/axel-riben/arcdataset-brutalism-extension).
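
The model card describes the classifier as a `LinearSVC` (C=1, balanced class weights) with Platt scaling: a `LogisticRegression` fitted on validation-set decision scores. The sketch below shows one way such a pipeline could be assembled on synthetic data. It is not the author's training script. The toy dimensions, the `make_split` helper, the noise level, and `max_iter=1000` are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import top_k_accuracy_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(42)
N_CLASSES, DIM = 4, 16  # toy stand-ins for 26 classes and 512-dim CLIP embeddings
centers = rng.standard_normal((N_CLASSES, DIM))


def make_split(n_per_class):
    """Hypothetical helper: noisy points around each class center, L2-normalised."""
    X = np.concatenate(
        [c + 0.3 * rng.standard_normal((n_per_class, DIM)) for c in centers]
    )
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    y = np.repeat(np.arange(N_CLASSES), n_per_class)
    return X, y


X_train, y_train = make_split(60)
X_val, y_val = make_split(30)
X_test, y_test = make_split(30)

# Step 1: linear SVM on the training embeddings
svc = LinearSVC(C=1.0, class_weight="balanced", random_state=42)
svc.fit(X_train, y_train)

# Step 2: Platt step, a logistic regression on validation-set decision scores
platt = LogisticRegression(max_iter=1000)
platt.fit(svc.decision_function(X_val), y_val)

# Step 3: calibrated probabilities and top-k accuracy on the held-out split
probs = platt.predict_proba(svc.decision_function(X_test))  # (n_test, N_CLASSES)
top1 = float((platt.classes_[probs.argmax(axis=1)] == y_test).mean())
top3 = top_k_accuracy_score(y_test, probs, k=3)
print(f"top-1 {top1:.3f}  top-3 {top3:.3f}")
```

Fitting the calibrator on a split the SVM never saw keeps the probability estimates from inheriting the SVM's overconfidence on its own training points, which is consistent with the card's 70/15/15 train/val/test split.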

label_encoder.joblib ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:71289bb21b66c5b853b20c8661676b8551e71695b96e530396738e5aa5f35319
+size 3655

linearsvc.joblib ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c30c19533232c45b6c3f26fd61c9e47a05dfb002e70acba2ec5f3e4217fa2df9
+size 107635

platt_calibrator.joblib ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:98d98d19a48f093c693215404459510991030a31b3caaa54ed1ce0ac66d1664b
+size 9767
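
Each `.joblib` entry above is a Git LFS pointer that records a SHA-256 digest and a byte size instead of the file contents. A minimal sketch of checking a downloaded file against such a pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the demo uses stand-in bytes rather than the real model files.

```python
import hashlib
import tempfile
from pathlib import Path


def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the size and digest the pointer records."""
    data = Path(path).read_bytes()
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# Pointer text for label_encoder.joblib, copied from the diff above
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:71289bb21b66c5b853b20c8661676b8551e71695b96e530396738e5aa5f35319
size 3655
"""
pointer = parse_lfs_pointer(POINTER_TEXT)

# Self-contained demo with stand-in bytes (not the real model file)
payload = b"stand-in for a joblib file"
demo_pointer = {
    "version": pointer["version"],
    "algo": "sha256",
    "digest": hashlib.sha256(payload).hexdigest(),
    "size": len(payload),
}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.joblib"
    demo_path.write_bytes(payload)
    demo_ok = matches_pointer(demo_path, demo_pointer)      # untouched file
    demo_path.write_bytes(payload + b"!")
    tampered_ok = matches_pointer(demo_path, demo_pointer)  # modified file
```

In practice the local path returned by `hf_hub_download` could be passed to `matches_pointer` together with the parsed pointer. That makes a reasonable sanity check before calling `joblib.load`, which unpickles its input and should only be run on files from a trusted source.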