Commit f84e46a (verified) by axel-riben · 1 parent: 6fa2775

Add files using upload-large-folder tool

Files changed (4)
  1. README.md +203 -3
  2. label_encoder.joblib +3 -0
  3. linearsvc.joblib +3 -0
  4. platt_calibrator.joblib +3 -0
README.md CHANGED
@@ -1,3 +1,203 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ pipeline_tag: image-classification
+ base_model: openai/clip-vit-base-patch32
+ tags:
+ - architecture
+ - buildings
+ - art-history
+ - clip
+ - sklearn
+ - image-classification
+ datasets:
+ - axel-riben/arcdataset-brutalism-extension
+ metrics:
+ - accuracy
+ - f1
+ model-index:
+ - name: clip-arch-classifier
+   results:
+   - task:
+       type: image-classification
+     dataset:
+       name: Architectural Styles Dataset (Curated and Extended)
+       type: axel-riben/arcdataset-brutalism-extension
+     metrics:
+     - type: accuracy
+       value: 0.7616
+       name: Top-1 Accuracy
+     - type: accuracy
+       value: 0.9261
+       name: Top-3 Accuracy
+     - type: f1
+       value: 0.7577
+       name: Macro F1
+ ---
+
+ # clip-arch-classifier
+
+ Architectural style image classifier built on frozen [CLIP ViT-B/32](https://huggingface.co/openai/clip-vit-base-patch32) embeddings.
+
+ Classifies exterior building photographs into **26 architectural styles**. The classifier is a LinearSVC fitted on 512-dim L2-normalised CLIP image embeddings, with a Platt calibrator (logistic regression) on top to produce interpretable probabilities.
+
+ ---
+
+ ## Model description
+
+ | Component | Detail |
+ |---|---|
+ | Feature extractor | CLIP ViT-B/32 (`openai/clip-vit-base-patch32`), frozen |
+ | Embedding dim | 512, L2-normalised |
+ | Classifier | `sklearn.svm.LinearSVC` (C=1, balanced class weights) |
+ | Calibration | Platt scaling: `sklearn.linear_model.LogisticRegression` fitted on val-set decision scores |
+ | Training date | 2026-05-08 |
+ | Random seed | 42 |
+
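The table above can be read as a two-stage recipe: fit the linear SVM on training embeddings, then fit a logistic regression on the SVM's validation-set decision scores. A minimal sketch of that recipe follows, using synthetic 512-dim vectors in place of real CLIP embeddings. The toy data, the 4-class setup, `max_iter`, and the use of seed 42 as `random_state` are assumptions for illustration; the actual training script is not part of this commit.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

rng = np.random.default_rng(42)
n_classes, dim = 4, 512  # toy setup; the real model has 26 classes

# Hypothetical stand-in for CLIP embeddings: one random centre per class.
centres = rng.normal(size=(n_classes, dim))

def toy_embeddings(n_per_class):
    """Return L2-normalised toy embeddings and their integer labels."""
    y = np.repeat(np.arange(n_classes), n_per_class)
    X = centres[y] + 0.5 * rng.normal(size=(y.size, dim))
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    return X, y

X_train, y_train = toy_embeddings(40)
X_val, y_val = toy_embeddings(20)

# Stage 1: linear SVM with C=1 and balanced class weights, as in the table.
svc = LinearSVC(C=1.0, class_weight="balanced", random_state=42)
svc.fit(X_train, y_train)

# Stage 2: Platt-style calibration, a logistic regression fitted on the
# validation-set decision scores rather than on the embeddings themselves.
platt = LogisticRegression(max_iter=1000)
platt.fit(svc.decision_function(X_val), y_val)

# Calibrated probabilities for a few samples, one row per sample.
probs = platt.predict_proba(svc.decision_function(X_val[:3]))
print(probs.round(3))
```

Fitting the calibrator on held-out scores rather than on the training scores is what keeps the probabilities from inheriting the SVM's overconfidence on data it has already seen.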
+ ### Files
+
+ | File | Description |
+ |---|---|
+ | `linearsvc.joblib` | Fitted LinearSVC |
+ | `label_encoder.joblib` | sklearn LabelEncoder (integer ↔ class name) |
+ | `platt_calibrator.joblib` | Platt calibrator; use this for `predict_proba` |
+
+ ---
+
+ ## Training data
+
+ Trained on the [Architectural Styles Dataset (Curated and Extended)](https://huggingface.co/datasets/axel-riben/arcdataset-brutalism-extension): 9,767 images across 26 classes, split 70/15/15 train/val/test (stratified, seed 42).
+
+ The 26 classes are: Achaemenid, American Craftsman, American Foursquare, Ancient Egyptian, Art Deco, Art Nouveau, Baroque, Bauhaus, Beaux-Arts, Brutalism, Byzantine, Chicago school, Colonial, Deconstructivism, Edwardian, Georgian, Gothic, Greek Revival, International style, Novelty, Palladian, Postmodern, Queen Anne, Romanesque, Russian Revival, Tudor Revival.
+
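A stratified 70/15/15 split keeps each class's share roughly equal across the three subsets. The sketch below shows one way to do this with the standard library only. The helper name `stratified_split`, the rounding rule, and the toy labels are assumptions, and the split procedure actually used for the dataset may differ.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=42):
    """Split sample indices into train/val/test, class by class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

# Toy labels, not the real class counts.
labels = ["Gothic"] * 100 + ["Baroque"] * 60 + ["Bauhaus"] * 40
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```

Because the split is done per class, a rare class cannot end up entirely in one subset, which matters when per-class F1 is reported.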
+ ---
+
+ ## Evaluation
+
+ **Test set: 1,489 images (held-out, never seen during training or calibration)**
+
+ | Metric | Value |
+ |---|---|
+ | Top-1 accuracy | 0.7616 |
+ | Top-3 accuracy | 0.9261 |
+ | Top-5 accuracy | 0.9664 |
+ | Macro F1 | 0.7577 |
+ | Weighted F1 | 0.7582 |
+
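For reference, the metrics in the table can be computed from per-class scores as sketched below. This is a plain-Python illustration on made-up toy scores, not the evaluation script used for this card, and the helper names are hypothetical. `sklearn.metrics.top_k_accuracy_score` and `sklearn.metrics.f1_score` offer equivalent functionality.

```python
from collections import Counter

def top_k_accuracy(y_true, scores, k):
    """Fraction of samples whose true class is among the k highest scores."""
    hits = 0
    for true, row in zip(y_true, scores):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += true in ranked[:k]
    return hits / len(y_true)

def per_class_f1(y_true, y_pred, n_classes):
    """F1 = 2*TP / (2*TP + FP + FN) for each class; 0.0 if undefined."""
    f1 = []
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1.append(2 * tp / denom if denom else 0.0)
    return f1

# Toy example: 6 samples, 3 classes, unequal support.
y_true = [0, 0, 0, 1, 1, 2]
scores = [
    [0.70, 0.20, 0.10],
    [0.40, 0.50, 0.10],
    [0.60, 0.30, 0.10],
    [0.15, 0.80, 0.05],
    [0.25, 0.60, 0.15],
    [0.50, 0.30, 0.20],
]
y_pred = [max(range(3), key=lambda c: row[c]) for row in scores]

f1 = per_class_f1(y_true, y_pred, 3)
support = Counter(y_true)
macro_f1 = sum(f1) / len(f1)  # every class counts equally
weighted_f1 = sum(f1[c] * support[c] for c in range(3)) / len(y_true)
top1 = top_k_accuracy(y_true, scores, 1)
top2 = top_k_accuracy(y_true, scores, 2)
print(top1, top2, macro_f1, weighted_f1)
```

Macro F1 averages classes equally while weighted F1 scales each class by its support, so the two diverge when small classes perform differently from large ones.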
+ ### Per-class F1 (test set)
+
+ | Class | F1 | Support |
+ |---|---|---|
+ | Ancient Egyptian architecture | 0.952 | 53 |
+ | Achaemenid architecture | 0.938 | 55 |
+ | Novelty architecture | 0.920 | 54 |
+ | Gothic architecture | 0.915 | 47 |
+ | Brutalism architecture | 0.867 | 44 |
+ | Deconstructivism | 0.872 | 44 |
+ | Russian Revival architecture | 0.844 | 49 |
+ | Chicago school architecture | 0.824 | 39 |
+ | Art Nouveau architecture | 0.813 | 90 |
+ | Romanesque architecture | 0.805 | 44 |
+ | Byzantine architecture | 0.795 | 45 |
+ | Queen Anne architecture | 0.793 | 107 |
+ | Greek Revival architecture | 0.776 | 76 |
+ | Tudor Revival architecture | 0.776 | 65 |
+ | Art Deco architecture | 0.764 | 83 |
+ | Baroque architecture | 0.740 | 66 |
+ | American Foursquare architecture | 0.732 | 53 |
+ | Postmodern architecture | 0.674 | 47 |
+ | Bauhaus architecture | 0.674 | 45 |
+ | American craftsman style | 0.698 | 52 |
+ | Georgian architecture | 0.634 | 53 |
+ | Beaux-Arts architecture | 0.650 | 61 |
+ | Colonial architecture | 0.610 | 68 |
+ | International style | 0.561 | 59 |
+ | Palladian architecture | 0.547 | 49 |
+ | Edwardian architecture | 0.526 | 41 |
+
+ ### Most-confused pairs
+
+ | True class | Predicted as | Confusion rate |
+ |---|---|---|
+ | International style | Bauhaus architecture | 27.1 % |
+ | Postmodern architecture | International style | 17.0 % |
+ | American craftsman style | American Foursquare | 15.4 % |
+ | Palladian architecture | Greek Revival architecture | 14.3 % |
+ | Byzantine architecture | Russian Revival architecture | 13.3 % |
+
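The listed rates are consistent with dividing each off-diagonal count by the true class's support from the per-class table: for instance, 27.1 % of the 59 International style images is about 16. That reading is inferred from the numbers rather than stated in the card. Under that assumption, a ranking of confused pairs could be sketched as follows, with a hypothetical helper name and toy labels.

```python
from collections import Counter

def most_confused(y_true, y_pred, top=5):
    """Rank (true, predicted) error pairs by count / support of the true class."""
    support = Counter(y_true)
    errors = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    rates = [(t, p, n / support[t]) for (t, p), n in errors.items()]
    return sorted(rates, key=lambda item: item[2], reverse=True)[:top]

# Toy predictions: class "A" has 10 samples, class "B" has 5.
y_true = ["A"] * 10 + ["B"] * 5
y_pred = ["A"] * 6 + ["B"] * 3 + ["C"] * 1 + ["B"] * 4 + ["A"] * 1

pairs = most_confused(y_true, y_pred)
for true, pred, rate in pairs:
    print(f"{true} -> {pred}: {rate:.1%}")
```

Normalising by the true class's support makes the rates comparable across classes of different size, but it also means the rate for A → B generally differs from the rate for B → A.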
+ ---
+
+ ## Intended use
+
+ - Classifying exterior building photographs by architectural style
+ - Educational and research use in architectural history and computer vision
+ - Input to downstream retrieval or recommendation systems
+
+ **Not intended for:**
+ - Interior photographs, architectural renders, or drawings
+ - Styles not in the 26-class vocabulary
+ - High-stakes decisions without human review
+
+ ---
+
+ ## Limitations
+
+ - **Weak classes:** Edwardian (F1 = 0.53), Palladian (0.55), and International style (0.56) are the least reliable; treat their predictions as soft signals
+ - **Style overlap:** International ↔ Bauhaus and Postmodern ↔ International confusions reflect genuine art-historical ambiguity, not purely model error
+ - **Geographic bias:** training data is heavily Western/European
+ - **Modality:** trained exclusively on exterior photographs; performance on interiors and non-photographic images is undefined
+ - **Leakage caveat:** Ancient Egyptian and Novelty classes contain multiple photographs of the same landmark buildings; their F1 scores are likely slightly optimistic
+
+ ---
+
+ ## Usage
+
+ ```python
+ import joblib
+ import torch
+ import torch.nn.functional as F
+ from PIL import Image
+ from transformers import CLIPModel, CLIPProcessor
+ from huggingface_hub import hf_hub_download
+
+ REPO_ID = "axel-riben/clip-arch-classifier"
+
+ # Load CLIP
+ processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
+ clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
+
+ # Load classifier and calibrator
+ svc = joblib.load(hf_hub_download(REPO_ID, "linearsvc.joblib"))
+ platt = joblib.load(hf_hub_download(REPO_ID, "platt_calibrator.joblib"))
+
+ # Predict
+ image = Image.open("building.jpg").convert("RGB")
+ inputs = processor(images=image, return_tensors="pt")
+ with torch.no_grad():
+     feats = clip.get_image_features(**inputs)
+     # Fall back to pooler_output if an output object is returned instead of a tensor
+     if not isinstance(feats, torch.Tensor):
+         feats = feats.pooler_output
+     emb = F.normalize(feats, dim=-1).numpy()  # (1, 512), L2-normalised
+
+ scores = svc.decision_function(emb)  # (1, 26)
+ probs = platt.predict_proba(scores)[0]  # (26,)
+
+ # Print the five most probable classes
+ top5 = sorted(zip(platt.classes_, probs), key=lambda x: -x[1])[:5]
+ for label, prob in top5:
+     print(f"{prob:.3f} {label}")
+ ```
+
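The card does not say whether `platt.classes_` holds class names or the integer ids produced by the LabelEncoder in `label_encoder.joblib`. If it holds integers, the labels printed above would be ids rather than style names. A small sketch of a helper that handles both cases is shown below; the helper name `to_class_names` is hypothetical, and the toy encoder stands in for the one shipped in the repository.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

def to_class_names(classes, label_encoder):
    """Map integer class ids to names; pass string labels through unchanged."""
    classes = np.asarray(classes)
    if np.issubdtype(classes.dtype, np.integer):
        return label_encoder.inverse_transform(classes)
    return classes

# Toy encoder for illustration; LabelEncoder sorts class names alphabetically,
# so Baroque -> 0, Bauhaus -> 1, Gothic -> 2.
encoder = LabelEncoder().fit(
    ["Gothic architecture", "Baroque architecture", "Bauhaus architecture"]
)

print(list(to_class_names([2, 0], encoder)))
print(list(to_class_names(["Gothic architecture"], encoder)))
```

With the real artifacts, the encoder would be loaded the same way as the other files, for example `joblib.load(hf_hub_download(REPO_ID, "label_encoder.joblib"))`, and `to_class_names(platt.classes_, encoder)` would replace `platt.classes_` when building the top-5 list.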
+ ---
+
+ ## Citation
+
+ If you use this model, please also cite the original dataset:
+
+ ```
+ Danci, Marian Dumitru/dumitrux. (n.d.). Architectural Styles Dataset [Data set].
+ Kaggle. https://www.kaggle.com/datasets/dumitrux/architectural-styles-dataset
+ ```
+
+ ## License
+
+ Code and model weights: MIT.
+ Training data licences: see the [dataset card](https://huggingface.co/datasets/axel-riben/arcdataset-brutalism-extension).
label_encoder.joblib ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:71289bb21b66c5b853b20c8661676b8551e71695b96e530396738e5aa5f35319
+ size 3655
linearsvc.joblib ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c30c19533232c45b6c3f26fd61c9e47a05dfb002e70acba2ec5f3e4217fa2df9
+ size 107635
platt_calibrator.joblib ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:98d98d19a48f093c693215404459510991030a31b3caaa54ed1ce0ac66d1664b
+ size 9767
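
The three `.joblib` entries in this diff are Git LFS pointer files: each records a spec version, a SHA-256 object id, and a byte size for the real artifact. Since joblib files are pickles, checking a downloaded file against its pointer before loading it is a reasonable precaution. The sketch below shows one way to do that with the standard library; the helper names are hypothetical, and the `b"hello"` payload is a stand-in for real file contents.

```python
import hashlib

def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

def matches_pointer(data, pointer):
    """Check that raw bytes have the size and hash recorded in the pointer."""
    digest = hashlib.new(pointer["algo"], data).hexdigest()
    return len(data) == pointer["size"] and digest == pointer["digest"]

# Pointer for label_encoder.joblib, copied from the diff above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:71289bb21b66c5b853b20c8661676b8551e71695b96e530396738e5aa5f35319\n"
    "size 3655\n"
)
print(pointer["algo"], pointer["size"])

# Self-contained check with a toy payload instead of the real file.
toy = {
    "version": pointer["version"],
    "algo": "sha256",
    "digest": hashlib.sha256(b"hello").hexdigest(),
    "size": 5,
}
print(matches_pointer(b"hello", toy))
```

For a real check, `data` would be the bytes of the file returned by `hf_hub_download`, read with `open(path, "rb").read()`.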