clip-arch-classifier
Architectural style image classifier built on frozen CLIP ViT-B/32 embeddings.
Classifies exterior building photographs into 26 architectural styles. The classifier is a LinearSVC fitted on 512-dim L2-normalised CLIP image embeddings, with a Platt calibrator (logistic regression) on top to produce interpretable probabilities.
Model description
| Component | Detail |
|---|---|
| Feature extractor | CLIP ViT-B/32 (openai/clip-vit-base-patch32) — frozen |
| Embedding dim | 512, L2-normalised |
| Classifier | sklearn.svm.LinearSVC (C=1, balanced class weights) |
| Calibration | Platt scaling — sklearn.linear_model.LogisticRegression fitted on val-set decision scores |
| Training date | 2026-05-08 |
| Random seed | 42 |
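
The two-stage design in the table can be sketched with scikit-learn as follows. This is a minimal illustration, not the author's training script. It uses synthetic 512-dimensional vectors in place of real CLIP embeddings and three classes instead of 26. The `max_iter` value and the data generator are assumptions made only for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import normalize
from sklearn.svm import LinearSVC

rng = np.random.default_rng(42)

# Synthetic stand-ins for CLIP embeddings (assumption: 3 classes, 512 dims).
n_classes, dim = 3, 512
centers = rng.normal(size=(n_classes, dim))


def make_split(n_per_class):
    """Draw noisy points around each class centre and L2-normalise the rows."""
    X = np.vstack([c + 0.5 * rng.normal(size=(n_per_class, dim)) for c in centers])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return normalize(X), y


X_train, y_train = make_split(60)
X_val, y_val = make_split(20)

# Linear SVM with the hyperparameters listed in the table.
svc = LinearSVC(C=1.0, class_weight="balanced", random_state=42)
svc.fit(X_train, y_train)

# Platt-style calibration: logistic regression on validation-set decision scores.
platt = LogisticRegression(max_iter=1000)
platt.fit(svc.decision_function(X_val), y_val)

probs = platt.predict_proba(svc.decision_function(X_val))
print(probs.shape)
```

Fitting the calibrator on validation scores rather than training scores keeps the probability mapping from being tuned on data the SVM has already separated.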
Files
| File | Description |
|---|---|
| linearsvc.joblib | Fitted LinearSVC |
| label_encoder.joblib | sklearn LabelEncoder (integer ↔ class name) |
| platt_calibrator.joblib | Platt calibrator; use this for predict_proba |
Training data
Trained on the Architectural Styles Dataset (Curated and Extended): 9,767 images across 26 classes, split 70/15/15 train/val/test (stratified, seed 42).
The 26 classes are: Achaemenid, American Craftsman, American Foursquare, Ancient Egyptian, Art Deco, Art Nouveau, Baroque, Bauhaus, Beaux-Arts, Brutalism, Byzantine, Chicago school, Colonial, Deconstructivism, Edwardian, Georgian, Gothic, Greek Revival, International style, Novelty, Palladian, Postmodern, Queen Anne, Romanesque, Russian Revival, Tudor Revival.
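
A stratified 70/15/15 split like the one described above can be produced with two calls to `train_test_split`. The sketch below is an assumption about how such a split could be done, not the exact procedure used for this model. The label array is hypothetical (26 balanced classes of 40 images each), whereas the real dataset is imbalanced.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical labels: 26 classes with 40 images each.
labels = np.repeat(np.arange(26), 40)
indices = np.arange(len(labels))

# Carve off 30 % first, then halve it into validation and test.
train_idx, rest_idx = train_test_split(
    indices, test_size=0.30, stratify=labels, random_state=42
)
val_idx, test_idx = train_test_split(
    rest_idx, test_size=0.50, stratify=labels[rest_idx], random_state=42
)

print(len(train_idx), len(val_idx), len(test_idx))
```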
Evaluation
Test set: 1,489 images (held-out, never seen during training or calibration)
| Metric | Value |
|---|---|
| Top-1 accuracy | 0.7616 |
| Top-3 accuracy | 0.9261 |
| Top-5 accuracy | 0.9664 |
| Macro F1 | 0.7577 |
| Weighted F1 | 0.7582 |
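
For readers who want to reproduce metrics of this kind on their own predictions, the sketch below computes top-k accuracy and the two F1 averages with scikit-learn. The labels and probabilities are toy values invented for illustration, not model output.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, top_k_accuracy_score

# Toy example: 4 samples, 3 classes; each row holds class probabilities.
y_true = np.array([0, 1, 2, 2])
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.5, 0.1, 0.4],  # wrong at top-1, correct at top-2
    [0.1, 0.2, 0.7],
])
y_pred = probs.argmax(axis=1)

top1 = accuracy_score(y_true, y_pred)
top2 = top_k_accuracy_score(y_true, probs, k=2, labels=[0, 1, 2])
macro_f1 = f1_score(y_true, y_pred, average="macro")        # unweighted mean over classes
weighted_f1 = f1_score(y_true, y_pred, average="weighted")  # mean weighted by class support

print(top1, top2, macro_f1, weighted_f1)
```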
Per-class F1 (test set)
| Class | F1 | Support |
|---|---|---|
| Ancient Egyptian architecture | 0.952 | 53 |
| Achaemenid architecture | 0.938 | 55 |
| Novelty architecture | 0.920 | 54 |
| Gothic architecture | 0.915 | 47 |
| Brutalism architecture | 0.867 | 44 |
| Deconstructivism | 0.872 | 44 |
| Russian Revival architecture | 0.844 | 49 |
| Chicago school architecture | 0.824 | 39 |
| Art Nouveau architecture | 0.813 | 90 |
| Romanesque architecture | 0.805 | 44 |
| Byzantine architecture | 0.795 | 45 |
| Queen Anne architecture | 0.793 | 107 |
| Greek Revival architecture | 0.776 | 76 |
| Tudor Revival architecture | 0.776 | 65 |
| Art Deco architecture | 0.764 | 83 |
| Baroque architecture | 0.740 | 66 |
| American Foursquare architecture | 0.732 | 53 |
| Postmodern architecture | 0.674 | 47 |
| Bauhaus architecture | 0.674 | 45 |
| American craftsman style | 0.698 | 52 |
| Georgian architecture | 0.634 | 53 |
| Beaux-Arts architecture | 0.650 | 61 |
| Colonial architecture | 0.610 | 68 |
| International style | 0.561 | 59 |
| Palladian architecture | 0.547 | 49 |
| Edwardian architecture | 0.526 | 41 |
Most-confused pairs
| True class | Predicted as | Confusion rate |
|---|---|---|
| International style | Bauhaus architecture | 27.1 % |
| Postmodern architecture | International style | 17.0 % |
| American craftsman style | American Foursquare architecture | 15.4 % |
| Palladian architecture | Greek Revival architecture | 14.3 % |
| Byzantine architecture | Russian Revival architecture | 13.3 % |
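
A table like the one above can be derived from a confusion matrix. The sketch below assumes the confusion rate is the share of a true class's test images assigned to one particular other class (row-normalised off-diagonal entries). The helper name `most_confused` and the toy labels are hypothetical.

```python
import numpy as np
from sklearn.metrics import confusion_matrix


def most_confused(y_true, y_pred, class_names, top_n=5):
    """Return (true class, predicted class, rate) tuples, highest rate first."""
    cm = confusion_matrix(y_true, y_pred, labels=np.arange(len(class_names)))
    support = cm.sum(axis=1, keepdims=True)
    rates = cm / np.maximum(support, 1)  # row-normalise; guard against empty classes
    np.fill_diagonal(rates, 0.0)         # ignore correct predictions
    order = np.argsort(rates, axis=None)[::-1][:top_n]
    pairs = []
    for flat in order:
        i, j = np.unravel_index(flat, rates.shape)
        if rates[i, j] > 0:
            pairs.append((class_names[i], class_names[j], float(rates[i, j])))
    return pairs


# Toy labels for three placeholder classes.
names = ["A", "B", "C"]
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 1, 1, 2, 0]
pairs = most_confused(y_true, y_pred, names)
print(pairs)
```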
Intended use
- Classifying exterior building photographs by architectural style
- Educational and research use in architectural history and computer vision
- Input to downstream retrieval or recommendation systems
Not intended for:
- Interior photographs, architectural renders, or drawings
- Styles not in the 26-class vocabulary
- High-stakes decisions without human review
Limitations
- Weak classes: Edwardian (F1 = 0.53), Palladian (0.55), and International style (0.56) are the least reliable; treat their predictions as soft signals
- Style overlap: International ↔ Bauhaus and Postmodern ↔ International confusions reflect genuine art-historical ambiguity, not purely model error
- Geographic bias: training data is heavily Western/European
- Modality: trained exclusively on exterior photographs; performance on interiors and non-photographic images is undefined
- Leakage caveat: Ancient Egyptian and Novelty classes contain multiple photographs of the same landmark buildings; their F1 scores are likely slightly optimistic
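
One way to act on the weak-class limitation is to route uncertain predictions to human review. The helper below is a hypothetical sketch. The `needs_review` name and the 0.5 probability threshold are assumptions, and the threshold should be tuned on validation data.

```python
# Classes the card reports as least reliable (names as used in the per-class table).
WEAK_CLASSES = {
    "Edwardian architecture",
    "Palladian architecture",
    "International style",
}


def needs_review(label, prob, min_prob=0.5):
    """Return True when a prediction should be treated as a soft signal."""
    return label in WEAK_CLASSES or prob < min_prob


print(needs_review("Gothic architecture", 0.91))
print(needs_review("Edwardian architecture", 0.91))
```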
Usage
import joblib
import torch
import torch.nn.functional as F
from PIL import Image
from transformers import CLIPModel, CLIPProcessor
from huggingface_hub import hf_hub_download

REPO_ID = "axel-riben/clip-arch-classifier"

# Load CLIP
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()

# Load classifier and calibrator
svc = joblib.load(hf_hub_download(REPO_ID, "linearsvc.joblib"))
platt = joblib.load(hf_hub_download(REPO_ID, "platt_calibrator.joblib"))

# Predict
image = Image.open("building.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    feats = clip.get_image_features(**inputs)
if not isinstance(feats, torch.Tensor):
    feats = feats.pooler_output
emb = F.normalize(feats, dim=-1).numpy()

scores = svc.decision_function(emb)  # (1, 26)
probs = platt.predict_proba(scores)[0]  # (26,)
top5 = sorted(zip(platt.classes_, probs), key=lambda x: -x[1])[:5]
for label, prob in top5:
    print(f"{prob:.3f} {label}")
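
The snippet above does not load `label_encoder.joblib`. If `platt.classes_` turns out to hold integer indices rather than class names (an assumption, since the card does not say), the encoder can map them back. The sketch below shows the idea with a stand-in `LabelEncoder`. In practice the encoder would come from `joblib.load(hf_hub_download(REPO_ID, "label_encoder.joblib"))`. The `top_k` helper is hypothetical.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder


def top_k(probs, classes, encoder=None, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    order = np.argsort(probs)[::-1][:k]
    labels = np.asarray(classes)[order]
    # Decode only when the classifier exposes integer class ids.
    if encoder is not None and np.issubdtype(labels.dtype, np.integer):
        labels = encoder.inverse_transform(labels)
    return [(str(label), float(probs[i])) for label, i in zip(labels, order)]


# Stand-in encoder fitted on three class names for illustration only.
encoder = LabelEncoder().fit(["Art Deco", "Baroque", "Gothic"])
result = top_k(np.array([0.2, 0.1, 0.7]), classes=[0, 1, 2], encoder=encoder, k=2)
print(result)
```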
Citation
If you use this model, please also cite the original dataset:
Danci, Marian Dumitru/dumitrux. (n.d.). Architectural Styles Dataset [Data set].
Kaggle. https://www.kaggle.com/datasets/dumitrux/architectural-styles-dataset
License
Code and model weights: MIT. Training data licences: see the dataset card.