---
license: apache-2.0
datasets:
- behAIvNET/TinyAestheticNet
---

# TinyAestheticNet

A tiny aesthetic assessment model with only **16,581 trainable parameters** on top of a frozen CLIP backbone, achieving **1.29 MAE** on a 1-10 scoring scale.
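The trainable head is small enough to verify the parameter count by hand (it consists of two linear layers, `512 → 32 → 5`, with biases):

```python
# Trainable parameters of the MLP head (the CLIP backbone is frozen).
hidden = 512 * 32 + 32  # first linear layer: weights + biases = 16,416
out = 32 * 5 + 5        # output layer: weights + biases = 165
total = hidden + out
print(total)  # → 16581
```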

## Data

To measure creativity in paintings, this study uses a dataset of 878 open-access modern art paintings. Each artwork was evaluated by three fine-arts experts specializing in painting, who scored it by consensus on a scale of 1 to 10 across five criteria:
* **K1:** Originality
* **K2:** Aesthetics
* **K3:** Design Principles and Elements
* **K4:** Technique Used
* **K5:** Unity and Wholeness

The dataset features a well-balanced distribution, with an almost equal representation of scores across the 1 to 10 scale for each criterion.
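For training a sigmoid-output head against these labels, each 1-10 score can be normalized to the unit interval and mapped back at inference time. The card does not describe the exact training pipeline, so this is a sketch of the assumed mapping (the forward mapping matches the `x * 9 + 1` rescaling in the model code below):

```python
def to_unit(score):
    """Map an expert score in [1, 10] to [0, 1] (assumed training target)."""
    return (score - 1) / 9

def to_score(unit):
    """Map a sigmoid output in [0, 1] back to the 1-10 scale."""
    return unit * 9 + 1

print(to_unit(10))    # → 1.0
print(to_score(0.5))  # → 5.5
```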

## Model

```bash
pip install torch torchvision git+https://github.com/openai/CLIP.git huggingface_hub
```

```python
import torch
import torch.nn as nn
from PIL import Image
import clip
from huggingface_hub import hf_hub_download

class CLIPMLPScorer(nn.Module):
    def __init__(self, clip_model, num_factors=5, dropout=0.4):
        super().__init__()
        self.clip_model = clip_model
        
        # Freeze the CLIP backbone; only the MLP head is trained.
        for p in self.clip_model.parameters():
            p.requires_grad = False

        self.head = nn.Sequential(
            nn.Linear(512, 32),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(32, num_factors),
            nn.Sigmoid(),
        )

    def forward(self, x):
        features = self.clip_model.encode_image(x).float()
        x = self.head(features)
        x = x * 9 + 1  # map sigmoid output [0, 1] to the 1-10 score range
        return x

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
clip_model, preprocess = clip.load("ViT-B/32", device=device)

repo_id = "behAIvNET/TinyAestheticNet"
model_path = hf_hub_download(repo_id=repo_id, filename="TinyAestheticNet.pt")

model = CLIPMLPScorer(clip_model).to(device)
model.load_state_dict(torch.load(model_path, map_location=device))
model.eval()


image_path = "test_artwork.jpg" 
img = preprocess(Image.open(image_path)).unsqueeze(0).to(device)

with torch.no_grad(): 
    scores = model(img)[0].cpu().numpy()

criteria = ["K1", "K2", "K3", "K4", "K5"]
print("Artwork Scores:")
for c, s in zip(criteria, scores):
    print(f"{c}: {s:.2f}/10")
```

## Performance

![Performance](TinyAestheticNet_performance.png)
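The reported 1.29 MAE is the mean absolute error between predicted and expert scores on the 1-10 scale. A minimal sketch of the metric, with hypothetical predictions and labels:

```python
def mae(preds, labels):
    """Mean absolute error between predicted and reference scores."""
    return sum(abs(p - t) for p, t in zip(preds, labels)) / len(preds)

# Hypothetical predictions / expert labels for a single criterion.
preds = [6.8, 4.1, 7.5]
labels = [7, 5, 7]
print(round(mae(preds, labels), 2))  # → 0.53
```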

## Explainability

![XAI](TinyAestheticNet_xai.png)