---
license: apache-2.0
datasets:
- behAIvNET/TinyAestheticNet
---

# TinyAestheticNet

A tiny aesthetic assessment model with only **16,581 trainable parameters** on top of a frozen CLIP backbone, achieving a mean absolute error (MAE) of **1.29** on a 1-10 scoring scale.
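The 16,581 figure is accounted for entirely by the two-layer MLP head used in the Model section below (the frozen CLIP backbone contributes no trainable parameters); a quick sanity check of the arithmetic:

```python
# Trainable parameters of the MLP head (the frozen CLIP backbone adds none):
# Linear(512 -> 32): 512*32 weights + 32 biases
fc1 = 512 * 32 + 32   # 16416
# Linear(32 -> 5): 32*5 weights + 5 biases
fc2 = 32 * 5 + 5      # 165
print(fc1 + fc2)      # 16581
```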

## Data

To measure creativity in paintings, this model was trained on a dataset of 878 open-access modern art paintings. Each artwork was evaluated by three fine arts experts specializing in painting, who scored it by consensus on a scale of 1 to 10 across five criteria:

* **K1:** Originality
* **K2:** Aesthetics
* **K3:** Design Principles and Elements
* **K4:** Technique Used
* **K5:** Unity and Wholeness

The dataset is well balanced, with a roughly equal representation of scores across the 1-10 scale for each criterion.

## Model

Install the dependencies:

```bash
pip install torch torchvision git+https://github.com/openai/CLIP.git huggingface_hub
```

Download the checkpoint from the Hub and score an image:

```python
import torch
import torch.nn as nn
from PIL import Image
import clip
from huggingface_hub import hf_hub_download


class CLIPMLPScorer(nn.Module):
    """Frozen CLIP image encoder followed by a small trainable MLP head."""

    def __init__(self, clip_model, num_factors=5, dropout=0.4):
        super().__init__()
        self.clip_model = clip_model

        # Freeze the CLIP backbone; only the head is trained.
        for p in self.clip_model.parameters():
            p.requires_grad = False

        self.head = nn.Sequential(
            nn.Linear(512, 32),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(32, num_factors),
            nn.Sigmoid(),
        )

    def forward(self, x):
        features = self.clip_model.encode_image(x).float()
        x = self.head(features)
        # Rescale sigmoid outputs from (0, 1) to the (1, 10) scoring range.
        x = x * 9 + 1
        return x


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
clip_model, preprocess = clip.load("ViT-B/32", device=device)

repo_id = "behAIvNET/TinyAestheticNet"
model_path = hf_hub_download(repo_id=repo_id, filename="TinyAestheticNet.pt")

model = CLIPMLPScorer(clip_model).to(device)
model.load_state_dict(torch.load(model_path, map_location=device))
model.eval()

image_path = "test_artwork.jpg"
img = preprocess(Image.open(image_path)).unsqueeze(0).to(device)

with torch.no_grad():
    scores = model(img)[0].cpu().numpy()

criteria = ["K1", "K2", "K3", "K4", "K5"]
print("Artwork Scores:")
for c, s in zip(criteria, scores):
    print(f"{c}: {s:.2f}/10")
```
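Because the head ends in a `Sigmoid`, the `x * 9 + 1` line in `forward` maps each raw output from the (0, 1) interval onto the 1-10 scoring scale; a minimal illustration of that mapping:

```python
# Sigmoid outputs at the extremes and midpoint, rescaled to the 1-10 scale
raw = [0.0, 0.5, 1.0]
scores = [r * 9 + 1 for r in raw]
print(scores)  # [1.0, 5.5, 10.0]
```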

## Performance

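The headline metric is the mean absolute error (MAE) between predicted and expert scores, presumably averaged over images and criteria; a minimal sketch of the computation with hypothetical values (the numbers below are illustrative, not from the evaluation set):

```python
# Hypothetical predicted vs. expert scores, for illustration only
preds = [6.2, 3.8, 9.1, 4.5]
truth = [7.0, 3.0, 8.5, 5.0]
mae = sum(abs(p - t) for p, t in zip(preds, truth)) / len(preds)
print(f"MAE: {mae:.2f}")
```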

## Explainability
