---
license: apache-2.0
datasets:
  - behAIvNET/TinyAestheticNet
---

# TinyAestheticNet

A tiny aesthetic assessment model with only 16,581 trainable parameters, built on top of a frozen CLIP backbone, achieving 1.29 MAE on a 1-10 scoring scale.
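The 16,581 figure can be verified by hand: the frozen CLIP backbone contributes no trainable parameters, so everything lives in the head's two linear layers (512 → 32 and 32 → 5, as in the usage code in this README):

```python
# Trainable parameters = weights + biases of the head's two Linear layers.
fc1 = 512 * 32 + 32  # first layer: 16,416 parameters
fc2 = 32 * 5 + 5     # second layer: 165 parameters
total = fc1 + fc2
print(total)  # 16581
```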

## Data

To measure creativity in paintings, this study uses a dataset of 878 open-access modern art paintings. Each artwork was evaluated by three fine-arts experts specializing in painting, who scored it by consensus on a scale of 1 to 10 across five criteria:

- **K1**: Originality
- **K2**: Aesthetics
- **K3**: Design Principles and Elements
- **K4**: Technique Used
- **K5**: Unity and Wholeness

The dataset features a well-balanced distribution, with an almost equal representation of scores across the 1 to 10 scale for each criterion.

## Model

```bash
pip install torch torchvision git+https://github.com/openai/CLIP.git huggingface_hub
```
```python
import torch
import torch.nn as nn
from PIL import Image
import clip
from huggingface_hub import hf_hub_download


class CLIPMLPScorer(nn.Module):
    def __init__(self, clip_model, num_factors=5, dropout=0.4):
        super().__init__()
        self.clip_model = clip_model

        # Freeze the CLIP backbone; only the small MLP head is trainable.
        for p in self.clip_model.parameters():
            p.requires_grad = False

        self.head = nn.Sequential(
            nn.Linear(512, 32),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(32, num_factors),
            nn.Sigmoid(),
        )

    def forward(self, x):
        features = self.clip_model.encode_image(x).float()
        x = self.head(features)
        x = x * 9 + 1  # rescale sigmoid output from (0, 1) to the 1-10 scale
        return x


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
clip_model, preprocess = clip.load("ViT-B/32", device=device)

# Download the trained weights from the Hub.
repo_id = "behAIvNET/TinyAestheticNet"
model_path = hf_hub_download(repo_id=repo_id, filename="TinyAestheticNet.pt")

model = CLIPMLPScorer(clip_model).to(device)
model.load_state_dict(torch.load(model_path, map_location=device))
model.eval()

# Score a single artwork.
image_path = "test_artwork.jpg"
img = preprocess(Image.open(image_path)).unsqueeze(0).to(device)

with torch.no_grad():
    scores = model(img)[0].cpu().numpy()

criteria = ["K1", "K2", "K3", "K4", "K5"]
print("Artwork Scores:")
for c, s in zip(criteria, scores):
    print(f"{c}: {s:.2f}/10")
```
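Because the head ends in a `Sigmoid`, raw outputs lie in (0, 1), and the forward pass rescales them with `x * 9 + 1` onto the 1-10 rating scale. A minimal sketch of that mapping (the helper name `to_score` is illustrative, not part of the model code):

```python
def to_score(p: float) -> float:
    """Map a sigmoid activation p in [0, 1] onto the 1-10 rating scale."""
    return p * 9 + 1

print(to_score(0.0), to_score(0.5), to_score(1.0))  # 1.0 5.5 10.0
```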

## Performance

## Explainability