ConvNext-aesthetic-rater

Model Details

  • Architecture: animetimm/caformer_b36.dbv4-full
  • Classes: ['Good', 'Normal', 'Rest'] (3 classes)
  • Input size: ~448px (non-square, both sides divisible by 32)
  • Best val accuracy: 0.8684
  • Training precision: bf16 mixed precision

Training Config

Parameter        Value
Batch size       16
Head LR          0.001
Fine-tune LR     0.0001
Weight decay     0.0001
Label smoothing  0.1
MixUp            True
Scheduler        CosineAnnealing

Usage

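import timm
import torch

# Build the architecture without pretrained weights; the fine-tuned weights
# come from the checkpoint below. num_classes=3 covers Good / Normal / Rest.
model = timm.create_model("animetimm/caformer_b36.dbv4-full", pretrained=False, num_classes=3)
# weights_only=False unpickles arbitrary Python objects, so only load a checkpoint you trust.
checkpoint = torch.load("best_model.pth", map_location="cpu", weights_only=False)
model.load_state_dict(checkpoint["model_state_dict"])
model.eval()  # inference mode: disables dropout and similar training-only behavior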
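
The model outputs three logits per image. Below is a minimal sketch of turning them into a label, assuming the output index order matches the class list above (Good, Normal, Rest); that order is an assumption and should be checked against the training code. The names `CLASS_NAMES` and `label_from_logits` are hypothetical helpers, not part of this repository.

```python
import math

# Assumed index order; verify against the training label mapping.
CLASS_NAMES = ["Good", "Normal", "Rest"]


def label_from_logits(logits):
    """Return (predicted class name, softmax probabilities) for one image."""
    # Subtract the max logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASS_NAMES[best], probs
```

With the model loaded as above, the logits for a single preprocessed image tensor could be obtained with something like `model(batch)[0].tolist()`, where `batch` is a hypothetical tensor of shape (1, 3, H, W).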

Notes

  • Color matters for classification, so the training augmentations preserve hue
  • Images are resized so that both dimensions are divisible by 32 (non-square)
  • At inference, cap the longest side at 640px and round both sides to the nearest multiple of 32
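
The inference sizing rule in the last note can be sketched as a small helper. This is one possible reading, not the author's preprocessing code: it assumes images smaller than the cap are not upscaled, that ties round up, and that each side is at least 32px. The names `round_to_multiple` and `inference_size` are hypothetical.

```python
def round_to_multiple(value, multiple=32):
    """Round to the nearest multiple (ties round up), never below one multiple."""
    return max(multiple, int(value / multiple + 0.5) * multiple)


def inference_size(width, height, max_side=640, multiple=32):
    """Cap the longest side at max_side, then snap both sides to a multiple of 32."""
    # Assumption: only shrink; images already within the cap keep their scale.
    scale = min(1.0, max_side / max(width, height))
    return (
        round_to_multiple(width * scale, multiple),
        round_to_multiple(height * scale, multiple),
    )
```

For example, a 1920x1080 image maps to 640x352 under these assumptions. Because each side is rounded independently, the aspect ratio shifts slightly.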