EfficientNetV2-S Car Body Classifier

This repository contains a PyTorch/timm EfficientNetV2-S image classifier for 8 vehicle body types. The model predicts one of SUV, VAN, STATION_WAGON, MICRO, ACIK_TEKERLEKLI_F1_ARACLARI (Turkish for open-wheel F1 cars), SEDAN, HATCHBACK, or PICK_UP from a single RGB vehicle image.

The checkpoint is a fine-tuned tf_efficientnetv2_s model with an 8-class classification head. To obtain calibrated probabilities, divide the logits by the temperature 0.565484 before applying softmax.
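As a rough illustration of what temperature calibration does, the sketch below applies a temperature-scaled softmax to made-up logits in plain Python. The logit values and the function name `softmax_with_temperature` are assumptions for illustration only; the temperature is the value stated above.

```python
import math

TEMPERATURE = 0.565484  # calibration temperature stated in this model card


def softmax_with_temperature(logits, temperature=1.0):
    """Return softmax(logits / temperature) as a list of floats."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for the 8 classes (not real model output).
example_logits = [2.0, 0.5, 0.1, -1.0, -2.0, 1.2, 0.3, -0.5]

raw_probs = softmax_with_temperature(example_logits)
calibrated_probs = softmax_with_temperature(example_logits, TEMPERATURE)
```

Because the temperature is below 1, dividing by it widens the gaps between logits, so the calibrated top probability is higher than the raw one while the predicted class stays the same.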

Files

| File | Purpose |
| --- | --- |
| efficientnetv2_s_best.pt | PyTorch state dict checkpoint. |
| model.safetensors | Safetensors copy of the same state dict for safer public download workflows. |
| config.json | Repository-level model/task configuration. |
| metadata.json | Training, class, artifact, and metric metadata. |
| preprocessor_config.json | Image size, normalization, and preprocessing contract. |
| calibration.json | Temperature scaling configuration and raw/calibrated metrics. |
| requirements.txt | Minimal runtime dependencies. |
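One way to inspect the JSON artifacts listed above is sketched here. The file names come from this card, but their internal schema is not documented in it, so the sketch only parses the files and reports their top-level keys. The helper names `load_json_artifacts` and `summarize` are hypothetical.

```python
import json
from pathlib import Path

# File names are taken from the Files list; no key names are assumed.
JSON_ARTIFACTS = (
    "config.json",
    "metadata.json",
    "preprocessor_config.json",
    "calibration.json",
)


def load_json_artifacts(repo_dir="."):
    """Parse whichever JSON artifacts exist in repo_dir, keyed by file name."""
    loaded = {}
    for name in JSON_ARTIFACTS:
        path = Path(repo_dir) / name
        if path.is_file():
            with path.open(encoding="utf-8") as handle:
                loaded[name] = json.load(handle)
    return loaded


def summarize(artifacts):
    """Return {file name: sorted top-level keys} for dict-shaped JSON files."""
    return {
        name: sorted(content) if isinstance(content, dict) else type(content).__name__
        for name, content in artifacts.items()
    }
```

Calling `summarize(load_json_artifacts("."))` from a cloned repository would show which fields each file actually exposes before any of them are relied on in code.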

Intended Use

Use this model for educational or prototype vehicle body-type classification from visible car images. It is intended for controlled project demos, non-commercial research evaluation, model-serving experiments, and API integration tests where inputs are ordinary car images.

Out-of-Scope Use

Do not use this model for commercial products, public hosted inference services, redistribution, or as the sole basis for safety-critical, insurance, legal, financial, or law-enforcement decisions. Do not assume robust performance on non-car vehicles, heavily occluded vehicles, unusual viewpoints, drawings, low-quality CCTV images, or domains that differ strongly from the training data.

Dataset Provenance

The training data was assembled from multiple public Kaggle and Hugging Face image collections plus manual mapping and QA. The project used source diversity to reduce overfitting to one capture style, but the upstream licensing chain is mixed and must be treated conservatively.

| Source | Platform | URL | Upstream license metadata observed | Use in project |
| --- | --- | --- | --- | --- |
| Cars Body Type Cropped | Kaggle | https://www.kaggle.com/datasets/ademboukhris/cars-body-type-cropped | CC0-1.0; source text says Stanford dataset, DuckDuckGo, and Bing | Direct body-style folders for SUV, VAN, SEDAN, HATCHBACK, PICK UP |
| Stanford Car Body Type Data | Kaggle | https://www.kaggle.com/datasets/mayurmahurkar/stanford-car-body-type-data | other; derived from Stanford Cars train subset | Stanford-derived body-style folders |
| Stanford Car Dataset by Classes Folder | Kaggle | https://www.kaggle.com/datasets/jutrera/stanford-car-dataset-by-classes-folder | other; Stanford Cars repackaging | Fine-grained Stanford Cars folders used for MICRO/VAN mappings |
| Vehicle Images Dataset | Kaggle | https://www.kaggle.com/datasets/lyensoetanto/vehicle-images-dataset | unknown | Supplemental SUV, VAN, SEDAN, PICK UP, MICRO candidates; weak labels excluded by default |
| Formula One Cars | Kaggle | https://www.kaggle.com/datasets/vesuvius13/formula-one-cars | CC0-1.0; source text says DuckDuckGo scraping | Open-wheel/F1 images |
| CBSC | Hugging Face | https://huggingface.co/datasets/kitrofimov/cbsc | Dataset card states images from Bing, Baidu, DuckDuckGo, Google Images, and Reddit, with license compliance reviewed by dataset author | Direct body-style classes |
| Stanford Cars | Hugging Face | https://huggingface.co/datasets/tanganke/stanford_cars | Dataset card does not expose a clear license in fetched metadata | Stanford-derived car images and labels |
| Automobiles | Hugging Face | https://huggingface.co/datasets/adityab99/Automobiles | Dataset card says more information needed | Formula 1 / normal car supplemental images |

Because some upstream sources list their license as "other" or "unknown", and some were collected through web search or community platforms, this repository uses a restricted research/evaluation license rather than an open-source or commercial-use license.

Classes

The model emits probabilities in this fixed order:

Index Label
0 SUV
1 VAN
2 STATION_WAGON
3 MICRO
4 ACIK_TEKERLEKLI_F1_ARACLARI
5 SEDAN
6 HATCHBACK
7 PICK_UP
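The fixed index order above can be turned into lookup tables and a small top-k decoder, as in the following sketch. The helper name `top_k` and the example probability vector are illustrative assumptions, not part of the packaged code.

```python
LABELS = [
    "SUV", "VAN", "STATION_WAGON", "MICRO",
    "ACIK_TEKERLEKLI_F1_ARACLARI", "SEDAN", "HATCHBACK", "PICK_UP",
]

ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {label: index for index, label in ID2LABEL.items()}


def top_k(probabilities, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    if len(probabilities) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} probabilities, got {len(probabilities)}")
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return [(ID2LABEL[index], prob) for index, prob in ranked[:k]]


# Hypothetical probability vector, not real model output.
example = [0.02, 0.01, 0.05, 0.01, 0.00, 0.70, 0.20, 0.01]
print(top_k(example, k=2))  # [('SEDAN', 0.7), ('HATCHBACK', 0.2)]
```

Returning the runner-up alongside the top class makes close calls, such as SEDAN versus HATCHBACK, visible to downstream code.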

Inference

Install dependencies:

pip install -r requirements.txt

Run single-image inference from a cloned repository:

from pathlib import Path

import timm
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import transforms

repo_dir = Path(".")
labels = [
    "SUV",
    "VAN",
    "STATION_WAGON",
    "MICRO",
    "ACIK_TEKERLEKLI_F1_ARACLARI",
    "SEDAN",
    "HATCHBACK",
    "PICK_UP",
]

model = timm.create_model("tf_efficientnetv2_s", pretrained=False, num_classes=len(labels))

# Prefer model.safetensors for public download workflows. The .pt checkpoint is
# retained for compatibility with the original project package.
try:
    from safetensors.torch import load_file

    state = load_file(repo_dir / "model.safetensors")
except ImportError:
    state = torch.load(repo_dir / "efficientnetv2_s_best.pt", map_location="cpu", weights_only=True)
if isinstance(state, dict) and "model_state_dict" in state:
    state = state["model_state_dict"]
model.load_state_dict(state)
model.eval()

preprocess = transforms.Compose(
    [
        transforms.Resize(224, interpolation=transforms.InterpolationMode.BICUBIC),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ]
)

image = Image.open("example.jpg").convert("RGB")
inputs = preprocess(image).unsqueeze(0)

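# Temperature scaling: divide the logits by the calibration temperature
# stated in this model card before applying softmax.
TEMPERATURE = 0.565484

with torch.no_grad():
    logits = model(inputs)
    probs = F.softmax(logits / TEMPERATURE, dim=1).squeeze(0)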

pred_idx = int(probs.argmax().item())
print({"label": labels[pred_idx], "confidence": float(probs[pred_idx])})

Preprocessing

Validation, evaluation, and inference use the same deterministic preprocessing:

| Step | Value |
| --- | --- |
| Color mode | RGB |
| Resize | Shorter side to 224 px, bicubic |
| Crop | 224 x 224 center crop |
| Normalization mean | [0.485, 0.456, 0.406] |
| Normalization std | [0.229, 0.224, 0.225] |
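The geometry of these steps can be sketched in plain Python. This covers the arithmetic only and is meant to mirror how an integer-size resize followed by a center crop typically behaves; rounding in an actual image library may differ by a pixel. The function names and the 640 x 480 example are assumptions.

```python
IMAGE_SIZE = 224
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def resized_dimensions(width, height, size=IMAGE_SIZE):
    """Scale so the shorter side equals size, keeping the aspect ratio."""
    if width <= height:
        return size, int(size * height / width)
    return int(size * width / height), size


def center_crop_box(width, height, size=IMAGE_SIZE):
    """Return (left, top, right, bottom) of a centered size x size crop."""
    left = int(round((width - size) / 2.0))
    top = int(round((height - size) / 2.0))
    return left, top, left + size, top + size


def normalize_pixel(rgb):
    """Map 8-bit RGB values to [0, 1], then standardize each channel."""
    return tuple((value / 255.0 - m) / s for value, m, s in zip(rgb, MEAN, STD))


# Hypothetical 640 x 480 input image.
resized = resized_dimensions(640, 480)
crop = center_crop_box(*resized)
```

For a landscape image, the crop discards equal strips on the left and right, so vehicles near the image edge can be partially cut off.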

Training used additional safe augmentations such as horizontal flip, mild random resized crop, rotation, and color jitter. Vertical flip was intentionally disabled because upside-down vehicles are not label-preserving.

Evaluation

The packaged metadata reports the following final calibrated model metrics:

| Split | Accuracy | Macro F1 | Weighted F1 | NLL | Brier | ECE (15 bins) |
| --- | --- | --- | --- | --- | --- | --- |
| Validation, calibrated | 0.970439 | 0.957292 | 0.970565 | 0.136917 | 0.049335 | 0.009176 |
| Test, calibrated | 0.966630 | 0.954043 | 0.966779 | 0.152494 | 0.055011 | 0.009874 |
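For readers unfamiliar with the calibration columns, the sketch below shows common definitions of NLL, the multi-class Brier score, and ECE with equal-width confidence bins. This card does not state the exact conventions used to produce the reported numbers (for example, whether the Brier score sums over classes), so treat these as assumptions. The toy probabilities are made up.

```python
import math


def negative_log_likelihood(probs, labels):
    """Mean of -log p(true class); probabilities are clipped to avoid log(0)."""
    return sum(-math.log(max(p[y], 1e-12)) for p, y in zip(probs, labels)) / len(labels)


def brier_score(probs, labels):
    """Mean over samples of the squared error summed over all classes."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((p_k - (1.0 if k == y else 0.0)) ** 2 for k, p_k in enumerate(p))
    return total / len(labels)


def expected_calibration_error(probs, labels, n_bins=15):
    """Weighted mean gap between confidence and accuracy over equal-width bins."""
    bins = [[0, 0.0, 0.0] for _ in range(n_bins)]  # count, sum confidence, sum correct
    for p, y in zip(probs, labels):
        confidence = max(p)
        predicted = p.index(confidence)
        index = min(int(confidence * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[index][0] += 1
        bins[index][1] += confidence
        bins[index][2] += 1.0 if predicted == y else 0.0
    n = len(labels)
    return sum(abs(correct - conf) / n for count, conf, correct in bins if count)


# Toy two-class data for illustration only.
toy_probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
toy_labels = [0, 1, 1]

nll = negative_log_likelihood(toy_probs, toy_labels)
brier = brier_score(toy_probs, toy_labels)
ece = expected_calibration_error(toy_probs, toy_labels)
```

All three are lower-is-better. An ECE near 0.01, as reported above, means the average confidence within each bin is within roughly one percentage point of the observed accuracy.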

Per-class validation F1 from training metadata:

Class Validation F1
SUV 0.972405
VAN 0.988434
STATION_WAGON 0.960352
MICRO 0.887417
ACIK_TEKERLEKLI_F1_ARACLARI 0.996005
SEDAN 0.976542
HATCHBACK 0.927184
PICK_UP 0.950000
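Macro F1 is the unweighted mean of the per-class F1 scores, so the table above can be checked against the reported validation macro F1 of 0.957292 with a few lines of Python. Weighted F1 cannot be reproduced the same way because it needs per-class sample counts, which this card does not list.

```python
# Per-class validation F1 values copied from the table above.
VALIDATION_F1 = {
    "SUV": 0.972405,
    "VAN": 0.988434,
    "STATION_WAGON": 0.960352,
    "MICRO": 0.887417,
    "ACIK_TEKERLEKLI_F1_ARACLARI": 0.996005,
    "SEDAN": 0.976542,
    "HATCHBACK": 0.927184,
    "PICK_UP": 0.950000,
}

# Unweighted mean over the 8 classes.
macro_f1 = sum(VALIDATION_F1.values()) / len(VALIDATION_F1)

# The two classes with the lowest F1.
hardest = sorted(VALIDATION_F1, key=VALIDATION_F1.get)[:2]

print(round(macro_f1, 6))  # 0.957292
print(hardest)             # ['MICRO', 'HATCHBACK']
```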

The test metrics in metadata.json and calibration.json are the source of truth for this packaged model. An additional project-level evaluation file may contain a newer or separate raw evaluation run; do not mix those values without verifying the exact checkpoint and split.

Training Details

| Field | Value |
| --- | --- |
| Architecture | tf_efficientnetv2_s |
| Base weights | ImageNet pretrained |
| Fine-tuning | Full fine-tuning |
| Classes | 8 |
| Seed | 20260506 |
| Optimizer | AdamW |
| Learning rate | 0.0003 |
| Weight decay | 0.0001 |
| Batch size | 32 |
| Configured epochs | 30 |
| Best epoch | 13 |
| Label smoothing | 0.05 |
| Imbalance handling | Class weights |
| Precision | FP32 |
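To make two of these settings concrete, the sketch below shows what a label smoothing of 0.05 does to an 8-class target, assuming the common uniform-smoothing formulation, and how inverse-frequency class weights could be computed. The card does not state which class-weight formula was used, and the per-class counts are invented, so both are assumptions.

```python
NUM_CLASSES = 8
LABEL_SMOOTHING = 0.05  # from the training details above


def smoothed_target(true_index, num_classes=NUM_CLASSES, smoothing=LABEL_SMOOTHING):
    """Mix the one-hot target with a uniform distribution over all classes."""
    uniform_share = smoothing / num_classes
    target = [uniform_share] * num_classes
    target[true_index] += 1.0 - smoothing
    return target


def inverse_frequency_weights(class_counts):
    """Balanced weights: total / (num_classes * count), one per class."""
    total = sum(class_counts)
    return [total / (len(class_counts) * count) for count in class_counts]


# Hypothetical per-class image counts; the real counts are not given in this card.
example_counts = [4000, 2000, 1000, 500, 1500, 4000, 2000, 1000]

target = smoothed_target(true_index=5)
weights = inverse_frequency_weights(example_counts)
```

Under these assumptions the true class receives a target of 0.95625 and every other class 0.00625, while rarer classes receive proportionally larger loss weights.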

Limitations and Biases

Performance depends on the training data distribution and may degrade on unseen regions, camera setups, image styles, vehicle modifications, unusual body kits, partial vehicles, or ambiguous classes. MICRO and HATCHBACK are relatively harder classes in the reported metrics. The training dataset provenance and license must be reviewed before public or commercial release.

Safety and Ethics

The model classifies vehicle body type only. It should not be used to infer protected attributes, identify people, or make high-stakes decisions. Downstream applications should disclose uncertainty and provide human review for ambiguous images.
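One simple way to route ambiguous images to human review is a confidence and margin gate on the calibrated probabilities. The sketch below is only an illustration: the function name and both thresholds are hypothetical and would need tuning on held-out data.

```python
def needs_human_review(probabilities, min_confidence=0.80, min_margin=0.20):
    """Flag a prediction when the top class is weak or the runner-up is close."""
    ranked = sorted(probabilities, reverse=True)
    top, runner_up = ranked[0], ranked[1]
    return top < min_confidence or (top - runner_up) < min_margin


# Hypothetical calibrated probability vectors, not real model output.
confident = [0.01, 0.01, 0.01, 0.01, 0.00, 0.95, 0.01, 0.00]
ambiguous = [0.01, 0.01, 0.01, 0.01, 0.00, 0.55, 0.40, 0.01]

print(needs_human_review(confident))  # False
print(needs_human_review(ambiguous))  # True
```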

License

This repository is released under the restricted research and evaluation license in LICENSE.md. It is not open source and is not approved for commercial reuse, redistribution, public hosted inference, or production deployment. This conservative license is used because the upstream training data license chain contains mixed, unknown, and web-scraped sources.

Citation

No formal paper citation is available for this project. If you use this model, please cite the project repository and this model card.

Maintainers

  • Furkan Nezih Uzmez, Kocaeli University Computer Engineering
  • Yusuf Kerem Alcicek, Kocaeli University Computer Engineering