EfficientNetV2-S Car Body Classifier

This repository contains a PyTorch/timm EfficientNetV2-S image classifier for 8 vehicle body types. The model predicts one of SUV, VAN, STATION_WAGON, MICRO, ACIK_TEKERLEKLI_F1_ARACLARI (Turkish for open-wheel F1 vehicles), SEDAN, HATCHBACK, or PICK_UP from a single RGB vehicle image.

The checkpoint is a fine-tuned tf_efficientnetv2_s model with an 8-class classification head. To obtain calibrated probabilities, divide the logits by the temperature 0.565484 before applying softmax.
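
As a rough illustration of what temperature calibration does, the sketch below divides a vector of logits by the temperature before the softmax. It uses only the Python standard library; the helper name `calibrated_probs` and the example logits are made up for illustration and are not real model output.

```python
import math

TEMPERATURE = 0.565484  # value stated in this model card


def calibrated_probs(logits, temperature=TEMPERATURE):
    """Softmax over logits / temperature (hypothetical helper name)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative logits for the 8 classes, not real model output.
example_logits = [2.0, 0.5, 0.1, -1.0, -2.0, 1.0, 0.3, -0.5]
raw = calibrated_probs(example_logits, temperature=1.0)
calibrated = calibrated_probs(example_logits)
```

Because the temperature is below 1, the calibrated distribution is sharper than the raw softmax, while the predicted class stays the same.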

Files

File	Purpose
`efficientnetv2_s_best.pt`	PyTorch state dict checkpoint.
`model.safetensors`	Safetensors copy of the same state dict for safer public download workflows.
`config.json`	Repository-level model/task configuration.
`metadata.json`	Training, class, artifact, and metric metadata.
`preprocessor_config.json`	Image size, normalization, and preprocessing contract.
`calibration.json`	Temperature scaling configuration and raw/calibrated metrics.
`requirements.txt`	Minimal runtime dependencies.
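
Since `calibration.json` holds the temperature scaling configuration, a loader could read the temperature from it instead of hardcoding the value. The sketch below is one possible approach; the key name `"temperature"` is an assumption about the file's schema (this card does not show it), and the function falls back to the documented value when the file or key is missing.

```python
import json
from pathlib import Path

DOCUMENTED_TEMPERATURE = 0.565484  # value stated in this model card


def load_temperature(repo_dir=".", key="temperature"):
    """Return the temperature from calibration.json, or the documented value.

    The key name "temperature" is an assumed schema detail, not confirmed
    by the model card.
    """
    path = Path(repo_dir) / "calibration.json"
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return DOCUMENTED_TEMPERATURE
    value = data.get(key) if isinstance(data, dict) else None
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if is_number and value > 0:
        return float(value)
    return DOCUMENTED_TEMPERATURE
```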

Intended Use

Use this model for educational or prototype vehicle body-type classification from images in which the car is clearly visible. It is intended for controlled project demos, non-commercial research evaluation, model-serving experiments, and API integration tests where inputs are ordinary car images.

Out-of-Scope Use

Do not use this model for commercial products, public hosted inference services, redistribution, or as the sole basis for safety-critical, insurance, legal, financial, or law-enforcement decisions. Do not assume robust performance on non-car vehicles, heavily occluded vehicles, unusual viewpoints, drawings, low-quality CCTV images, or domains that differ strongly from the training data.

Dataset Provenance

The training data was assembled from multiple public Kaggle and Hugging Face image collections plus manual mapping and QA. The project used source diversity to reduce overfitting to one capture style, but the upstream licensing chain is mixed and must be treated conservatively.

Source	Platform	URL	Upstream license metadata observed	Use in project
Cars Body Type Cropped	Kaggle	https://www.kaggle.com/datasets/ademboukhris/cars-body-type-cropped	`CC0-1.0`; source text says Stanford dataset, DuckDuckGo, and Bing	Direct body-style folders for SUV, VAN, SEDAN, HATCHBACK, PICK UP
Stanford Car Body Type Data	Kaggle	https://www.kaggle.com/datasets/mayurmahurkar/stanford-car-body-type-data	`other`; derived from Stanford Cars train subset	Stanford-derived body-style folders
Stanford Car Dataset by Classes Folder	Kaggle	https://www.kaggle.com/datasets/jutrera/stanford-car-dataset-by-classes-folder	`other`; Stanford Cars repackaging	Fine-grained Stanford Cars folders used for MICRO/VAN mappings
Vehicle Images Dataset	Kaggle	https://www.kaggle.com/datasets/lyensoetanto/vehicle-images-dataset	`unknown`	Supplemental SUV, VAN, SEDAN, PICK UP, MICRO candidates; weak labels excluded by default
Formula One Cars	Kaggle	https://www.kaggle.com/datasets/vesuvius13/formula-one-cars	`CC0-1.0`; source text says DuckDuckGo scraping	Open-wheel/F1 images
CBSC	Hugging Face	https://huggingface.co/datasets/kitrofimov/cbsc	Dataset card states images from Bing, Baidu, DuckDuckGo, Google Images, and Reddit, with license compliance reviewed by dataset author	Direct body-style classes
Stanford Cars	Hugging Face	https://huggingface.co/datasets/tanganke/stanford_cars	Dataset card does not expose a clear license in fetched metadata	Stanford-derived car images and labels
Automobiles	Hugging Face	https://huggingface.co/datasets/adityab99/Automobiles	Dataset card says more information needed	Formula 1 / normal car supplemental images

Because some upstream sources report `other` or `unknown` license metadata and some were collected through web search or community platforms, this repository uses a restricted research/evaluation license rather than an open-source or commercial-use license.

Classes

The model emits probabilities in this fixed order:

Index	Label
0	`SUV`
1	`VAN`
2	`STATION_WAGON`
3	`MICRO`
4	`ACIK_TEKERLEKLI_F1_ARACLARI`
5	`SEDAN`
6	`HATCHBACK`
7	`PICK_UP`
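
The fixed order above can be captured as a pair of lookup tables so that downstream code never relies on a hand-typed index. This is a minimal sketch; the `DISPLAY_NAMES` mapping and the `display_name` helper are hypothetical conveniences for user interfaces and are not part of the checkpoint.

```python
LABELS = [
    "SUV",
    "VAN",
    "STATION_WAGON",
    "MICRO",
    "ACIK_TEKERLEKLI_F1_ARACLARI",
    "SEDAN",
    "HATCHBACK",
    "PICK_UP",
]

ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {label: index for index, label in ID2LABEL.items()}

# Hypothetical English display name for the one non-English label.
DISPLAY_NAMES = {"ACIK_TEKERLEKLI_F1_ARACLARI": "Open-wheel / F1 car"}


def display_name(index):
    """Return a readable name for a class index, defaulting to the raw label."""
    label = ID2LABEL[index]
    return DISPLAY_NAMES.get(label, label)
```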

Inference

Install dependencies:

pip install -r requirements.txt

Run single-image inference from a cloned repository:

from pathlib import Path

import timm
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import transforms

repo_dir = Path(".")
labels = [
    "SUV",
    "VAN",
    "STATION_WAGON",
    "MICRO",
    "ACIK_TEKERLEKLI_F1_ARACLARI",
    "SEDAN",
    "HATCHBACK",
    "PICK_UP",
]

model = timm.create_model("tf_efficientnetv2_s", pretrained=False, num_classes=len(labels))

# Prefer model.safetensors for public download workflows. The .pt checkpoint is
# retained for compatibility with the original project package.
try:
    from safetensors.torch import load_file

    state = load_file(repo_dir / "model.safetensors")
except ImportError:
    state = torch.load(repo_dir / "efficientnetv2_s_best.pt", map_location="cpu", weights_only=True)
if isinstance(state, dict) and "model_state_dict" in state:
    state = state["model_state_dict"]
model.load_state_dict(state)
model.eval()

preprocess = transforms.Compose(
    [
        transforms.Resize(224, interpolation=transforms.InterpolationMode.BICUBIC),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ]
)

image = Image.open("example.jpg").convert("RGB")
inputs = preprocess(image).unsqueeze(0)

# Divide the logits by the calibration temperature before the softmax.
temperature = 0.565484
with torch.no_grad():
    logits = model(inputs)
    probs = F.softmax(logits / temperature, dim=1).squeeze(0)

pred_idx = int(probs.argmax().item())
print({"label": labels[pred_idx], "confidence": float(probs[pred_idx])})
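
For classes that are easy to confuse, it can help to report more than the single best prediction. The sketch below ranks the top-k labels from a plain list of probabilities, for example one obtained with `probs.tolist()` from the script above. The helper name `top_k` and the example probabilities are illustrative assumptions.

```python
LABELS = [
    "SUV",
    "VAN",
    "STATION_WAGON",
    "MICRO",
    "ACIK_TEKERLEKLI_F1_ARACLARI",
    "SEDAN",
    "HATCHBACK",
    "PICK_UP",
]


def top_k(probs, labels=LABELS, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Illustrative probabilities, not real model output.
example_probs = [0.02, 0.01, 0.05, 0.01, 0.00, 0.70, 0.20, 0.01]
ranking = top_k(example_probs)
```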

Preprocessing

Validation, evaluation, and inference use the same deterministic preprocessing:

Step	Value
Color mode	RGB
Resize	Shorter side to 224 px, bicubic interpolation
Crop	224 x 224 center crop
Normalization mean	`[0.485, 0.456, 0.406]`
Normalization std	`[0.229, 0.224, 0.225]`
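
To make the geometry and arithmetic of this contract concrete, the sketch below computes, in plain Python, the size an image has after the shorter-side resize, the center-crop box that follows, and the normalized value of a single RGB pixel. It is meant to mirror what the torchvision transforms in the inference script do, but the function names are hypothetical and the exact rounding inside torchvision may vary between versions.

```python
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def resize_then_center_crop_box(width, height, size=224):
    """Return ((new_w, new_h), (left, top, right, bottom)) for the crop."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Scale so that the shorter side equals `size`, keeping the aspect ratio.
    if width <= height:
        new_w, new_h = size, int(size * height / width)
    else:
        new_w, new_h = int(size * width / height), size
    # Take a size x size window from the center of the resized image.
    left = int(round((new_w - size) / 2.0))
    top = int(round((new_h - size) / 2.0))
    return (new_w, new_h), (left, top, left + size, top + size)


def normalize_pixel(rgb):
    """Map 0..255 RGB values to normalized floats: (value / 255 - mean) / std."""
    return tuple((c / 255.0 - m) / s for c, m, s in zip(rgb, MEAN, STD))
```

For a 640 x 480 input, the resized image is 298 x 224 and the crop keeps columns 37 to 261, so content near the left and right edges is discarded.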

Training additionally applied augmentations that keep the vehicle recognizable: horizontal flip, mild random resized crop, rotation, and color jitter. Vertical flip was intentionally disabled because upside-down vehicles are not realistic inputs for this task.

Evaluation

The packaged metadata reports the following final calibrated model metrics:

Split	Accuracy	Macro F1	Weighted F1	NLL	Brier score	ECE (15 bins)
Validation, calibrated	0.970439	0.957292	0.970565	0.136917	0.049335	0.009176
Test, calibrated	0.966630	0.954043	0.966779	0.152494	0.055011	0.009874
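
For readers unfamiliar with the calibration columns, the sketch below shows one common way to compute NLL, the multi-class Brier score, and ECE with equal-width confidence bins from predicted probabilities. The exact definitions used to produce the numbers above are not stated in this card, so treat the bin-edge handling and the Brier convention (squared error summed over classes) as assumptions.

```python
import math


def calibration_metrics(probs, targets, n_bins=15):
    """Return mean NLL, multi-class Brier score, and ECE.

    probs: list of per-class probability lists.
    targets: list of true class indices.
    """
    if not probs or len(probs) != len(targets):
        raise ValueError("probs and targets must be non-empty and equally long")
    n = len(probs)
    nll = 0.0
    brier = 0.0
    # Per bin: [sample count, summed confidence, number of correct predictions].
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(probs, targets):
        nll -= math.log(max(p[y], 1e-12))  # clamp to avoid log(0)
        brier += sum((pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p))
        confidence = max(p)
        predicted = p.index(confidence)
        b = min(int(confidence * n_bins), n_bins - 1)  # equal-width bins on [0, 1]
        bins[b][0] += 1
        bins[b][1] += confidence
        bins[b][2] += int(predicted == y)
    # ECE: sample-weighted gap between accuracy and mean confidence per bin.
    ece = sum(
        abs(correct / count - conf_sum / count) * count / n
        for count, conf_sum, correct in bins
        if count
    )
    return {"nll": nll / n, "brier": brier / n, "ece": ece}
```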

Per-class validation F1 from training metadata:

Class	Validation F1
`SUV`	0.972405
`VAN`	0.988434
`STATION_WAGON`	0.960352
`MICRO`	0.887417
`ACIK_TEKERLEKLI_F1_ARACLARI`	0.996005
`SEDAN`	0.976542
`HATCHBACK`	0.927184
`PICK_UP`	0.950000
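
Macro F1 is the unweighted mean of the per-class F1 scores, so the table above can be checked against the reported validation macro F1 of 0.957292. The short sketch below does that arithmetic and also picks out the weakest class; the variable names are illustrative.

```python
VALIDATION_F1 = {
    "SUV": 0.972405,
    "VAN": 0.988434,
    "STATION_WAGON": 0.960352,
    "MICRO": 0.887417,
    "ACIK_TEKERLEKLI_F1_ARACLARI": 0.996005,
    "SEDAN": 0.976542,
    "HATCHBACK": 0.927184,
    "PICK_UP": 0.950000,
}

# Unweighted mean over classes: every class counts equally, regardless of size.
macro_f1 = sum(VALIDATION_F1.values()) / len(VALIDATION_F1)

# Class with the lowest validation F1.
weakest_class = min(VALIDATION_F1, key=VALIDATION_F1.get)
```

The mean agrees with the reported validation macro F1 to six decimal places. Weighted F1 cannot be reproduced the same way because it needs the per-class sample counts, which this card does not list.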

The test metrics in `metadata.json` and `calibration.json` are the source of truth for this packaged model. An additional project-level evaluation file may contain a newer or separate raw evaluation run; do not mix those values without verifying the exact checkpoint and split.

Training Details

Field	Value
Architecture	`tf_efficientnetv2_s`
Base weights	ImageNet pretrained
Fine-tuning	Full fine-tuning
Classes	8
Seed	20260506
Optimizer	AdamW
Learning rate	0.0003
Weight decay	0.0001
Batch size	32
Configured epochs	30
Best epoch	13
Label smoothing	0.05
Imbalance handling	Class weights
Precision	FP32
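
Two rows of this table benefit from a worked example. The sketch below shows the target distribution that label smoothing of 0.05 produces for 8 classes under the usual uniform-smoothing definition, and one common way to derive class weights from class counts. The card does not say how the class weights were actually computed, so the inverse-frequency formula and the function names here are assumptions.

```python
NUM_CLASSES = 8
LABEL_SMOOTHING = 0.05


def smoothed_target(true_index, num_classes=NUM_CLASSES, eps=LABEL_SMOOTHING):
    """Uniform label smoothing: eps / K on every class, plus 1 - eps on the true one."""
    base = eps / num_classes
    return [base + (1.0 - eps if k == true_index else 0.0) for k in range(num_classes)]


def inverse_frequency_weights(counts):
    """Assumed 'balanced' heuristic: n_samples / (n_classes * class_count)."""
    if not counts or min(counts) <= 0:
        raise ValueError("every class needs a positive count")
    total = sum(counts)
    return [total / (len(counts) * c) for c in counts]


# With eps = 0.05 and 8 classes, the true class gets 0.95625 and the rest 0.00625.
target = smoothed_target(3)
```

Rarer classes receive weights above 1 under this heuristic, which increases their contribution to the loss.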

Limitations and Biases

Performance depends on the training data distribution and may degrade on unseen regions, camera setups, image styles, vehicle modifications, unusual body kits, partial vehicles, or ambiguous classes. MICRO (validation F1 0.887) and HATCHBACK (validation F1 0.927) are the weakest classes in the reported metrics. The training dataset provenance and license must be reviewed before any public or commercial release.

Safety and Ethics

The model classifies vehicle body type only. It should not be used to infer protected attributes, identify people, or make high-stakes decisions. Downstream applications should disclose uncertainty and provide human review for ambiguous images.
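
One way a downstream application might act on this advice is to route low-confidence or near-tie predictions to a human reviewer. The sketch below is only an illustration: the function name and both thresholds are hypothetical and would need tuning on held-out data.

```python
def needs_review(probs, min_confidence=0.80, min_margin=0.20):
    """Flag a prediction when it is uncertain or nearly tied with the runner-up.

    Both thresholds are hypothetical defaults, not values from this project.
    """
    if not probs:
        raise ValueError("probs must not be empty")
    ranked = sorted(probs, reverse=True)
    top = ranked[0]
    runner_up = ranked[1] if len(ranked) > 1 else 0.0
    return top < min_confidence or (top - runner_up) < min_margin
```

A confident prediction such as `[0.95, 0.03, 0.02]` passes through, while `[0.55, 0.40, 0.05]` is flagged for review.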

License

This repository is released under the restricted research and evaluation license in LICENSE.md. It is not open source and is not approved for commercial reuse, redistribution, public hosted inference, or production deployment. This conservative license is used because the upstream training data license chain contains mixed, unknown, and web-scraped sources.

Citation

No formal paper citation is available for this project. If you use this model, please cite the project repository and this model card:

Project repository: https://github.com/furkan-uzmez/Yazlab6
Model card: https://huggingface.co/ryan12345441/car-body-classifier

Maintainers

Furkan Nezih Uzmez, Kocaeli University Computer Engineering
Yusuf Kerem Alcicek, Kocaeli University Computer Engineering
