EfficientNetV2-S Car Body Classifier
This repository contains a PyTorch/timm EfficientNetV2-S image classifier for 8 vehicle body types. The model predicts one of SUV, VAN, STATION_WAGON, MICRO, ACIK_TEKERLEKLI_F1_ARACLARI (Turkish for open-wheel F1 vehicles), SEDAN, HATCHBACK, or PICK_UP from a single RGB vehicle image.
The checkpoint is a fine-tuned tf_efficientnetv2_s model with an 8-class classification head. To obtain calibrated probabilities, divide the logits by the temperature 0.565484 before applying softmax.
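As a minimal sketch of what temperature calibration does, the snippet below applies a temperature-scaled softmax in plain Python. The logits are hypothetical values chosen for illustration; only the temperature comes from this model card.

```python
import math

TEMPERATURE = 0.565484  # value stated in this model card


def calibrated_softmax(logits, temperature=TEMPERATURE):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for the 8 classes, for illustration only.
example_logits = [2.0, 0.5, 0.1, -1.0, -2.0, 1.0, 0.3, -0.5]
raw = calibrated_softmax(example_logits, temperature=1.0)
calibrated = calibrated_softmax(example_logits)
print(max(raw), max(calibrated))
```

Because the temperature is below 1, scaling sharpens the distribution: the predicted class stays the same, but its probability increases relative to the unscaled softmax.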
Files
| File | Purpose |
|---|---|
| `efficientnetv2_s_best.pt` | PyTorch state dict checkpoint. |
| `model.safetensors` | Safetensors copy of the same state dict for safer public download workflows. |
| `config.json` | Repository-level model/task configuration. |
| `metadata.json` | Training, class, artifact, and metric metadata. |
| `preprocessor_config.json` | Image size, normalization, and preprocessing contract. |
| `calibration.json` | Temperature scaling configuration and raw/calibrated metrics. |
| `requirements.txt` | Minimal runtime dependencies. |
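Rather than hard-coding the temperature, it could be read from calibration.json. The sketch below is one way to do that; the key name `temperature` and the helper `load_temperature` are assumptions made for illustration, so inspect the file to confirm its actual schema before relying on this.

```python
import json
from pathlib import Path

CARD_TEMPERATURE = 0.565484  # value stated in this model card


def load_temperature(path, key="temperature", default=CARD_TEMPERATURE):
    """Return the temperature stored under `key`, or `default` if unavailable.

    The key name is an assumption; check calibration.json for the real schema.
    """
    path = Path(path)
    if not path.exists():
        return default
    data = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        return default
    return float(data.get(key, default))


temperature = load_temperature("calibration.json")
print(temperature)
```

Falling back to the documented value keeps inference working when the file is missing, at the cost of silently ignoring a schema mismatch; a stricter variant could raise instead.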
Intended Use
Use this model for educational or prototype vehicle body-type classification from visible car images. It is intended for controlled project demos, non-commercial research evaluation, model-serving experiments, and API integration tests where inputs are ordinary car images.
Out-of-Scope Use
Do not use this model for commercial products, public hosted inference services, redistribution, or as the sole basis for safety-critical, insurance, legal, financial, or law-enforcement decisions. Do not assume robust performance on non-car vehicles, heavily occluded vehicles, unusual viewpoints, drawings, low-quality CCTV images, or domains that differ strongly from the training data.
Dataset Provenance
The training data was assembled from multiple public Kaggle and Hugging Face image collections plus manual mapping and QA. The project used source diversity to reduce overfitting to one capture style, but the upstream licensing chain is mixed and must be treated conservatively.
| Source | Platform | URL | Upstream license metadata observed | Use in project |
|---|---|---|---|---|
| Cars Body Type Cropped | Kaggle | https://www.kaggle.com/datasets/ademboukhris/cars-body-type-cropped | CC0-1.0; source text says Stanford dataset, DuckDuckGo, and Bing | Direct body-style folders for SUV, VAN, SEDAN, HATCHBACK, PICK UP |
| Stanford Car Body Type Data | Kaggle | https://www.kaggle.com/datasets/mayurmahurkar/stanford-car-body-type-data | other; derived from Stanford Cars train subset | Stanford-derived body-style folders |
| Stanford Car Dataset by Classes Folder | Kaggle | https://www.kaggle.com/datasets/jutrera/stanford-car-dataset-by-classes-folder | other; Stanford Cars repackaging | Fine-grained Stanford Cars folders used for MICRO/VAN mappings |
| Vehicle Images Dataset | Kaggle | https://www.kaggle.com/datasets/lyensoetanto/vehicle-images-dataset | unknown | Supplemental SUV, VAN, SEDAN, PICK UP, MICRO candidates; weak labels excluded by default |
| Formula One Cars | Kaggle | https://www.kaggle.com/datasets/vesuvius13/formula-one-cars | CC0-1.0; source text says DuckDuckGo scraping | Open-wheel/F1 images |
| CBSC | Hugging Face | https://huggingface.co/datasets/kitrofimov/cbsc | Dataset card states images from Bing, Baidu, DuckDuckGo, Google Images, and Reddit, with license compliance reviewed by dataset author | Direct body-style classes |
| Stanford Cars | Hugging Face | https://huggingface.co/datasets/tanganke/stanford_cars | Dataset card does not expose a clear license in fetched metadata | Stanford-derived car images and labels |
| Automobiles | Hugging Face | https://huggingface.co/datasets/adityab99/Automobiles | Dataset card says more information needed | Formula 1 / normal car supplemental images |
Because some upstream sources have other or unknown license metadata and some were collected through web search or community platforms, this repository uses a restricted research/evaluation license rather than an open-source or commercial-use license.
Classes
The model emits probabilities in this fixed order:
| Index | Label |
|---|---|
| 0 | SUV |
| 1 | VAN |
| 2 | STATION_WAGON |
| 3 | MICRO |
| 4 | ACIK_TEKERLEKLI_F1_ARACLARI |
| 5 | SEDAN |
| 6 | HATCHBACK |
| 7 | PICK_UP |
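Since the output order is fixed, a probability vector can be mapped back to labels by index. The following sketch ranks the classes and returns the top candidates; the helper `top_k` and the example probabilities are hypothetical, not part of the repository.

```python
# Label order copied from the table above; the index is the model output position.
LABELS = [
    "SUV",
    "VAN",
    "STATION_WAGON",
    "MICRO",
    "ACIK_TEKERLEKLI_F1_ARACLARI",
    "SEDAN",
    "HATCHBACK",
    "PICK_UP",
]


def top_k(probs, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    if len(probs) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} probabilities, got {len(probs)}")
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(LABELS[i], probs[i]) for i in ranked[:k]]


# Hypothetical probabilities for illustration only.
example_probs = [0.05, 0.02, 0.10, 0.01, 0.00, 0.60, 0.20, 0.02]
print(top_k(example_probs, k=3))
```

Reporting the runner-up classes alongside the top prediction is a simple way to expose uncertainty between neighboring body types such as SEDAN and HATCHBACK.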
Inference
Install dependencies:
pip install -r requirements.txt
Run single-image inference from a cloned repository:
from pathlib import Path
import timm
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import transforms
repo_dir = Path(".")
labels = [
    "SUV",
    "VAN",
    "STATION_WAGON",
    "MICRO",
    "ACIK_TEKERLEKLI_F1_ARACLARI",
    "SEDAN",
    "HATCHBACK",
    "PICK_UP",
]
model = timm.create_model("tf_efficientnetv2_s", pretrained=False, num_classes=len(labels))
# Prefer model.safetensors for public download workflows. The .pt checkpoint is
# retained for compatibility with the original project package.
try:
    from safetensors.torch import load_file
    state = load_file(repo_dir / "model.safetensors")
except ImportError:
    state = torch.load(repo_dir / "efficientnetv2_s_best.pt", map_location="cpu", weights_only=True)
if isinstance(state, dict) and "model_state_dict" in state:
    state = state["model_state_dict"]
model.load_state_dict(state)
model.eval()
preprocess = transforms.Compose(
    [
        transforms.Resize(224, interpolation=transforms.InterpolationMode.BICUBIC),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ]
)
image = Image.open("example.jpg").convert("RGB")
inputs = preprocess(image).unsqueeze(0)
with torch.no_grad():
    logits = model(inputs)
probs = F.softmax(logits / 0.565484, dim=1).squeeze(0)
pred_idx = int(probs.argmax().item())
print({"label": labels[pred_idx], "confidence": float(probs[pred_idx])})
Preprocessing
Validation, evaluation, and inference use the same deterministic preprocessing:
| Step | Value |
|---|---|
| Color mode | RGB |
| Resize | 224 px, bicubic |
| Crop | 224 x 224 center crop |
| Normalization mean | [0.485, 0.456, 0.406] |
| Normalization std | [0.229, 0.224, 0.225] |
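To make the table concrete, the sketch below works through the geometry and arithmetic of these steps in plain Python, without any image library. It follows the usual shorter-side resize and center-crop convention used by torchvision; the helper names and the 1280 x 720 example are assumptions for illustration.

```python
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]
SIZE = 224


def resized_shape(width, height, size=SIZE):
    """Scale so the shorter side equals `size`, keeping the aspect ratio."""
    if width <= height:
        return size, int(size * height / width)
    return int(size * width / height), size


def center_crop_box(width, height, size=SIZE):
    """Return (left, top, right, bottom) of the centered size x size crop."""
    left = int(round((width - size) / 2.0))
    top = int(round((height - size) / 2.0))
    return left, top, left + size, top + size


def normalize_pixel(rgb):
    """Map an 8-bit RGB pixel to the normalized values fed to the model."""
    return [((c / 255.0) - m) / s for c, m, s in zip(rgb, MEAN, STD)]


# A hypothetical 1280 x 720 photo: the height becomes 224, then the width is cropped.
new_w, new_h = resized_shape(1280, 720)
print((new_w, new_h), center_crop_box(new_w, new_h))
print(normalize_pixel([255, 255, 255]))
```

For wide images, the center crop discards the left and right margins, so vehicles far from the center of a landscape photo may be partially cut off.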
Training used additional safe augmentations such as horizontal flip, mild random resized crop, rotation, and color jitter. Vertical flip was intentionally disabled because flipping a vehicle upside down is not a label-preserving transformation.
Evaluation
The packaged metadata reports the following final calibrated model metrics:
| Split | Accuracy | Macro F1 | Weighted F1 | NLL | Brier | ECE 15 bins |
|---|---|---|---|---|---|---|
| Validation, calibrated | 0.970439 | 0.957292 | 0.970565 | 0.136917 | 0.049335 | 0.009176 |
| Test, calibrated | 0.966630 | 0.954043 | 0.966779 | 0.152494 | 0.055011 | 0.009874 |
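For readers unfamiliar with the calibration columns, here is a minimal sketch of NLL, multi-class Brier score, and ECE with equal-width confidence bins, using their conventional definitions. The project's exact implementation (for example its bin-edge handling or Brier normalization) is not shown in this card, so treat the details as assumptions; the toy data is hypothetical.

```python
import math


def nll(probs, labels):
    """Mean negative log-likelihood of the true class."""
    return -sum(math.log(max(p[y], 1e-12)) for p, y in zip(probs, labels)) / len(labels)


def brier(probs, labels):
    """Mean over samples of the squared error summed across classes."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((p_k - (1.0 if k == y else 0.0)) ** 2 for k, p_k in enumerate(p))
    return total / len(labels)


def ece(probs, labels, n_bins=15):
    """Expected calibration error over equal-width confidence bins."""
    bins = [[0, 0.0, 0.0] for _ in range(n_bins)]  # count, sum of confidence, sum of correct
    for p, y in zip(probs, labels):
        conf = max(p)
        pred = p.index(conf)
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx][0] += 1
        bins[idx][1] += conf
        bins[idx][2] += 1.0 if pred == y else 0.0
    n = len(labels)
    # (count / n) * |accuracy - mean confidence| simplifies to |correct - confidence| / n
    return sum(abs(correct - conf_sum) / n for count, conf_sum, correct in bins if count)


# Toy two-class data for illustration only.
toy_probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
toy_labels = [0, 1, 1]
print(nll(toy_probs, toy_labels), brier(toy_probs, toy_labels), ece(toy_probs, toy_labels))
```

Lower is better for all three metrics; ECE in particular measures how closely the reported confidence matches the observed accuracy.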
Per-class validation F1 from training metadata:
| Class | Validation F1 |
|---|---|
| SUV | 0.972405 |
| VAN | 0.988434 |
| STATION_WAGON | 0.960352 |
| MICRO | 0.887417 |
| ACIK_TEKERLEKLI_F1_ARACLARI | 0.996005 |
| SEDAN | 0.976542 |
| HATCHBACK | 0.927184 |
| PICK_UP | 0.950000 |
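Macro F1 is the unweighted mean of the per-class F1 scores, so the validation macro F1 reported above can be cross-checked directly from this table. The short sketch below does that arithmetic and also lists the two lowest-scoring classes.

```python
# Per-class validation F1 values copied from the table above.
per_class_f1 = {
    "SUV": 0.972405,
    "VAN": 0.988434,
    "STATION_WAGON": 0.960352,
    "MICRO": 0.887417,
    "ACIK_TEKERLEKLI_F1_ARACLARI": 0.996005,
    "SEDAN": 0.976542,
    "HATCHBACK": 0.927184,
    "PICK_UP": 0.950000,
}

# Macro F1: every class counts equally, regardless of how many images it has.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)
print(round(macro_f1, 6))  # 0.957292, matching the validation row

# The two hardest classes by validation F1.
hardest = sorted(per_class_f1, key=per_class_f1.get)[:2]
print(hardest)  # ['MICRO', 'HATCHBACK']
```

Weighted F1 cannot be reproduced the same way from this table alone because it also needs the per-class sample counts.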
The test metrics in metadata.json and calibration.json are the source of truth for this packaged model. An additional project-level evaluation file may contain a newer or separate raw evaluation run; do not mix those values without verifying the exact checkpoint and split.
Training Details
| Field | Value |
|---|---|
| Architecture | tf_efficientnetv2_s |
| Base weights | ImageNet pretrained |
| Fine-tuning | Full fine-tuning |
| Classes | 8 |
| Seed | 20260506 |
| Optimizer | AdamW |
| Learning rate | 0.0003 |
| Weight decay | 0.0001 |
| Batch size | 32 |
| Configured epochs | 30 |
| Best epoch | 13 |
| Label smoothing | 0.05 |
| Imbalance handling | Class weights |
| Precision | FP32 |
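As a worked example of the label smoothing value in this table, the sketch below builds the smoothed target distribution for one class and evaluates the cross-entropy against it. It assumes the convention used by PyTorch's `CrossEntropyLoss` (the smoothing mass is spread uniformly over all classes, including the true one); the project's actual loss code and its class-weight formula are not shown in this card.

```python
import math

NUM_CLASSES = 8
SMOOTHING = 0.05  # label smoothing value from the table above


def smoothed_target(true_index, num_classes=NUM_CLASSES, smoothing=SMOOTHING):
    """Soft target: (1 - smoothing) on the true class plus smoothing / K everywhere."""
    off_value = smoothing / num_classes
    target = [off_value] * num_classes
    target[true_index] = 1.0 - smoothing + off_value
    return target


def cross_entropy(probs, target):
    """Cross-entropy between predicted probabilities and a soft target."""
    return -sum(t * math.log(max(p, 1e-12)) for p, t in zip(probs, target))


target = smoothed_target(true_index=5)  # index 5 is SEDAN
print(target)

uniform = [1.0 / NUM_CLASSES] * NUM_CLASSES
print(cross_entropy(uniform, target))
```

With 8 classes and smoothing 0.05, the true class receives 0.95625 and each other class 0.00625, which discourages the network from producing extremely confident logits.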
Limitations and Biases
Performance depends on the training data distribution and may degrade on unseen regions, camera setups, image styles, vehicle modifications, unusual body kits, partial vehicles, or ambiguous classes. MICRO and HATCHBACK are relatively harder classes in the reported metrics. The training dataset provenance and license must be reviewed before public or commercial release.
Safety and Ethics
The model classifies vehicle body type only. It should not be used to infer protected attributes, identify people, or make high-stakes decisions. Downstream applications should disclose uncertainty and provide human review for ambiguous images.
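One simple way to route ambiguous images to human review is to flag predictions whose top probability or margin over the runner-up is low. The sketch below illustrates the idea; the function name and both thresholds are hypothetical and would need tuning on held-out data.

```python
def needs_review(probs, min_confidence=0.80, min_margin=0.20):
    """Flag a prediction when it is not confident or not clearly separated.

    Both thresholds are illustrative assumptions, not values from this project.
    """
    ranked = sorted(probs, reverse=True)
    top, runner_up = ranked[0], ranked[1]
    return top < min_confidence or (top - runner_up) < min_margin


# Hypothetical calibrated probabilities for two images.
ambiguous = [0.05, 0.02, 0.10, 0.01, 0.00, 0.60, 0.20, 0.02]
confident = [0.01, 0.00, 0.01, 0.00, 0.00, 0.95, 0.02, 0.01]
print(needs_review(ambiguous), needs_review(confident))
```

Thresholds like these are only meaningful on calibrated probabilities, so the temperature scaling described earlier should be applied first.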
License
This repository is released under the restricted research and evaluation license in LICENSE.md. It is not open source and is not approved for commercial reuse, redistribution, public hosted inference, or production deployment. This conservative license is used because the upstream training data license chain contains mixed, unknown, and web-scraped sources.
Citation
No formal paper citation is available for this project. If you use this model, please cite the project repository and this model card:
- Project repository: https://github.com/furkan-uzmez/Yazlab6
- Model card: https://huggingface.co/ryan12345441/car-body-classifier
Maintainers
- Furkan Nezih Uzmez, Kocaeli University Computer Engineering
- Yusuf Kerem Alcicek, Kocaeli University Computer Engineering
Model tree for ryan12345441/car-body-classifier
Base model: timm/tf_efficientnetv2_s.in21k_ft_in1k
Evaluation results
- Test Accuracy on the custom car body image dataset (test set, self-reported): 0.967
- Test Macro F1 on the custom car body image dataset (test set, self-reported): 0.954
- Test Weighted F1 on the custom car body image dataset (test set, self-reported): 0.967