---
library_name: pytorch
pipeline_tag: image-classification
tags:
- image-classification
- computer-vision
- vehicle-classification
- fine-grained-classification
- pytorch
- timm
license: mit
---
# TwinCar Classifier
TwinCar is a vehicle make, model, and auxiliary year recognition project developed for the Brainster Data Science Academy Machine Learning Final Project.
The final deployed model is an EfficientNet-B3 classifier fine-tuned on Stanford Cars. It predicts one of 196 fine-grained Stanford Cars classes, then derives vehicle make, model, and year from the predicted class metadata.
## Final deployed checkpoint
- **Checkpoint file:** `efficientnet_b3_stanford300_augv2_best.pt`
- **Architecture:** EfficientNet-B3
- **Input size:** 300 px
- **Classes:** 196 Stanford Cars fine-grained classes
- **Training data:** Stanford Cars training split
- **Training augmentation:** augmentation v2
- **Framework:** PyTorch + timm
- **Checkpoint manifest:** `checkpoint_manifest.json`
- **Current status:** final deployed candidate
The model repo also keeps older checkpoints for comparison and rollback:
- `efficientnet_b3_stanford300_best.pt`: previous EfficientNet-B3 checkpoint
- `best.pt`: older ResNet18 baseline checkpoint
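As a rough illustration of how one of these checkpoints might be loaded outside the demo, the sketch below makes two assumptions that this card does not confirm: that the `.pt` file holds either a bare `state_dict` or a dict wrapping one under a key such as `model_state_dict` or `state_dict`, and that the matching timm architecture name is `efficientnet_b3`. `checkpoint_manifest.json` is the place to verify both. The helper names `unwrap_state_dict` and `load_twincar` are hypothetical.

```python
from typing import Any, Mapping

REPO_ID = "twincar-group2/twincar-classifier"
FINAL_CHECKPOINT = "efficientnet_b3_stanford300_augv2_best.pt"
# Assumed wrapper keys; the real checkpoint layout is not documented in this card.
WRAPPER_KEYS = ("model_state_dict", "state_dict", "model")


def unwrap_state_dict(checkpoint: Mapping[str, Any]) -> Mapping[str, Any]:
    """Return the weights mapping whether or not the checkpoint wraps it."""
    for key in WRAPPER_KEYS:
        inner = checkpoint.get(key)
        if isinstance(inner, Mapping):
            return inner
    return checkpoint  # assume it is already a bare state_dict


def load_twincar(filename: str = FINAL_CHECKPOINT, arch: str = "efficientnet_b3"):
    """Download a checkpoint from the Hub and load it into a timm model."""
    # Heavy dependencies are imported lazily so the helper above stays importable.
    import timm
    import torch
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(repo_id=REPO_ID, filename=filename)
    checkpoint = torch.load(path, map_location="cpu")
    model = timm.create_model(arch, pretrained=False, num_classes=196)
    model.load_state_dict(unwrap_state_dict(checkpoint))
    return model.eval()
```

Loading the older `best.pt` baseline this way would need a different `arch` value (for example `resnet18`, if that checkpoint was also built with timm).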
## What the model predicts
The model directly predicts a fine-grained class, for example:
```text
Dodge Charger SRT-8 2009
```
From that fine-grained prediction, the system derives:
- **make**: e.g. `Dodge`
- **model**: e.g. `Charger SRT-8`
- **year**: e.g. `2009`
Year is included as an auxiliary output, but it is not predicted by a separate year-regression or year-classification head. It is derived from the predicted fine-grained class metadata.
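The derivation can be pictured with a small string-splitting sketch. This is only an approximation: the project derives the fields from class metadata, whereas the code below guesses them from the class name, and `MULTI_WORD_MAKES` and `split_class_name` are hypothetical names holding an illustrative, incomplete list of makes.

```python
from typing import NamedTuple

# Illustrative subset of multi-word makes; a real system would read makes
# from verified class metadata instead of guessing from the string.
MULTI_WORD_MAKES = ("Aston Martin", "Land Rover", "AM General")


class VehicleLabel(NamedTuple):
    make: str
    model: str
    year: int


def split_class_name(class_name: str) -> VehicleLabel:
    """Split 'Make Model ... Year' into its three parts."""
    body, _, year = class_name.strip().rpartition(" ")
    if not body or not year.isdigit():
        raise ValueError(f"unexpected class name: {class_name!r}")
    for make in MULTI_WORD_MAKES:
        if body.startswith(make + " "):
            return VehicleLabel(make, body[len(make) + 1:], int(year))
    make, _, model = body.partition(" ")
    return VehicleLabel(make, model, int(year))


label = split_class_name("Dodge Charger SRT-8 2009")
print(label)  # VehicleLabel(make='Dodge', model='Charger SRT-8', year=2009)
```

Because every field comes from one predicted class, a wrong fine-grained prediction can still yield a correct make or year when the confused classes share that attribute.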
## Validation results
Final quantitative comparison is reported on the locked Stanford validation split using the same evaluation protocol for all compared models.
| Transform | Fine acc | Make acc | Model acc | Year acc | Top-3 acc | Top-5 acc |
| ---------------- | -------: | -------: | --------: | -------: | --------: | --------: |
| clean | 0.7864 | 0.8692 | 0.7925 | 0.8913 | 0.9196 | 0.9521 |
| robust_light | 0.7882 | 0.8680 | 0.7944 | 0.8956 | 0.9159 | 0.9490 |
| robust_hard | 0.6839 | 0.7778 | 0.6900 | 0.8355 | 0.8600 | 0.9055 |
| robust_occlusion | 0.6317 | 0.7317 | 0.6366 | 0.8048 | 0.8060 | 0.8600 |
Compared with the earlier EfficientNet-B3 candidate (no augmentation v2), this model improves clean fine accuracy by 1.6 points (0.770 → 0.786) and improves robustness substantially: robust_hard by about 14 points (0.543 → 0.684) and robust_occlusion by about 13 points (0.502 → 0.632). The full comparison is in the GitHub experiment report.
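To make the table columns concrete, here is a minimal sketch of how such metrics can be computed. It assumes the make, model, and year columns are scored by mapping both the predicted and the true class to that attribute, which would explain why they sit at or above fine accuracy. The function names and the three-class toy data are hypothetical.

```python
from typing import Callable, Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]], targets: Sequence[int], k: int) -> float:
    """Fraction of samples whose true class is among the k highest scores."""
    hits = 0
    for row, target in zip(scores, targets):
        ranked = sorted(range(len(row)), key=row.__getitem__, reverse=True)
        hits += target in ranked[:k]
    return hits / len(targets)


def derived_accuracy(
    preds: Sequence[int], targets: Sequence[int], attribute: Callable[[int], object]
) -> float:
    """Accuracy after mapping class ids to an attribute such as make or year."""
    hits = sum(attribute(p) == attribute(t) for p, t in zip(preds, targets))
    return hits / len(targets)


# Toy example with three hypothetical classes: 0 and 1 share a make.
class_to_make = {0: "Dodge", 1: "Dodge", 2: "Audi"}
scores = [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.2, 0.1, 0.7]]
targets = [0, 1, 2]
preds = [max(range(len(row)), key=row.__getitem__) for row in scores]

fine_acc = top_k_accuracy(scores, targets, k=1)  # 2 of 3 correct
make_acc = derived_accuracy(preds, targets, class_to_make.__getitem__)  # 3 of 3
top2_acc = top_k_accuracy(scores, targets, k=2)  # 3 of 3
```

In the toy data the second sample is misclassified at the fine level but still gets the right make, which mirrors the gap between the Fine acc and Make acc columns.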
## Robustness evaluation
The final model was evaluated under multiple image transforms:
- **clean**: standard validation preprocessing
- **robust_light**: mild production-like perturbations
- **robust_hard**: stronger blur, lighting, color, and geometric perturbations
- **robust_occlusion**: synthetic occlusion/erasing stress test
These tests are not a replacement for real-world field validation, but they quantify how the model behaves under controlled distribution shifts.
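The four conditions could be expressed as torchvision pipelines along the following lines. Every numeric strength in `PRESETS`, the ImageNet normalization statistics, and the `build_transform` helper are assumptions for illustration; the project's actual settings are not given in this card.

```python
# Hypothetical perturbation strengths; the project's actual settings are not
# given in this card.
IMAGE_SIZE = 300
PRESETS = {
    "clean": {},
    "robust_light": {"jitter": 0.1, "blur_sigma": 0.5},
    "robust_hard": {"jitter": 0.4, "blur_sigma": 2.0, "rotate_deg": 15},
    "robust_occlusion": {"erase_scale": (0.1, 0.3)},
}


def build_transform(name: str):
    """Build a torchvision pipeline for one evaluation condition."""
    from torchvision import transforms as T  # imported lazily: heavy dependency

    cfg = PRESETS[name]
    steps = [T.Resize((IMAGE_SIZE, IMAGE_SIZE))]
    if "jitter" in cfg:
        j = cfg["jitter"]
        steps.append(T.ColorJitter(brightness=j, contrast=j, saturation=j))
    if "blur_sigma" in cfg:
        steps.append(T.GaussianBlur(kernel_size=9, sigma=cfg["blur_sigma"]))
    if "rotate_deg" in cfg:
        steps.append(T.RandomAffine(degrees=cfg["rotate_deg"], translate=(0.05, 0.05)))
    # Assumed ImageNet statistics, the usual choice for timm EfficientNet weights.
    steps += [T.ToTensor(), T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]
    if "erase_scale" in cfg:
        # RandomErasing operates on tensors, so it comes after ToTensor.
        steps.append(T.RandomErasing(p=1.0, scale=cfg["erase_scale"]))
    return T.Compose(steps)
```

Since these perturbations are random, fixing the seed before evaluation keeps the comparison between models repeatable.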
## CompCars status
CompCars was inspected and used for external validation and reconnaissance, but it was not blindly merged into final training.
The main reason is that Stanford Cars and CompCars have a significant domain and label-distribution gap:
- Stanford Cars is a clean fine-grained benchmark with 196 make/model/year classes.
- CompCars contains different image domains and different make/model taxonomies.
- Exact Stanford Cars to CompCars make/model/year overlap was too small and biased for safe blind merging.
- Make-level external validation on CompCars dropped to ~28% accuracy (vs ~87% make accuracy in-domain), confirming a large cross-domain shift.
This confirmed that CompCars integration is a domain adaptation problem, not a simple data-merge task.
Future work should build a verified Stanford Cars to CompCars alias map, train on a controlled filtered subset, and validate on a true cross-domain holdout.
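A sketch of how an alias map could feed make-level external validation is shown below. The alias entries, the sample pairs, and the function name `make_accuracy_via_alias` are all hypothetical; no verified alias map exists in the project yet.

```python
from typing import Iterable, Mapping, Optional, Tuple


def make_accuracy_via_alias(
    pairs: Iterable[Tuple[str, str]],
    alias: Mapping[str, str],
) -> Tuple[Optional[float], int]:
    """Score (predicted Stanford make, true CompCars make) pairs.

    Samples whose CompCars make has no verified alias are skipped and counted,
    so taxonomy gaps are not silently scored as errors.
    """
    hits = scored = skipped = 0
    for predicted_make, compcars_make in pairs:
        stanford_make = alias.get(compcars_make)
        if stanford_make is None:
            skipped += 1
            continue
        scored += 1
        hits += predicted_make == stanford_make
    return (hits / scored if scored else None), skipped


# Hypothetical alias entries (CompCars spelling -> Stanford Cars spelling).
alias = {"Benz": "Mercedes-Benz", "BMW": "BMW"}
pairs = [("Mercedes-Benz", "Benz"), ("Audi", "BMW"), ("BMW", "Zotye")]
accuracy, skipped = make_accuracy_via_alias(pairs, alias)
print(accuracy, skipped)  # 0.5 1
```

Reporting the skipped count alongside the accuracy shows how much of the external set the alias map actually covers.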
## Held-out Stanford test status
The local Stanford Cars test images were available and were used for qualitative API/demo smoke testing.
However, the available `cars_test_annos.mat` file contained only bounding boxes and filenames:
```text
bbox_x1
bbox_y1
bbox_x2
bbox_y2
fname
```
It did not include class labels.
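A check of this kind can be reproduced with `scipy.io.loadmat`. The sketch assumes the struct is stored under the key `annotations` and that a labeled file would carry a `class` field, as in the training annotations; both helper names are hypothetical.

```python
from typing import Iterable, Tuple


def has_class_labels(field_names: Iterable[str]) -> bool:
    """True if an annotation struct carries a `class` field."""
    return "class" in set(field_names)


def annotation_fields(mat_path: str) -> Tuple[str, ...]:
    """List the struct fields stored under `annotations` in a .mat file."""
    from scipy.io import loadmat  # imported lazily: optional dependency

    annotations = loadmat(mat_path)["annotations"]
    return tuple(annotations.dtype.names)


# Fields found in the available cars_test_annos.mat, as listed above.
test_fields = ("bbox_x1", "bbox_y1", "bbox_x2", "bbox_y2", "fname")
print(has_class_labels(test_fields))  # False
```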
The provided Kaggle mirror, `eduardo4jesus/stanford-cars-dataset`, was also checked. It included:
```text
cars_meta.mat
cars_train_annos.mat
cars_test_annos.mat
```
but did not provide:
```text
cars_test_annos_withlabels.mat
cars_annos.mat
```
Final quantitative model comparison is therefore reported on the locked Stanford validation split using an identical protocol and seed for every compared model. This keeps model-to-model deltas valid. A labeled held-out Stanford test evaluation would be a straightforward extension if `cars_test_annos_withlabels.mat` is obtained.
## Demo
The model is deployed in a Hugging Face Space:
- **Demo Space:** [https://huggingface.co/spaces/twincar-group2/twincar-demo](https://huggingface.co/spaces/twincar-group2/twincar-demo)
The Space supports:
- image upload
- webcam/clipboard input
- full-image prediction
- make/model/year display
- top-k predictions
- optional YOLO crop comparison mode
The YOLO cropper is experimental and default-off. The official prediction remains the full-image EfficientNet-B3 prediction.
## API and code
Project repository:
- **GitHub:** [https://github.com/Hristijan-kiko/twincar](https://github.com/Hristijan-kiko/twincar)
The GitHub repo includes:
- reusable Python package
- FastAPI inference endpoint
- Gradio demo app
- batch prediction script
- training and evaluation scripts
- robust evaluation reports
- CI with linting and tests
## W&B
Training and experiment tracking:
- **W&B project:** [https://wandb.ai/hzlatevskii-brainster-data-science-academy/twincar](https://wandb.ai/hzlatevskii-brainster-data-science-academy/twincar)
## Intended use
This model is intended for educational and prototype-level vehicle recognition experiments, especially make/model classification from car images similar to Stanford Cars.
Appropriate uses:
- fine-grained vehicle make/model recognition demo
- model comparison and robustness analysis
- prototype vehicle inspection workflow
- academic/academy project demonstration
Data note: weights are trained on the Stanford Cars dataset (research/educational use); the MIT license covers the project code.
Use of the weights should respect the Stanford Cars dataset terms.
## Limitations
- The final model is trained on Stanford Cars, not on real drone/robot production footage.
- CompCars showed a strong domain gap and was not blindly merged into final training.
- True top-down drone views remain out-of-distribution.
- Robustness tests use controlled synthetic perturbations, not full real-world field validation.
- Year is derived from the fine-grained class label metadata, not learned as an independent year model.
- The system assumes the uploaded image contains a vehicle.
- Strong non-car/out-of-distribution rejection is not implemented yet.
- Similar models and years can be confused because fine-grained vehicle classification often depends on subtle visual details.
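For the missing rejection step, one common baseline is to threshold the maximum softmax probability. The sketch below shows that baseline only; it is not something the project implements, the `predict_or_reject` name and the 0.5 threshold are hypothetical, and a confident softmax score does not guarantee that the input is a car.

```python
import math
from typing import List, Optional, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_reject(
    logits: Sequence[float], threshold: float = 0.5
) -> Tuple[Optional[int], float]:
    """Return (class index, confidence), or (None, confidence) below threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    return (best if confidence >= threshold else None), confidence
```

A threshold like this would need tuning on held-out non-car images before it could be trusted in the demo.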
## Future work
- Build a verified Stanford Cars to CompCars alias map.
- Fine-tune on a controlled CompCars surveillance subset.
- Add real production/drone/robot images for target-domain validation.
- Add non-car/out-of-distribution rejection.
- Add detector-assisted preprocessing as a validated default only if it improves real metrics.
- Explore multi-head prediction for make/model/year if independent outputs become necessary.