twincar-group2/twincar-classifier

@@ -1,9 +1,11 @@
 ---
 library_name: pytorch
 tags:
 - image-classification
 - computer-vision
 - vehicle-classification
 - pytorch
 - timm
 license: mit
@@ -11,32 +13,181 @@ license: mit
 # TwinCar Classifier
-TwinCar is a vehicle make and model recognition project developed for the Brainster Data Science Academy Machine Learning Final Project.
-## Task
-The model predicts:
-- vehicle make
-- vehicle model
-- optionally production year
-## Current status
-This repository is a placeholder for the final trained model and model card.
-Final contents will include:
-- model architecture
-- training dataset summary
-- validation metrics
-- robustness evaluation notes
-- usage instructions
-- limitations
-- links to GitHub, W&B, and demo Space
-## Links
-- GitHub repo: https://github.com/Hristijan-kiko/twincar
-- W&B project: https://wandb.ai/hzlatevskii-brainster-data-science-academy/twincar
-- Demo Space: https://huggingface.co/spaces/twincar-group2/twincar-demo

 ---
 library_name: pytorch
+pipeline_tag: image-classification
 tags:
 - image-classification
 - computer-vision
 - vehicle-classification
+- fine-grained-classification
 - pytorch
 - timm
 license: mit
 # TwinCar Classifier
+TwinCar is a vehicle make and model recognition project, with production year as an auxiliary output, developed for the Brainster Data Science Academy Machine Learning Final Project.
+The final deployed model is an EfficientNet-B3 classifier fine-tuned on Stanford Cars. It predicts one of 196 fine-grained Stanford Cars classes, then derives vehicle make, model, and year from the predicted class metadata.
+## Final deployed checkpoint
+- **Checkpoint file:** `efficientnet_b3_stanford300_augv2_best.pt`
+- **Architecture:** EfficientNet-B3
+- **Input size:** 300 px
+- **Classes:** 196 Stanford Cars fine-grained classes
+- **Training data:** Stanford Cars training split
+- **Training augmentation:** augmentation v2
+- **Framework:** PyTorch + timm
+- **Checkpoint manifest:** `checkpoint_manifest.json`
+- **Current status:** final deployed candidate
+The model repo also keeps older checkpoints for comparison and rollback:
+- `efficientnet_b3_stanford300_best.pt` — previous EfficientNet-B3 checkpoint
+- `best.pt` — older ResNet18 baseline checkpoint
+## What the model predicts
+The model directly predicts a fine-grained class, for example:
+```text
+Dodge Charger SRT-8 2009
+```
+From that fine-grained prediction, the system derives:
+- **make** — e.g. `Dodge`
+- **model** — e.g. `Charger SRT-8`
+- **year** — e.g. `2009`
+Year is included as an auxiliary output, but it is not predicted by a separate year-regression or year-classification head. It is derived from the predicted fine-grained class metadata.
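The derivation described above can be sketched as a small string parser. This is a minimal illustration, not the project's actual code: `parse_class_name`, `VehicleLabel`, and `MULTI_WORD_MAKES` are hypothetical names, and the list of multi-word makes is an assumption that should be checked against the class names in `cars_meta.mat`.

```python
from typing import NamedTuple

# Assumption: makes that span more than one word in Stanford Cars class names.
# Illustrative and possibly incomplete; verify against cars_meta.mat.
MULTI_WORD_MAKES = ("AM General", "Aston Martin", "Land Rover")


class VehicleLabel(NamedTuple):
    make: str
    model: str
    year: int


def parse_class_name(class_name: str) -> VehicleLabel:
    """Split a '<make> <model> <year>' class name into its three parts."""
    # The year is the last space-separated token.
    body, _, year_text = class_name.strip().rpartition(" ")
    if not body or not year_text.isdigit():
        raise ValueError(f"expected '<make> <model> <year>', got {class_name!r}")
    # Check known multi-word makes first, then fall back to the first word.
    for make in MULTI_WORD_MAKES:
        if body.startswith(make + " "):
            return VehicleLabel(make, body[len(make) + 1:], int(year_text))
    make, _, model = body.partition(" ")
    return VehicleLabel(make, model, int(year_text))


print(parse_class_name("Dodge Charger SRT-8 2009"))
# VehicleLabel(make='Dodge', model='Charger SRT-8', year=2009)
```

Because all three fields come from one predicted class, a wrong fine-grained prediction can still yield a correct make or year, but the three outputs are never predicted independently.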
+## Validation results
+Final quantitative comparison is reported on the locked Stanford validation split using the same evaluation protocol for all compared models.
+| Transform        | Fine acc | Make acc | Model acc | Year acc | Top-3 acc | Top-5 acc |
+| ---------------- | -------: | -------: | --------: | -------: | --------: | --------: |
+| clean            |   0.7864 |   0.8692 |    0.7925 |   0.8913 |    0.9196 |    0.9521 |
+| robust_light     |   0.7882 |   0.8680 |    0.7944 |   0.8956 |    0.9159 |    0.9490 |
+| robust_hard      |   0.6839 |   0.7778 |    0.6900 |   0.8355 |    0.8600 |    0.9055 |
+| robust_occlusion |   0.6317 |   0.7317 |    0.6366 |   0.8048 |    0.8060 |    0.8600 |
+The augmentation-v2 EfficientNet-B3 model improved both clean validation accuracy and robustness compared with the earlier EfficientNet-B3 candidate.
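The make, model, and year columns in the table can exceed fine accuracy because they are scored on labels derived from the predicted class, so a wrong fine-grained class may still share the correct make or year. The sketch below shows one way such per-level accuracies could be computed. `accuracy_by_level` is a hypothetical helper, the four samples are toy data, and counting "model" as a joint make-and-model match is an assumption about the evaluation protocol.

```python
from typing import Dict, Sequence, Tuple

Label = Tuple[str, str, int]  # (make, model, year) derived from a fine-grained class


def accuracy_by_level(truth: Sequence[Label], predicted: Sequence[Label]) -> Dict[str, float]:
    """Score fine, make, model, and year accuracy from derived labels."""
    if not truth or len(truth) != len(predicted):
        raise ValueError("truth and predicted must be non-empty and the same length")
    n = len(truth)
    pairs = list(zip(truth, predicted))
    return {
        "fine": sum(t == p for t, p in pairs) / n,
        "make": sum(t[0] == p[0] for t, p in pairs) / n,
        # Assumption: a model match requires the make to match as well.
        "model": sum(t[:2] == p[:2] for t, p in pairs) / n,
        "year": sum(t[2] == p[2] for t, p in pairs) / n,
    }


# Toy data for illustration only, not taken from the validation split.
truth = [
    ("Dodge", "Charger SRT-8", 2009),
    ("Dodge", "Charger Sedan", 2012),
    ("Audi", "TT RS Coupe", 2012),
    ("BMW", "M3 Coupe", 2012),
]
predicted = [
    ("Dodge", "Charger SRT-8", 2009),
    ("Dodge", "Charger SRT-8", 2009),  # wrong fine class, correct make
    ("Audi", "TT RS Coupe", 2012),
    ("Audi", "TT RS Coupe", 2012),     # wrong fine class, correct year
]

print(accuracy_by_level(truth, predicted))
# {'fine': 0.5, 'make': 0.75, 'model': 0.5, 'year': 0.75}
```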
+## Robustness evaluation
+The final model was evaluated under multiple image transforms:
+- **clean** — standard validation preprocessing
+- **robust_light** — mild production-like perturbations
+- **robust_hard** — stronger blur, lighting, color, and geometric perturbations
+- **robust_occlusion** — synthetic occlusion/erasing stress test
+These tests are not a replacement for real-world field validation, but they quantify how the model behaves under controlled distribution shifts.
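To make the shifts concrete, the fine-accuracy column of the validation table can be turned into drops relative to the clean transform. The values below are copied from that table; `drop_from_clean` is a hypothetical helper written for this illustration.

```python
# Fine accuracy per transform, copied from the validation results table.
FINE_ACC = {
    "clean": 0.7864,
    "robust_light": 0.7882,
    "robust_hard": 0.6839,
    "robust_occlusion": 0.6317,
}


def drop_from_clean(acc: dict, baseline: str = "clean") -> dict:
    """Return the absolute accuracy drop of each transform relative to the baseline."""
    base = acc[baseline]
    return {
        name: round(base - value, 4)
        for name, value in acc.items()
        if name != baseline
    }


for name, drop in drop_from_clean(FINE_ACC).items():
    print(f"{name}: {drop * 100:+.2f} points")
```

On these numbers, `robust_hard` costs about 10.3 points of fine accuracy and `robust_occlusion` about 15.5 points, while `robust_light` is within 0.2 points of clean.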
+## CompCars status
+CompCars was inspected and used for exploratory analysis and external validation, but it was not merged into the final training set.
+The main reason is that Stanford Cars and CompCars have a significant domain and label-distribution gap:
+- Stanford Cars is a clean fine-grained benchmark with 196 make/model/year classes.
+- CompCars contains different image domains and different make/model taxonomies.
+- Exact Stanford Cars ↔ CompCars make/model/year overlap was too small and biased for safe blind merging.
+- Make-level external validation on CompCars showed a strong cross-domain performance drop.
+This confirmed that CompCars integration is a domain adaptation problem, not a simple data-merge task.
+Future work should build a verified Stanford Cars ↔ CompCars alias map, train on a controlled filtered subset, and validate on a true cross-domain holdout.
+## Held-out Stanford test status
+The local Stanford Cars test images were available and were used for qualitative API/demo smoke testing.
+However, the available `cars_test_annos.mat` file contained only bounding boxes and filenames:
+```text
+bbox_x1
+bbox_y1
+bbox_x2
+bbox_y2
+fname
+```
+It did not include class labels.
+The provided Kaggle mirror, `eduardo4jesus/stanford-cars-dataset`, was also checked. It included:
+```text
+cars_meta.mat
+cars_train_annos.mat
+cars_test_annos.mat
+```
+but did not provide:
+```text
+cars_test_annos_withlabels.mat
+cars_annos.mat
+```
+Final quantitative model comparison is therefore reported on the locked Stanford validation split, using an identical protocol and seed for every compared model. This keeps model-to-model deltas comparable, but the reported numbers are validation accuracies, not held-out test accuracies. A labeled held-out Stanford test evaluation would be a straightforward extension once `cars_test_annos_withlabels.mat` is obtained.
+## Demo
+The model is deployed in a Hugging Face Space:
+- **Demo Space:** [https://huggingface.co/spaces/twincar-group2/twincar-demo](https://huggingface.co/spaces/twincar-group2/twincar-demo)
+The Space supports:
+- image upload
+- webcam/clipboard input
+- full-image prediction
+- make/model/year display
+- top-k predictions
+- optional YOLO crop comparison mode
+The YOLO cropper is experimental and off by default. The official prediction remains the full-image EfficientNet-B3 prediction.
+## API and code
+Project repository:
+- **GitHub:** [https://github.com/Hristijan-kiko/twincar](https://github.com/Hristijan-kiko/twincar)
+The GitHub repo includes:
+- reusable Python package
+- FastAPI inference endpoint
+- Gradio demo app
+- batch prediction script
+- training and evaluation scripts
+- robust evaluation reports
+- CI with linting and tests
+## W&B
+Training and experiment tracking:
+- **W&B project:** [https://wandb.ai/hzlatevskii-brainster-data-science-academy/twincar](https://wandb.ai/hzlatevskii-brainster-data-science-academy/twincar)
+## Intended use
+This model is intended for educational and prototype-level vehicle recognition experiments, especially make/model classification from car images similar to Stanford Cars.
+Appropriate uses:
+- fine-grained vehicle make/model recognition demo
+- model comparison and robustness analysis
+- prototype vehicle inspection workflow
+- academic/academy project demonstration
+## Limitations
+- The final model is trained on Stanford Cars, not on real drone/robot production footage.
+- CompCars showed a strong domain gap and was not merged into final training.
+- True top-down drone views remain out-of-distribution.
+- Robustness tests use controlled synthetic perturbations, not full real-world field validation.
+- Year is derived from the fine-grained class label metadata, not learned as an independent year model.
+- The system assumes the uploaded image contains a vehicle.
+- Strong non-car/out-of-distribution rejection is not implemented yet.
+- Similar models and years can be confused because fine-grained vehicle classification often depends on subtle visual details.
+## Future work
+- Build a verified Stanford Cars ↔ CompCars alias map.
+- Fine-tune on a controlled CompCars surveillance subset.
+- Add real production/drone/robot images for target-domain validation.
+- Add non-car/out-of-distribution rejection.
+- Add detector-assisted preprocessing as a validated default only if it improves real metrics.
+- Explore multi-head prediction for make/model/year if independent outputs become necessary.