# Car Bounding Box Detection — Custom CNN From Scratch
This repository contains a **custom Convolutional Neural Network (CNN)** trained **from scratch** for **car bounding box detection** on the **Stanford Cars Dataset**.
The model predicts bounding boxes in normalized format: `[x_center, y_center, width, height]`.
## Features
- Custom CNN architecture built from scratch
- Bounding box regression only (no classification)
- Balanced dataset with per-class sampling
- Dataset split: **64% train, 16% validation, 20% test**
- Image augmentation: flip, rotation, brightness, contrast, and crop
- Smooth L1 loss for bounding box regression
- Fully GPU-compatible training and inference
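
The 64/16/20 split listed above is what you get from holding out 20% for testing and then taking 20% of the remainder for validation. The following is a minimal sketch of a per-class (stratified) split using only the standard library. The function name `stratified_split`, the seed, and the toy labels are illustrative assumptions, not the repository's actual data pipeline.

```python
import random
from collections import defaultdict


def stratified_split(labels, train=0.64, val=0.16, seed=0):
    """Split sample indices into train/val/test, class by class.

    labels[i] is the class id of sample i. Whatever is not assigned to
    train or val within a class goes to test (20% with the defaults).
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * train)
        n_val = round(len(indices) * val)
        splits["train"] += indices[:n_train]
        splits["val"] += indices[n_train:n_train + n_val]
        splits["test"] += indices[n_train + n_val:]
    return splits


# Toy data: 4 hypothetical classes with 25 samples each.
labels = [class_id for class_id in range(4) for _ in range(25)]
splits = stratified_split(labels)
print({name: len(ids) for name, ids in splits.items()})
```

Splitting inside each class keeps the class proportions roughly equal across the three subsets, which is one plausible reading of "per-class sampling".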
## Dataset
- **Source:** Stanford Cars Dataset (https://www.kaggle.com/datasets/eduardo4jesus/stanford-cars-dataset/data)
- **Annotations used:** Bounding boxes only
- Images resized to **416×416 pixels**
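
To make the target format concrete, here is a small sketch that converts a pixel-space corner annotation into the normalized `[x_center, y_center, width, height]` vector. It assumes the annotation is given as `(x1, y1, x2, y2)` corners in the original image; the helper name `corners_to_normalized` and the example numbers are hypothetical.

```python
def corners_to_normalized(x1, y1, x2, y2, image_width, image_height):
    """Convert pixel corners to normalized [x_center, y_center, width, height]."""
    x_center = (x1 + x2) / 2 / image_width
    y_center = (y1 + y2) / 2 / image_height
    width = (x2 - x1) / image_width
    height = (y2 - y1) / image_height
    return [x_center, y_center, width, height]


# Hypothetical 800x600 image with a car spanning (100, 150) to (700, 450).
target = corners_to_normalized(100, 150, 700, 450, image_width=800, image_height=600)
print(target)
```

Because the values are fractions of the image size, a plain stretch to 416×416 (without padding or cropping) leaves the target unchanged.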
## Model Architecture
- Multiple convolutional blocks with BatchNorm and ReLU
- Dropout layers to prevent overfitting
- Fully connected regression head
- Sigmoid output to produce normalized coordinates
- Output format: `[x_center, y_center, width, height]`
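
The components listed above could fit together roughly as in the sketch below. The README does not name the framework or the layer sizes, so PyTorch is an assumption here. The channel widths, dropout rate, pooling choices, and the class name `BoxRegressor` are also assumptions. This is an illustration of the pattern, not the trained model.

```python
import torch
from torch import nn


def conv_block(in_channels, out_channels):
    """Conv -> BatchNorm -> ReLU -> 2x2 max-pool (halves height and width)."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )


class BoxRegressor(nn.Module):
    """Hypothetical stand-in for the repository's CNN (sizes are assumptions)."""

    def __init__(self, channels=(16, 32, 64, 128), dropout=0.3):
        super().__init__()
        blocks, in_channels = [], 3
        for out_channels in channels:
            blocks.append(conv_block(in_channels, out_channels))
            in_channels = out_channels
        self.features = nn.Sequential(*blocks)
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),  # collapse the spatial grid to 1x1
            nn.Flatten(),
            nn.Dropout(dropout),
            nn.Linear(in_channels, 64),
            nn.ReLU(inplace=True),
            nn.Dropout(dropout),
            nn.Linear(64, 4),
            nn.Sigmoid(),  # squash the four outputs into the 0..1 range
        )

    def forward(self, images):
        return self.head(self.features(images))


model = BoxRegressor().eval()
with torch.no_grad():
    boxes = model(torch.rand(2, 3, 416, 416))
print(boxes.shape)
```

The final sigmoid is what lets the four outputs be read directly as normalized coordinates.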
## Training
- **Batch size:** 32
- **Optimizer:** AdamW
- **Loss function:** Smooth L1 (CIoU Loss)
- **Scheduler:** Cosine annealing LR
- Training monitored with best validation IoU checkpointing
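
For reference, the sketch below spells out the two quantities named in this section for a single pair of boxes: the Smooth L1 loss and the IoU used for checkpoint selection. It is plain Python for readability. The function names and the `beta=1.0` threshold are assumptions, and the CIoU variant is not covered here.

```python
def smooth_l1(prediction, target, beta=1.0):
    """Mean Smooth L1 loss over the four box coordinates."""
    total = 0.0
    for p, t in zip(prediction, target):
        diff = abs(p - t)
        if diff < beta:
            total += 0.5 * diff * diff / beta  # quadratic near zero
        else:
            total += diff - 0.5 * beta         # linear for large errors
    return total / len(prediction)


def to_corners(box):
    """[x_center, y_center, width, height] -> (x1, y1, x2, y2)."""
    x_center, y_center, width, height = box
    return (x_center - width / 2, y_center - height / 2,
            x_center + width / 2, y_center + height / 2)


def iou(box_a, box_b):
    """Intersection over union of two center-format boxes."""
    ax1, ay1, ax2, ay2 = to_corners(box_a)
    bx1, by1, bx2, by2 = to_corners(box_b)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - intersection
    return intersection / union if union > 0 else 0.0


target = [0.5, 0.5, 0.5, 0.5]
prediction = [0.5, 0.5, 0.25, 0.5]  # same center, half the width
print(smooth_l1(prediction, target), iou(prediction, target))
```

In a training loop, the mean validation IoU would be compared against the best value seen so far, and the weights saved whenever it improves.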
## Inference
- The model predicts one bounding box per input image; video can be processed frame by frame
- Input images must be preprocessed and resized to **416×416**
- Output: normalized `[x_center, y_center, width, height]` coordinates
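
To draw a prediction on the original image, the normalized output has to be scaled back to pixels. The sketch below assumes the image was stretched to 416×416 without padding, so the fractions apply directly to the original width and height. The helper name `normalized_to_pixels` is hypothetical.

```python
def normalized_to_pixels(box, image_width, image_height):
    """Map normalized [x_center, y_center, width, height] to pixel corners."""
    x_center, y_center, width, height = box
    x1 = (x_center - width / 2) * image_width
    y1 = (y_center - height / 2) * image_height
    x2 = (x_center + width / 2) * image_width
    y2 = (y_center + height / 2) * image_height
    # Clamp to the image so a box that spills past the border stays drawable.
    x1, x2 = (min(max(round(v), 0), image_width) for v in (x1, x2))
    y1, y2 = (min(max(round(v), 0), image_height) for v in (y1, y2))
    return x1, y1, x2, y2


# Hypothetical prediction on an 800x600 source image.
corners = normalized_to_pixels([0.5, 0.5, 0.75, 0.5], 800, 600)
print(corners)
```
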
---
## Example
<img src="https://cdn-uploads.huggingface.co/production/uploads/67bc31088cf27f32cbcf927f/h286qIktC-H5CkxuO-YvH.jpeg" width="400"/>
## Citation
If you use this model, please cite:
```bibtex
@misc{car-bbox-detection-2025,
title = {Car Bounding Box Detection — Custom CNN},
author = {Malek Messaoudi and Yassine Mhirsi},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/Safe-Drive-TN/Car-detection-from-scratch}}
}
```
## License
MIT