# Car Bounding Box Detection — Custom CNN From Scratch

This repository contains a **custom Convolutional Neural Network (CNN)** trained **from scratch** for **car bounding box detection** on the **Stanford Cars Dataset**.  
The model predicts bounding boxes in normalized format: `[x_center, y_center, width, height]`.


## Features

- Custom CNN architecture built from scratch  
- Bounding box regression only (no classification)  
- Balanced dataset with per-class sampling  
- Dataset split: **64% train, 16% validation, 20% test**  
- Advanced image augmentation (flip, rotation, brightness, contrast, crop)  
- Smooth L1 loss for bounding box regression  
- Fully GPU-compatible training and inference
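
The 64/16/20 split listed above is what results from holding out 20% of the data for testing and then 20% of the remainder for validation (0.8 × 0.8 = 0.64). The sketch below illustrates that arithmetic only. The function name `split_indices` and the seed are hypothetical, the two-stage procedure is an assumption about how the split was produced, and per-class balancing is not shown.

```python
import random


def split_indices(n_items, seed=0):
    """Shuffle indices, hold out 20% for test, then 20% of the rest for validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)

    # Stage 1: 20% of all items go to the test set.
    n_test = round(0.20 * n_items)
    test, remainder = indices[:n_test], indices[n_test:]

    # Stage 2: 20% of the remaining 80% (16% overall) go to validation.
    n_val = round(0.20 * len(remainder))
    val, train = remainder[:n_val], remainder[n_val:]
    return train, val, test


train, val, test = split_indices(1000)
print(len(train), len(val), len(test))  # 640 160 200
```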


## Dataset

- **Source:** Stanford Cars Dataset (https://www.kaggle.com/datasets/eduardo4jesus/stanford-cars-dataset/data)
- **Annotations used:** Bounding boxes only  
- Images resized to **416×416 pixels**  
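
A minimal sketch of how a bounding box annotation could be turned into the normalized `[x_center, y_center, width, height]` target. It assumes the annotation is given as pixel corners `(x1, y1, x2, y2)` together with the original image size; the function name is hypothetical and the actual preprocessing code may differ. If the image is resized directly to 416×416 (no padding or cropping), the normalized values stay the same, so they can be computed from the original dimensions.

```python
def corners_to_normalized_center(x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel corner box to normalized [x_center, y_center, width, height]."""
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return [x_center, y_center, width, height]


# Example: a box from (100, 50) to (500, 350) in an 800x400 image.
box = corners_to_normalized_center(100, 50, 500, 350, img_w=800, img_h=400)
print(box)  # [0.375, 0.5, 0.5, 0.75]
```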


## Model Architecture

- Multiple convolutional blocks with BatchNorm and ReLU  
- Dropout layers to prevent overfitting  
- Fully connected regression head  
- Sigmoid output to produce normalized coordinates  
- Output format: `[x_center, y_center, width, height]`


## Training

- **Batch size:** 32  
- **Optimizer:** AdamW  
- **Loss function:** Smooth L1 (CIoU Loss)  
- **Scheduler:** Cosine annealing LR  
- Training monitored with best validation IoU checkpointing
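
For reference, the sketch below writes out the standard Smooth L1 definition and a plain IoU between two boxes in the model's output format, the metric that best-checkpoint selection is described as tracking. It is a framework-free illustration, not the training code: `beta=1.0` is an assumed threshold, the function names are hypothetical, and the extra terms of CIoU are not shown.

```python
def smooth_l1(pred, target, beta=1.0):
    """Mean Smooth L1 over box coordinates: quadratic below beta, linear above."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        if diff < beta:
            total += 0.5 * diff * diff / beta
        else:
            total += diff - 0.5 * beta
    return total / len(pred)


def iou(box_a, box_b):
    """IoU of two boxes given as [x_center, y_center, width, height]."""
    def to_corners(box):
        xc, yc, w, h = box
        return xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2

    ax1, ay1, ax2, ay2 = to_corners(box_a)
    bx1, by1, bx2, by2 = to_corners(box_b)

    # Overlap is zero when the boxes do not intersect along either axis.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


pred = [0.50, 0.50, 0.40, 0.40]
target = [0.55, 0.50, 0.40, 0.40]
print(smooth_l1(pred, target), iou(pred, target))
```

During validation, a checkpoint would be saved whenever the mean IoU over the validation set exceeds the best value seen so far.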


## Inference

- The model can predict bounding boxes on any car image or video  
- Input images must be preprocessed and resized to **416×416**  
- Output: normalized `[x_center, y_center, width, height]` coordinates  
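
To draw a prediction on the original image, the normalized output has to be mapped back to pixel coordinates. The sketch below shows one way to do that, assuming the image was resized directly to 416×416 without padding, so the normalized values can simply be scaled by the original width and height. The function name `to_pixel_corners` is hypothetical.

```python
def to_pixel_corners(pred, img_w, img_h):
    """Map normalized [x_center, y_center, width, height] to pixel corners.

    Returns (x1, y1, x2, y2) as integers, clipped to the image bounds.
    """
    xc, yc, w, h = pred
    x1 = (xc - w / 2) * img_w
    y1 = (yc - h / 2) * img_h
    x2 = (xc + w / 2) * img_w
    y2 = (yc + h / 2) * img_h

    def clip(value, upper):
        return int(round(min(max(value, 0), upper)))

    return clip(x1, img_w), clip(y1, img_h), clip(x2, img_w), clip(y2, img_h)


# Example: a centered box covering half of a 1280x720 frame in each dimension.
print(to_pixel_corners([0.5, 0.5, 0.5, 0.5], img_w=1280, img_h=720))
# (320, 180, 960, 540)
```

For video, the same conversion would be applied to each frame's prediction.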

---

## Example

<img src="https://cdn-uploads.huggingface.co/production/uploads/67bc31088cf27f32cbcf927f/h286qIktC-H5CkxuO-YvH.jpeg" width="400"/>


## Citation

If you use this model, please cite:

```bibtex
@misc{car-bbox-detection-2025,
  title = {Car Bounding Box Detection — Custom CNN},
  author = {Malek Messaoudi and Yassine Mhirsi},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Safe-Drive-TN/Car-detection-from-scratch}}
}

```

## License

This model is released under the MIT License.