# Car Bounding Box Detection — Custom CNN From Scratch

This repository contains a **custom Convolutional Neural Network (CNN)** trained **from scratch** for **car bounding box detection** on the **Stanford Cars Dataset**. The model predicts bounding boxes in normalized format: `[x_center, y_center, width, height]`.

## Features

- Custom CNN architecture built from scratch
- Bounding box regression only (no classification)
- Balanced dataset with per-class sampling
- Dataset split: **64% train, 16% validation, 20% test**
- Image augmentation (flip, rotation, brightness, contrast, crop)
- Smooth L1 loss for bounding box regression
- GPU-compatible training and inference

## Dataset

- **Source:** Stanford Cars Dataset (https://www.kaggle.com/datasets/eduardo4jesus/stanford-cars-dataset/data)
- **Annotations used:** Bounding boxes only
- **Image size:** All images are resized to **416×416 pixels**

## Model Architecture

- Multiple convolutional blocks with BatchNorm and ReLU
- Dropout layers to reduce overfitting
- Fully connected regression head
- Sigmoid output, so every predicted coordinate lies in the range 0 to 1
- Output format: `[x_center, y_center, width, height]`

## Training

- **Batch size:** 32
- **Optimizer:** AdamW
- **Loss function:** Smooth L1 (CIoU Loss)
- **Scheduler:** Cosine annealing LR
- **Checkpointing:** The checkpoint with the best validation IoU is saved

## Inference

- The model can predict bounding boxes on car images or video frames
- Input images must be preprocessed and resized to **416×416**
- Output: normalized `[x_center, y_center, width, height]` coordinates

---

## Example

## Citation

If you use this model, please cite:

```bibtex
@misc{car-bbox-detection-2025,
  title        = {Car Bounding Box Detection — Custom CNN},
  author       = {Malek Messaoudi and Yassine Mhirsi},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Safe-Drive-TN/Car-detection-from-scratch}}
}
```

## License

MIT
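
## Decoding the Output (Sketch)

The normalized `[x_center, y_center, width, height]` output described under Inference has to be mapped back to pixel corners before a box can be drawn or scored with IoU. The snippet below is a minimal sketch of that step and is not taken from this repository: the function names `to_pixel_corners` and `iou` are hypothetical, and clamping to the image bounds is an assumption rather than documented behavior of the model.

```python
def to_pixel_corners(box, img_w=416, img_h=416):
    """Convert a normalized [x_center, y_center, width, height] box
    to pixel corners (x_min, y_min, x_max, y_max).

    Hypothetical helper: the 416x416 default matches the input size
    stated in this README, but any image size can be passed.
    """
    xc, yc, w, h = box
    x_min = (xc - w / 2) * img_w
    y_min = (yc - h / 2) * img_h
    x_max = (xc + w / 2) * img_w
    y_max = (yc + h / 2) * img_h

    # Assumption: clamp to the image, because a center near the edge
    # combined with a large width or height can spill past the border.
    def clamp(value, upper):
        return max(0.0, min(float(upper), value))

    return (clamp(x_min, img_w), clamp(y_min, img_h),
            clamp(x_max, img_w), clamp(y_max, img_h))


def iou(a, b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = to_pixel_corners([0.5, 0.5, 0.5, 0.5])
    print(predicted)
    print(iou(predicted, predicted))
```

To draw on an original image rather than the resized 416×416 input, pass that image's width and height as `img_w` and `img_h`; the normalized coordinates scale directly.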