---
license: mit
library_name: pytorch
tags:
- faster-rcnn
- object-detection
- computer-vision
- pytorch
- bdd100k
- autonomous-driving
- BDD 100K
- from-scratch
pipeline_tag: object-detection
datasets:
- bdd100k
widget:
- src: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bounding-boxes-sample.png
  example_title: "Sample Image"
model-index:
- name: faster-rcnn-bdd-vanilla
  results:
  - task:
      type: object-detection
    dataset:
      type: bdd100k
      name: Berkeley DeepDrive (BDD) 100K
    metrics:
    - type: mean_average_precision
      name: mAP
      value: "TBD"
---

# Faster R-CNN - Berkeley DeepDrive (BDD) 100K Vanilla

A Faster R-CNN model trained from scratch on the Berkeley DeepDrive (BDD) 100K dataset for object detection in autonomous driving scenarios.

## Model Details

- **Model Type**: Faster R-CNN Object Detection
- **Dataset**: Berkeley DeepDrive (BDD) 100K
- **Training Method**: trained from scratch
- **Framework**: PyTorch
- **Task**: Object Detection

## Dataset Information

This model was trained on the **Berkeley DeepDrive (BDD) 100K** dataset, which contains the following object classes:

car, truck, bus, motorcycle, bicycle, person, traffic light, traffic sign, train, rider

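The label indices used by the checkpoint are not documented in this card. The sketch below is a hypothetical mapping that assumes index 0 is reserved for background (the torchvision Faster R-CNN convention) and that the classes follow the order listed above. Both are assumptions, and the names `BDD100K_CLASSES`, `ID_TO_NAME`, and `label_name` are illustrative, so check them against the training configuration before relying on the mapping.

```python
# Hypothetical label map. Assumptions: index 0 is background and the
# remaining indices follow the class order listed in this model card.
BDD100K_CLASSES = [
    "car", "truck", "bus", "motorcycle", "bicycle",
    "person", "traffic light", "traffic sign", "train", "rider",
]

ID_TO_NAME = {0: "__background__"}
ID_TO_NAME.update({i: name for i, name in enumerate(BDD100K_CLASSES, start=1)})


def label_name(label_id: int) -> str:
    """Return the class name for a predicted label id, or a placeholder."""
    return ID_TO_NAME.get(int(label_id), f"unknown({label_id})")
```

With a PyTorch prediction, a label tensor element can be passed as `label_name(label.item())`.
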
### Dataset-specific Details

**Berkeley DeepDrive (BDD) 100K Dataset:**
- 100,000+ driving images with diverse weather and lighting conditions
- Designed for autonomous driving applications
- Contains urban driving scenarios from multiple cities
- Annotations include bounding boxes for vehicles, pedestrians, and traffic elements

## Usage

This model can be used with PyTorch and common object detection frameworks:

```python
import torch
import torchvision.transforms as transforms
from PIL import Image

# Load the model (example using torchvision).
# Assumes the file stores a full pickled model object, which needs
# weights_only=False on recent PyTorch releases. Only load trusted files.
model = torch.load('path/to/model.pth', map_location='cpu', weights_only=False)
model.eval()

# Prepare your image
transform = transforms.Compose([
    transforms.ToTensor(),
])

# Convert to RGB so grayscale or RGBA inputs yield a 3-channel tensor
image = Image.open('path/to/image.jpg').convert('RGB')
image_tensor = transform(image).unsqueeze(0)

# Run inference
with torch.no_grad():
    predictions = model(image_tensor)

# Process results: the model returns one dict per input image
boxes = predictions[0]['boxes']    # bounding boxes
scores = predictions[0]['scores']  # confidence scores
labels = predictions[0]['labels']  # predicted class indices
```

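Raw predictions usually include many low-confidence boxes. As a minimal sketch, the helper below keeps only detections above a score threshold. It works on plain Python lists (for example after `{k: v.tolist() for k, v in predictions[0].items()}`), the function name `filter_detections` is illustrative, and the default threshold of 0.5 is an arbitrary assumption rather than a value tuned for this model.

```python
from typing import Dict, List


def filter_detections(
    prediction: Dict[str, list], score_threshold: float = 0.5
) -> List[dict]:
    """Keep detections whose score meets the threshold.

    `prediction` holds plain lists under the keys 'boxes', 'scores',
    and 'labels', mirroring the output dict of the model.
    """
    kept = []
    for box, score, label in zip(
        prediction["boxes"], prediction["scores"], prediction["labels"]
    ):
        if score >= score_threshold:
            kept.append({"box": box, "score": score, "label": label})
    return kept


# Illustrative values only, not real model output.
example = {
    "boxes": [[10.0, 20.0, 110.0, 220.0], [5.0, 5.0, 50.0, 50.0]],
    "scores": [0.92, 0.31],
    "labels": [1, 6],
}
print(filter_detections(example))
```

A higher threshold trades recall for precision, so the right value depends on the application.
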
## Model Performance

This model was trained from scratch on the Berkeley DeepDrive (BDD) 100K dataset using the Faster R-CNN architecture. No mAP value has been reported yet; the model metadata lists it as TBD.

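The metadata names mAP as the evaluation metric. mAP depends on intersection over union (IoU) to decide whether a predicted box matches a ground-truth box. The function below is a minimal sketch of IoU for two boxes, assuming the `(x1, y1, x2, y2)` corner format; it is not the evaluation code used for this model, and `box_iou` here is an illustrative pure-Python helper.

```python
from typing import Sequence


def box_iou(a: Sequence[float], b: Sequence[float]) -> float:
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    # Width and height of the overlap, clamped at zero when disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```
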
## Architecture

**Faster R-CNN** (Region-based Convolutional Neural Network) is a two-stage object detection framework:

1. **Region Proposal Network (RPN)**: Generates object proposals
2. **Fast R-CNN detector**: Classifies proposals and refines bounding box coordinates

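Both stages refine boxes by predicting offsets relative to a reference box (an anchor in the RPN, a proposal in the detector). The sketch below applies the offset parameterization from the Faster R-CNN paper: the center shifts in proportion to the box size, and width and height scale in log space. It is an illustration only. The name `decode_box` is hypothetical, and practical implementations often add per-coordinate weights and clamping, which are omitted here.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def decode_box(anchor: Box, deltas: Tuple[float, float, float, float]) -> Box:
    """Apply (dx, dy, dw, dh) regression deltas to an anchor or proposal."""
    x1, y1, x2, y2 = anchor
    dx, dy, dw, dh = deltas

    # Convert corners to center/size form.
    width, height = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * width, y1 + 0.5 * height

    # Shift the center relative to the box size; scale size in log space.
    new_cx = cx + dx * width
    new_cy = cy + dy * height
    new_w = width * math.exp(dw)
    new_h = height * math.exp(dh)

    # Convert back to corner form.
    return (
        new_cx - 0.5 * new_w,
        new_cy - 0.5 * new_h,
        new_cx + 0.5 * new_w,
        new_cy + 0.5 * new_h,
    )
```

Zero deltas return the reference box unchanged, which is a quick sanity check for any decoder of this form.
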
Key advantages:
- High-accuracy object detection
- Precise localization
- Good performance on small objects
- Well-established architecture with extensive research backing

## Intended Use

- **Primary Use**: Object detection in autonomous driving scenarios
- **Suitable for**: Research, development, and deployment of object detection systems
- **Limitations**: Performance may vary on images significantly different from the training distribution

## Citation

If you use this model, please cite:

```bibtex
@article{ren2015faster,
  title={Faster {R-CNN}: Towards real-time object detection with region proposal networks},
  author={Ren, Shaoqing and He, Kaiming and Girshick, Ross and Sun, Jian},
  journal={Advances in neural information processing systems},
  volume={28},
  year={2015}
}
```

## License

This model is released under the MIT License.

## Keywords

Faster R-CNN, Object Detection, Computer Vision, BDD 100K, Autonomous Driving, Deep Learning, Two-Stage Detection