Abstract
This work builds upon the Basketball Action Recognition Dataset (BARD), originally introduced to enable supervised learning for primary action recognition in NBA game footage. However, BARD's initial design lacks the granular annotations required to develop multi-stage computer vision pipelines involving object detection, jersey number recognition (JNR), and team attribution. To address these limitations, we present E-BARD (Extended Basketball Action Recognition Dataset), which bridges the gap between isolated action recognition and end-to-end scene-level reasoning through three key contributions. First, we introduce a set of interrelated datasets that augment the original BARD videos with dense visual annotations: detection data for key entities (ball, hoop, referee, player), team attribution based on uniform colors, and JNR, all integrated to directly support and enrich the original action captions. Second, we establish a comprehensive benchmark for these visual understanding tasks using representative state-of-the-art models. We evaluate YOLO and RF-DETR for object detection; CLIP, SigLIP2, FashionCLIP, and the Perception Encoder for team color attribution; and olmOCR, Qwen2.5-VL-3B, and Qwen2.5-VL-7B for JNR. Finally, we propose a holistic, integrated approach based on Qwen2.5-VL, demonstrating the capacity of a unified multimodal framework to address all subtasks jointly. Ultimately, E-BARD provides a comprehensive benchmark for multi-task basketball video understanding.
Model Card for E-BARD Basketball Object Detection Models
This repository hosts two fine-tuned object detection models:
- YOLOv8n
- RF-DETR Nano
Both models are trained to detect key entities in basketball footage:
- Basketball
- Hoop
- Player
- Referee
These models were developed as part of the E-BARD (Extended Basketball Action Recognition Dataset) project to support end-to-end basketball scene understanding pipelines.
Model Details
Developed by: Gabriele Giudici (Author of E-BARD)
Model Type: Object Detection
YOLOv8n
- Lightweight CNN detector
- ~3.15M parameters
RF-DETR Nano
- Lightweight transformer-based detector
- ~30.5M parameters
License: CC-BY-4.0
Finetuned from:
- Base YOLOv8n
- Base RF-DETR Nano
Model Sources
Code Repository
https://github.com/GabrieleGiudic/E-BARD
Original BARD Repository
https://github.com/GabrieleGiudic/BARD
Dataset Repository
https://huggingface.co/datasets/GabrieleGiudici/E-BARD-detection
Paper
E-BARD: A Multi-Task Extension of the Basketball Action Recognition Dataset for Player Detection, Team Attribution and Jersey Number Recognition.
Uses
Direct Use
These models detect four basketball entities in a single frame:
- Basketball
- Basketball hoop
- Basketball player
- Referee
Downstream Use
Detections can be integrated into sports analytics pipelines, including:
- Multi-object tracking (e.g., ByteTrack)
- Jersey number recognition (JNR)
- Team color attribution
- Tactical analysis
- Event understanding
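As a sketch of the hand-off into such a pipeline, the snippet below converts player detections into padded, frame-clamped crop boxes that could be fed to a JNR model. The `(class_id, conf, x1, y1, x2, y2)` tuple format and the class index order (0=basketball, 1=hoop, 2=player, 3=referee) are illustrative assumptions, not the released models' guaranteed output format.

```python
def player_crop_boxes(frame_size, detections, player_class=2, pad=4):
    """Turn player detections into padded, frame-clamped crop boxes for JNR.

    detections: iterable of (class_id, conf, x1, y1, x2, y2) in pixels.
    Returns a list of (conf, (x1, y1, x2, y2)) crop boxes.
    """
    w, h = frame_size
    crops = []
    for class_id, conf, x1, y1, x2, y2 in detections:
        if class_id != player_class:
            continue  # keep only player boxes for jersey number recognition
        # Pad slightly so jersey digits near the box edge are not clipped,
        # then clamp to the frame bounds.
        crops.append((conf, (max(0, x1 - pad), max(0, y1 - pad),
                             min(w, x2 + pad), min(h, y2 + pad))))
    return crops
```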
Bias, Risks, and Limitations
- Models were trained on 720p footage downscaled to 704×704.
- Performance may degrade on lower resolutions or different aspect ratios.
- Dataset is derived from 2024–2025 NBA season footage, potentially biasing the models toward:
- NBA court layouts
- broadcast camera angles
- lighting conditions
- uniform styles
Possible limitations:
- Reduced performance on lower-tier leagues
- Reduced performance on street basketball environments
Model-specific limitations
YOLOv8n
- Struggles with very small objects like the basketball
- Basketball recall at IoU 0.50 is only 0.566
RF-DETR Nano
- Conservative detection behavior
- Prioritizes precision over recall
Training Details
Training Data
The models were trained on the E-BARD Detection Dataset, derived from 60 BARD full-game recordings.
Dataset statistics
- Total Frames: 1,800
- Frames per game: 30
- Total Annotations: 22,210
Class Distribution
| Class | Instances |
|---|---|
| Players | 15,296 |
| Referees | 3,853 |
| Hoops | 1,565 |
| Basketballs | 1,496 |
Dataset split
- Training: 80%
- Validation: 10%
- Test: 10%
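For reference, the split fractions above map onto the 1,800 frames as follows (the actual released split sizes may differ by a frame or two due to rounding at export time):

```python
total_frames = 1800

# 80/10/10 split as stated in the card
split_fractions = {"train": 0.80, "val": 0.10, "test": 0.10}
split_counts = {name: round(total_frames * frac)
                for name, frac in split_fractions.items()}
print(split_counts)  # 1,440 training, 180 validation, 180 test frames

# Sanity check: the per-class instance counts sum to the reported total
assert 15296 + 3853 + 1565 + 1496 == 22210
```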
Training Procedure
Both models were trained using:
- Mixed precision (AMP)
- Early stopping
YOLOv8n
Epochs: 50
Resolution: 704×704
Batch Size: 64 (reported in the paper) / 32 (used in the released training script)
Augmentations:
- Mosaic (1.0)
- Copy-Paste (0.5)
- RandAugment
RF-DETR Nano
- Epochs: 50
- Resolution: 704×704
- Batch Size: 16
- Learning Rate: 1e-4
Evaluation
Testing Data
Evaluation was performed on the 10% held-out test split of E-BARD.
Metrics (all computed at an IoU threshold of 0.50):
- Precision
- Recall
- F1-score
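These metrics follow their standard definitions; a minimal reference implementation (assuming axis-aligned boxes in `(x1, y1, x2, y2)` format) is:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from matched-detection counts.

    A prediction counts as a true positive when it matches a ground-truth
    box of the same class with IoU >= 0.50.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```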
Results
YOLOv8n achieved the higher recall and F1-score on all four classes, while RF-DETR Nano posted slightly higher precision on three of the four (basketball, player, and referee).
Per-Class Performance (@ IoU 0.5)
| Class | Metric | YOLOv8n | RF-DETR Nano |
|---|---|---|---|
| Basketball | Precision | 0.811 | 0.845 |
| Basketball | Recall | 0.566 | 0.322 |
| Basketball | F1 | 0.667 | 0.467 |
| Hoop | Precision | 0.993 | 0.944 |
| Hoop | Recall | 0.937 | 0.742 |
| Hoop | F1 | 0.964 | 0.831 |
| Player | Precision | 0.952 | 0.962 |
| Player | Recall | 0.949 | 0.908 |
| Player | F1 | 0.950 | 0.934 |
| Referee | Precision | 0.927 | 0.953 |
| Referee | Recall | 0.930 | 0.794 |
| Referee | F1 | 0.929 | 0.867 |
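F1 is the harmonic mean of precision and recall, so the table rows can be checked against each other:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Reproduce YOLOv8n's reported F1 on the Basketball class
print(round(f1_score(0.811, 0.566), 3))  # 0.667, matching the table
```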
Code Examples
YOLOv8n Inference
```python
from ultralytics import YOLO

# Load the fine-tuned YOLOv8n checkpoint
yolo_model = YOLO("model/BODD_yolov8n_0001.pt")

# Run inference on the test images at the training resolution (704x704)
yolo_results = yolo_model.predict(
    source="data/yolo/test/images",
    imgsz=704,
    device="cuda",
    conf=0.25,  # confidence threshold
    iou=0.5,    # NMS IoU threshold
)
```
RF-DETR Nano Inference
```python
from rfdetr import RFDETRNano
from PIL import Image

# Load the fine-tuned RF-DETR Nano checkpoint
rfdetr_model = RFDETRNano(
    pretrain_weights="model/BODD_rf-detr-nano_0000/checkpoint_best_total.pth"
)

# RF-DETR takes a single PIL image per call
img = Image.open("path/to/image.jpg").convert("RGB")
detections = rfdetr_model.predict(
    img,
    resolution=704,        # match the 704x704 training resolution
    conf_threshold=0.25,   # confidence threshold
)
```
Full Evaluation Script
The full YOLO vs. RF-DETR evaluation script is available in the repository's evaluation folder: https://github.com/GabrieleGiudic/E-BARD/detection/eval/yolo_vs_detr.py