vehicle-keypoints

vehicle-keypoints - 14-point car pose estimation on CarFusion

14-keypoint vehicle pose estimation on the CarFusion (CMU) dataset. Each car is annotated with four wheels, four lights (two headlights and two tail-lights), four roof corners, the exhaust, and a body-centre reference point (canonical CarFusion schema, Reddy et al., CVPR 2018). The main weights (weights.pt) are an Ultralytics YOLO26-pose checkpoint; a ViTPose-S top-down baseline is published under the baseline/ subdirectory of this repo.

Synthetic-data sibling: kiselyovd/citysample-vehicle-keypoints-24pt - a 24-point model trained entirely on Unreal Engine 5 renders, from the ue5-vehicle-synth pipeline.

Metrics (test set, n=12761)

Model	OKS-mAP	OKS-mAP@50	PCK@0.05	Params	Notes
YOLO26-pose (ours)	50.4%	70.4%	76.1%	~3M	YOLO26n-pose, 30 epochs
ViTPose-S (baseline)	0.1%	13.7%	-	85M	Top-down; trained for 15 epochs only, needs 100+ epochs; PCK not reported
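For readers unfamiliar with the two metric families in the table, the following is a minimal pure-Python sketch of per-instance PCK and COCO-style OKS. It is an illustration, not the evaluation code used for the numbers above. The PCK normalisation (longer bounding-box side) and the per-keypoint sigmas are assumptions, since this card does not state which were used. The function names pck and oks are hypothetical.

```python
import math

def pck(pred, gt, visible, bbox_wh, alpha=0.05):
    """Fraction of visible keypoints within alpha * max(bbox side) of ground truth.

    Assumption: the threshold is normalised by the longer bounding-box side.
    """
    threshold = alpha * max(bbox_wh)
    hits = 0
    total = 0
    for (px, py), (gx, gy), vis in zip(pred, gt, visible):
        if not vis:
            continue  # unlabelled keypoints do not count
        total += 1
        if math.hypot(px - gx, py - gy) <= threshold:
            hits += 1
    return hits / total if total else 0.0

def oks(pred, gt, visible, area, sigmas):
    """COCO-style object keypoint similarity for a single instance.

    Assumption: sigmas holds one falloff constant per keypoint; the values
    used for CarFusion are not given in this card.
    """
    sims = []
    for (px, py), (gx, gy), vis, sigma in zip(pred, gt, visible, sigmas):
        if not vis:
            continue
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k = 2.0 * sigma
        sims.append(math.exp(-d2 / (2.0 * area * k ** 2)))
    return sum(sims) / len(sims) if sims else 0.0
```

OKS-mAP then averages precision over OKS thresholds (0.50 to 0.95 in the COCO protocol), while OKS-mAP@50 uses the single threshold 0.50.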

Usage

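from huggingface_hub import hf_hub_download
from ultralytics import YOLO

# Download the main YOLO26-pose checkpoint from the Hub (cached after the first call).
ckpt = hf_hub_download(repo_id="kiselyovd/vehicle-keypoints", filename="weights.pt")
model = YOLO(ckpt)

# predict() returns one Results object per input image.
results = model.predict("car.jpg")
for r in results:
    # One iteration per detected car: box is (cx, cy, w, h) in pixels,
    # kpts holds one row per keypoint, score is the detection confidence.
    for box, kpts, score in zip(r.boxes.xywh, r.keypoints.data, r.boxes.conf):
        print(box.tolist(), score.item(), len(kpts))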

The baseline weights (ViTPose-S, in Hugging Face safetensors format with a processor config) live under the baseline/ subdirectory of this repo and are loaded via transformers - see the GitHub README for the inference snippet.

Visualizations

14-point CarFusion keypoint schema (left) and predictions on test images (right):

[Figure, left: 14-point CarFusion keypoint schema]
[Figure, right: predicted keypoints on CarFusion test images]

Source

Code: https://github.com/kiselyovd/vehicle-keypoints
Dataset: CarFusion - N. Dinesh Reddy, Minh Vo, Srinivasa Narasimhan, "CarFusion: Combining Point Tracking and Part Detection for Dynamic 3D Reconstruction of Vehicles", CVPR 2018. © Carnegie Mellon University.
Keypoint order (14): right_front_wheel, left_front_wheel, right_back_wheel, left_back_wheel, right_front_headlight, left_front_headlight, right_back_headlight, left_back_headlight, exhaust, right_front_top, left_front_top, right_back_top, left_back_top, center - naming follows the original CarFusion / Occlusion-Net reference (dineshreddy91/Occlusion_Net/lib/data_loader/datasets/keypoint.py).
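The keypoint order above can be turned into a small lookup helper. The sketch below assumes that each predicted instance arrives as 14 rows of (x, y, confidence) in exactly this order. The helper name label_keypoints and the 0.5 confidence cut-off are hypothetical choices for illustration, not part of the repo.

```python
KEYPOINT_NAMES = [
    "right_front_wheel", "left_front_wheel",
    "right_back_wheel", "left_back_wheel",
    "right_front_headlight", "left_front_headlight",
    "right_back_headlight", "left_back_headlight",
    "exhaust",
    "right_front_top", "left_front_top",
    "right_back_top", "left_back_top",
    "center",
]

def label_keypoints(kpts, min_conf=0.5):
    """Map one instance's (x, y, conf) rows to names, dropping low-confidence points.

    Assumption: rows follow the keypoint order listed in this card.
    """
    if len(kpts) != len(KEYPOINT_NAMES):
        raise ValueError(f"expected {len(KEYPOINT_NAMES)} keypoints, got {len(kpts)}")
    return {
        name: (x, y)
        for name, (x, y, conf) in zip(KEYPOINT_NAMES, kpts)
        if conf >= min_conf
    }
```

With the Ultralytics results from the Usage section, one instance could be passed in as label_keypoints(r.keypoints.data[0].tolist()), provided the checkpoint emits a confidence value per keypoint.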

Intended use

Research and educational artifact demonstrating modern keypoint-detection pipelines on a non-human class. Not intended for any safety-critical, autonomous-driving, or surveillance deployment - the model is trained on a single academic dataset and has not been validated for production use.

License

Code + weights: MIT (see LICENSE).
Dataset: CarFusion © Carnegie Mellon University - redistributed under the dataset's original terms; cite Reddy et al. 2018 if you use the weights for research.

Note: This model card was generated from the ml-project-template scaffold.

Evaluation results

Self-reported metrics on CarFusion:

Metric	Value
oks_map	0.504
oks_map_50	0.704
oks_map_75	0.596
oks_map_medium	0.004
oks_map_large	0.513
pck_0.05	0.761
test_size	12761
n_predictions	42706