vehicle-keypoints - 14-point car pose estimation on CarFusion

14-keypoint vehicle pose estimation on the CarFusion (CMU) dataset: four wheels, four lights (two headlights and two tail-lights), four roof corners, the exhaust, and a body-centre reference point per car (canonical CarFusion schema, Reddy et al., CVPR 2018). The main weights (weights.pt) are an Ultralytics YOLO26-pose checkpoint; a ViTPose-S top-down baseline is published under the baseline/ subdirectory of this repo.

Synthetic-data sibling: kiselyovd/citysample-vehicle-keypoints-24pt - a 24-point model trained entirely on Unreal Engine 5 renders, from the ue5-vehicle-synth pipeline.

Metrics (test set, n=12761)

Model                 OKS-mAP  OKS-mAP@50  PCK@0.05  Params  Notes
--------------------  -------  ----------  --------  ------  -------------------------------
YOLO26-pose (ours)    50.4%    70.4%       76.1%     ~3M     YOLO26n-pose, 30 epochs
ViTPose-S (baseline)  0.1%     13.7%       -         85M     Top-down; 15 epochs, needs 100+
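
The card does not state how the metrics were computed, so the following is only a rough sketch of what the two metric families measure for a single car. It assumes PCK@0.05 counts a keypoint as correct when its pixel error is within 5% of the longer bounding-box side, and that OKS follows the COCO formula. The function names and the single constant sigma=0.05 are illustrative assumptions, not values taken from this repo.

```python
import numpy as np

def pck(pred, gt, visible, bbox_wh, alpha=0.05):
    """Fraction of visible keypoints within alpha * max(bbox side) of the ground truth."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    visible = np.asarray(visible, bool)
    if not visible.any():
        return 0.0
    dist = np.linalg.norm(pred - gt, axis=1)   # per-keypoint pixel error
    threshold = alpha * max(bbox_wh)           # assumed normalisation
    return float((dist[visible] <= threshold).mean())

def oks(pred, gt, visible, area, sigma=0.05):
    """COCO-style object keypoint similarity for one instance.

    area is the object area in pixels^2; sigma is a hypothetical
    per-keypoint constant shared by all 14 points.
    """
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    visible = np.asarray(visible, bool)
    if not visible.any():
        return 0.0
    d2 = ((pred - gt) ** 2).sum(axis=1)        # squared pixel error
    k = 2.0 * sigma                            # COCO uses k_i = 2 * sigma_i
    similarity = np.exp(-d2 / (2.0 * area * k ** 2))
    return float(similarity[visible].mean())
```

Under the COCO convention, OKS-mAP@50 would count a prediction as a match when its OKS is at least 0.5, and OKS-mAP would average precision over a range of stricter OKS thresholds.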

Usage

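from huggingface_hub import hf_hub_download
from ultralytics import YOLO

# Download the YOLO26-pose checkpoint from the Hub (cached after the first call).
ckpt = hf_hub_download(repo_id="kiselyovd/vehicle-keypoints", filename="weights.pt")
model = YOLO(ckpt)

# Run inference; predict() returns one Results object per input image.
results = model.predict("car.jpg")
for r in results:
    # One box (centre x, centre y, width, height), keypoint array and score per detected car.
    for box, kpts, score in zip(r.boxes.xywh, r.keypoints.data, r.boxes.conf):
        print(box.tolist(), score.item(), len(kpts))  # len(kpts) is the keypoint count (14)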
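
Each kpts entry in the loop above is the keypoint array for one detected car. A minimal sketch for discarding low-confidence points follows, assuming the checkpoint emits (x, y, confidence) triples, i.e. an array of shape (14, 3). The helper name confident_keypoints and the 0.5 threshold are illustrative choices, not part of the repo, and the demo array is synthetic.

```python
import numpy as np

def confident_keypoints(kpts, min_conf=0.5):
    """Return {keypoint_index: (x, y)} for points with confidence >= min_conf."""
    kpts = np.asarray(kpts, dtype=float)
    if kpts.ndim != 2 or kpts.shape[1] != 3:
        raise ValueError("expected an array of shape (num_keypoints, 3)")
    return {
        i: (float(x), float(y))
        for i, (x, y, conf) in enumerate(kpts)
        if conf >= min_conf
    }

# Synthetic stand-in for one car's 14 keypoints: (x, y, confidence).
demo = np.zeros((14, 3))
demo[0] = (120.0, 340.0, 0.91)
demo[1] = (410.0, 338.0, 0.12)
demo[13] = (260.0, 250.0, 0.77)

print(confident_keypoints(demo))
```

With real predictions, kpts.cpu().numpy() converts the tensor from the loop into an array this helper accepts.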

The baseline weights (ViTPose-S, Hugging Face safetensors format plus processor config) live under the baseline/ subdirectory of this repo and are loaded via transformers; see the GitHub README for the inference snippet.

Visualizations

14-point CarFusion keypoint schema (left) and predictions on test images (right):

[Figure, left: 14-point CarFusion keypoint schema] [Figure, right: Predicted keypoints on CarFusion test images]

Source

  • Code: https://github.com/kiselyovd/vehicle-keypoints
  • Dataset: CarFusion - N. Dinesh Reddy, Minh Vo, Srinivasa Narasimhan, "CarFusion: Combining Point Tracking and Part Detection for Dynamic 3D Reconstruction of Vehicles", CVPR 2018. © Carnegie Mellon University.
  • Keypoint order (14): right_front_wheel, left_front_wheel, right_back_wheel, left_back_wheel, right_front_headlight, left_front_headlight, right_back_headlight, left_back_headlight, exhaust, right_front_top, left_front_top, right_back_top, left_back_top, center - naming follows the original CarFusion / Occlusion-Net reference (dineshreddy91/Occlusion_Net/lib/data_loader/datasets/keypoint.py).
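
The keypoint order above can be turned into a name lookup and a left/right swap table. This is a sketch only: the names are copied from the list above, but the mirror_name helper is hypothetical, and whether this repo's training config uses the resulting permutation is an assumption. Ultralytics pose dataset YAML files accept such a permutation under the flip_idx key for horizontal-flip augmentation.

```python
KEYPOINT_NAMES = [
    "right_front_wheel", "left_front_wheel", "right_back_wheel", "left_back_wheel",
    "right_front_headlight", "left_front_headlight",
    "right_back_headlight", "left_back_headlight",
    "exhaust",
    "right_front_top", "left_front_top", "right_back_top", "left_back_top",
    "center",
]
NAME_TO_INDEX = {name: i for i, name in enumerate(KEYPOINT_NAMES)}

def mirror_name(name):
    """Swap a left_/right_ prefix; unpaired points (exhaust, center) map to themselves."""
    if name.startswith("left_"):
        return "right_" + name[len("left_"):]
    if name.startswith("right_"):
        return "left_" + name[len("right_"):]
    return name

# Index permutation that exchanges left and right keypoints under a horizontal flip.
FLIP_IDX = [NAME_TO_INDEX[mirror_name(name)] for name in KEYPOINT_NAMES]
print(FLIP_IDX)  # [1, 0, 3, 2, 5, 4, 7, 6, 8, 10, 9, 12, 11, 13]
```

Treating the exhaust as its own mirror image is a simplification, since an exhaust is not necessarily on the vehicle's centre line.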

Intended use

Research and educational artifact demonstrating modern keypoint-detection pipelines on a non-human class. Not intended for any safety-critical, autonomous-driving, or surveillance deployment - the model is trained on a single academic dataset and has not been validated for production use.

License

  • Code + weights: MIT (see LICENSE).
  • Dataset: CarFusion © Carnegie Mellon University - redistributed under the dataset's original terms; cite Reddy et al. 2018 if you use the weights for research.

Note: This model card was generated from the ml-project-template scaffold.
