---
library_name: libreyolo
pipeline_tag: object-detection
license: mit
tags:
  - libreyolo
  - yolov9
  - visdrone
  - aerial-imagery
  - object-detection
datasets:
  - Voxel51/VisDrone2019-DET
---

# ander2221/visdrone-yolo9-preview

YOLOv9-t fine-tuned on VisDrone2019-DET aerial imagery using
[LibreYOLO](https://github.com/LibreYOLO/libreyolo). The model detects ten
classes (pedestrian, people, bicycle, car, van, truck, tricycle,
awning-tricycle, bus, motor) in images taken from a top-down drone perspective.

**Companion use case:** [LibreYOLO/use-cases/visdrone-finetune](https://github.com/LibreYOLO/use-cases/tree/main/visdrone-finetune).

## Training

- size: `t`
- imgsz: `384`
- epochs: `5`
- dataset: VisDrone2019-DET via Voxel51's HuggingFace mirror
- compute: Apple Metal Performance Shaders (MPS, M-series GPU)

## Metrics

```json
{}
```

## Usage — Python

```python
from huggingface_hub import hf_hub_download
from libreyolo import LibreYOLO

ckpt = hf_hub_download(repo_id="ander2221/visdrone-yolo9-preview", filename="visdrone.pt")
model = LibreYOLO(ckpt)
result = model("aerial.jpg")
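# Class names in index order (0-9), matching the "Classes" table below.
CLASS_NAMES = ["pedestrian", "people", "bicycle", "car", "van", "truck",
               "tricycle", "awning-tricycle", "bus", "motor"]
for box, cls, conf in zip(result.boxes.xyxy, result.boxes.cls, result.boxes.conf):
    print(box, CLASS_NAMES[int(cls)], float(conf))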
```

## Usage — ONNX (browser, edge, cross-runtime)

```python
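import numpy as np
import onnxruntime as ort
from huggingface_hub import hf_hub_download
from PIL import Image

onnx = hf_hub_download(repo_id="ander2221/visdrone-yolo9-preview", filename="visdrone.onnx")
session = ort.InferenceSession(onnx, providers=["CPUExecutionProvider"])
# Preprocess image to (1, 3, 384, 384) float32 in [0,1].
# A plain resize is assumed here; use letterboxing instead if the export expects it.
img = Image.open("aerial.jpg").convert("RGB").resize((384, 384))
preprocessed = np.asarray(img, dtype=np.float32).transpose(2, 0, 1)[None] / 255.0
out = session.run(None, {"images": preprocessed})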
```
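
Boxes predicted on the 384×384 input need to be mapped back to the original image before drawing. The sketch below assumes the image was stretched to 384×384 with a plain resize and that boxes come out as `(x1, y1, x2, y2)` in input pixels; neither is stated in this card, so check the export's actual output layout first. `scale_box_to_original` is a hypothetical helper, not part of LibreYOLO or onnxruntime.

```python
def scale_box_to_original(box, orig_width, orig_height, input_size=384):
    """Map an (x1, y1, x2, y2) box from model-input pixels to original-image pixels.

    Assumes a plain (non-letterboxed) resize to input_size x input_size.
    """
    sx = orig_width / input_size   # horizontal stretch factor
    sy = orig_height / input_size  # vertical stretch factor
    x1, y1, x2, y2 = box

    def clamp(value, upper):
        # Keep coordinates inside the original image bounds.
        return min(max(value, 0.0), float(upper))

    return (
        clamp(x1 * sx, orig_width),
        clamp(y1 * sy, orig_height),
        clamp(x2 * sx, orig_width),
        clamp(y2 * sy, orig_height),
    )
```

If the export turns out to expect letterboxed input, the padding offsets would have to be subtracted before scaling.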

A live browser demo using this ONNX is at
https://libreyolo.github.io/use-cases/visdrone-finetune/demo/
(zero-install, runs locally in Chrome via WebGPU/onnxruntime-web).

## Classes (index → name)

| idx | name |
|---|---|
| 0 | pedestrian |
| 1 | people |
| 2 | bicycle |
| 3 | car |
| 4 | van |
| 5 | truck |
| 6 | tricycle |
| 7 | awning-tricycle |
| 8 | bus |
| 9 | motor |
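
As a small illustration of using the index table, the sketch below tallies detections per class name. `count_by_class` is a hypothetical helper; it assumes class ids arrive as numbers convertible with `int()`, as in the Python usage example above.

```python
from collections import Counter

# Index order follows the table above.
CLASS_NAMES = ("pedestrian", "people", "bicycle", "car", "van", "truck",
               "tricycle", "awning-tricycle", "bus", "motor")


def count_by_class(class_ids):
    """Return a {class name: count} dict for an iterable of class indices."""
    counts = Counter()
    for cls in class_ids:
        idx = int(cls)
        if not 0 <= idx < len(CLASS_NAMES):
            raise ValueError(f"class index out of range: {idx}")
        counts[CLASS_NAMES[idx]] += 1
    return dict(counts)
```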

## License

The model file is MIT-licensed. The VisDrone2019-DET dataset is governed by its
own [license terms](http://aiskyeye.com/); please review them for your use case.