gloriforge aivertex95827 committed
Commit fa40fef · 0 Parent(s)

Duplicate from aivertex95827/turbo3

Co-authored-by: Jonas <aivertex95827@users.noreply.huggingface.co>
.gitattributes ADDED
@@ -0,0 +1,36 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ osnet_model.pth.tar-100 filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,92 @@
+ # 🚀 Example Chute for Turbovision 🪂
+
+ This repository demonstrates how to deploy a **Chute** via the **Turbovision CLI**, hosted on **Hugging Face Hub**.
+ It serves as a minimal example showcasing the required structure and workflow for integrating machine learning models, preprocessing, and orchestration into a reproducible Chute environment.
+
+ ## Repository Structure
+ The following two files **must be present** (in their current locations) for a successful deployment — their content can be modified as needed:
+
+ | File | Purpose |
+ |------|----------|
+ | `miner.py` | Defines the ML model type(s), orchestration, and all pre/postprocessing logic. |
+ | `config.yml` | Specifies machine configuration (e.g., GPU type, memory, environment variables). |
+
+ Other files — e.g., model weights, utility scripts, or dependencies — are **optional** and can be included as needed for your model. Note: any required assets must be defined or contained **within this repo**, which is fully open-source, since all network-related operations (downloading challenge data, weights, etc.) are disabled **inside the Chute**.
+
+ ## Overview
+
+ Below is a high-level diagram showing the interaction between Hugging Face, Chutes and Turbovision:
+
+ ![](../images/miner.png)
+
+ ## Local Testing
+ After editing `config.yml` and `miner.py` and saving them to your Hugging Face repo, you will want to test that everything works locally.
+
+ 1. Copy the file `scorevision/chute_tmeplate/turbovision_chute.py.j2` as a Python file called `my_chute.py` and fill in the missing variables:
+ ```python
+ HF_REPO_NAME = "{{ huggingface_repository_name }}"
+ HF_REPO_REVISION = "{{ huggingface_repository_revision }}"
+ CHUTES_USERNAME = "{{ chute_username }}"
+ CHUTE_NAME = "{{ chute_name }}"
+ ```
+
+ 2. Run the following command to build the chute locally (caution: there are known issues with the Docker location when running this on a Mac):
+ ```bash
+ chutes build my_chute:chute --local --public
+ ```
+
+ 3. Run the Docker image just built (its name is `CHUTE_NAME`) and enter it:
+ ```bash
+ docker run -p 8000:8000 -e CHUTES_EXECUTION_CONTEXT=REMOTE -it <image-name> /bin/bash
+ ```
+
+ 4. Run the file from within the container:
+ ```bash
+ chutes run my_chute:chute --dev --debug
+ ```
+
+ 5. In another terminal, test the local endpoints to ensure there are no bugs:
+ ```bash
+ curl -X POST http://localhost:8000/health -d '{}'
+ curl -X POST http://localhost:8000/predict -d '{"url": "https://scoredata.me/2025_03_14/35ae7a/h1_0f2ca0.mp4","meta": {}}'
+ ```
+
+ ## Live Testing
+ 1. If you have a chute with the same name (i.e. from a previous deployment), delete it first (or you will get an error when trying to build):
+ ```bash
+ chutes chutes list
+ ```
+ Take note of the chute id that you wish to delete (if any):
+ ```bash
+ chutes chutes delete <chute-id>
+ ```
+
+ You should also delete its associated image:
+ ```bash
+ chutes images list
+ ```
+ Take note of the chute image id:
+ ```bash
+ chutes images delete <chute-image-id>
+ ```
+
+ 2. Use Turbovision's CLI to build, deploy and commit on-chain (note: you can skip the on-chain commit using `--no-commit`; you can also point to a past Hugging Face revision using `--revision`, and/or choose the local files to upload to your Hugging Face repo using `--model-path`):
+ ```bash
+ sv -vv push
+ ```
+
+ 3. When completed, warm up the chute (if it's cold 🧊). You can confirm its status using `chutes chutes list`, or `chutes chutes get <chute-id>` if you already know its id. Note: warming up can sometimes take a while, but if the chute runs without errors (it should, if you've tested locally first) and there are sufficient nodes (i.e. machines) available matching the `config.yml` you specified, the chute should become hot 🔥!
+ ```bash
+ chutes warmup <chute-id>
+ ```
+
+ 4. Test the chute's endpoints:
+ ```bash
+ curl -X POST https://<YOUR-CHUTE-SLUG>.chutes.ai/health -d '{}' -H "Authorization: Bearer $CHUTES_API_KEY"
+ curl -X POST https://<YOUR-CHUTE-SLUG>.chutes.ai/predict -d '{"url": "https://scoredata.me/2025_03_14/35ae7a/h1_0f2ca0.mp4","meta": {}}' -H "Authorization: Bearer $CHUTES_API_KEY"
+ ```
+
+ 5. Test what your chute would get on a validator (this also applies any validation/integrity checks, which may fail if you did not use the Turbovision CLI above to deploy the chute):
+ ```bash
+ sv -vv run-once
+ ```
chute_config.yml ADDED
@@ -0,0 +1,29 @@
+ Image:
+   from_base: parachutes/python:3.12
+   run_command:
+     - pip install --upgrade setuptools wheel
+     - pip install --index-url https://download.pytorch.org/whl/cu128 torch torchvision
+     - pip install "ultralytics==8.3.222" "opencv-python-headless" "numpy" "pydantic"
+     - pip install scikit-learn
+     - pip install onnxruntime-gpu
+   set_workdir: /app
+   readme: "Image for chutes"
+
+ NodeSelector:
+   gpu_count: 1
+   min_vram_gb_per_gpu: 24
+   min_memory_gb: 32
+   min_cpu_count: 32
+
+   exclude:
+     - "5090"
+     - b200
+     - h200
+     - mi300x
+
+ Chute:
+   timeout_seconds: 900
+   concurrency: 4
+   max_instances: 5
+   scaling_threshold: 0.3
+   shutdown_after_seconds: 600000
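As a sanity check on the `NodeSelector` block above, the exclude list can be exercised in plain Python. The `gpu_allowed` helper below is hypothetical (the real scheduler-side matching logic is not part of this repo); the field values are hand-copied from the config:

```python
# Hand-copied from chute_config.yml; field names follow that file.
node_selector = {
    "gpu_count": 1,
    "min_vram_gb_per_gpu": 24,
    "min_memory_gb": 32,
    "min_cpu_count": 32,
    "exclude": ["5090", "b200", "h200", "mi300x"],
}

def gpu_allowed(gpu_name: str, selector: dict) -> bool:
    # Hypothetical check: a GPU model is eligible unless it is excluded.
    return gpu_name.lower() not in {g.lower() for g in selector["exclude"]}
```

So an A100 node would match this selector, while an H200 node would be filtered out by the exclude list.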
hrnetv2_w48.yaml ADDED
@@ -0,0 +1,35 @@
+ MODEL:
+   IMAGE_SIZE: [960, 540]
+   NUM_JOINTS: 58
+   PRETRAIN: ''
+   EXTRA:
+     FINAL_CONV_KERNEL: 1
+     STAGE1:
+       NUM_MODULES: 1
+       NUM_BRANCHES: 1
+       BLOCK: BOTTLENECK
+       NUM_BLOCKS: [4]
+       NUM_CHANNELS: [64]
+       FUSE_METHOD: SUM
+     STAGE2:
+       NUM_MODULES: 1
+       NUM_BRANCHES: 2
+       BLOCK: BASIC
+       NUM_BLOCKS: [4, 4]
+       NUM_CHANNELS: [48, 96]
+       FUSE_METHOD: SUM
+     STAGE3:
+       NUM_MODULES: 4
+       NUM_BRANCHES: 3
+       BLOCK: BASIC
+       NUM_BLOCKS: [4, 4, 4]
+       NUM_CHANNELS: [48, 96, 192]
+       FUSE_METHOD: SUM
+     STAGE4:
+       NUM_MODULES: 3
+       NUM_BRANCHES: 4
+       BLOCK: BASIC
+       NUM_BLOCKS: [4, 4, 4, 4]
+       NUM_CHANNELS: [48, 96, 192, 384]
+       FUSE_METHOD: SUM
+
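The `NUM_CHANNELS` lists in the config above follow the standard HRNet-W48 pattern: a base width of 48 that doubles at each successively lower-resolution branch. A quick sketch of that progression (the helper name is made up for illustration):

```python
BASE_WIDTH = 48  # the "w48" in hrnetv2_w48

def branch_channels(num_branches: int, base: int = BASE_WIDTH) -> list:
    # Each extra branch halves the spatial resolution and doubles the width.
    return [base * (2 ** i) for i in range(num_branches)]

# branch_channels(4) reproduces STAGE4's NUM_CHANNELS: [48, 96, 192, 384]
```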
keypoint_detect.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7ea78fa76aaf94976a8eca428d6e3c59697a93430cba1a4603e20284b61f5113
+ size 264964645
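`keypoint_detect.pt` is stored via Git LFS, so the blob shown above is only a pointer file: plain text with space-separated key/value lines. That makes it easy to inspect without fetching the weights; a minimal sketch (the `parse_lfs_pointer` helper is illustrative, not part of the repo):

```python
# The pointer text above, copied verbatim.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:7ea78fa76aaf94976a8eca428d6e3c59697a93430cba1a4603e20284b61f5113
size 264964645
"""

def parse_lfs_pointer(text: str) -> dict:
    # Each pointer line is "<key> <value>"; split on the first space.
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

info = parse_lfs_pointer(POINTER)  # info["size"] is the true blob size in bytes
```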
miner.py ADDED
@@ -0,0 +1,1073 @@
1
+ from pathlib import Path
2
+ from concurrent.futures import ThreadPoolExecutor
3
+ from ultralytics import YOLO
4
+ from numpy import ndarray
5
+ from pydantic import BaseModel
6
+ from typing import List, Tuple, Optional, Dict
7
+ import numpy as np
8
+ import cv2
9
+ from sklearn.cluster import KMeans
10
+ import torch
11
+ import torch.nn as nn
12
+ import torch.nn.functional as F
13
+ import yaml
14
+ import gc
15
+ import os
16
+ import sys
17
+ from collections import OrderedDict, defaultdict
18
+ from PIL import Image
19
+ import torchvision.transforms as T
20
+
21
+ # ── Grass / kit helpers ────────────────────────────────
22
+
23
+ def get_grass_color(img: np.ndarray) -> Tuple[int, int, int]:
24
+ if img is None or img.size == 0:
25
+ return (0, 0, 0)
26
+ hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
27
+ lower_green = np.array([30, 40, 40])
28
+ upper_green = np.array([80, 255, 255])
29
+ mask = cv2.inRange(hsv, lower_green, upper_green)
30
+ grass_color = cv2.mean(img, mask=mask)
31
+ return grass_color[:3]
32
+
33
+ def get_players_boxes(result):
34
+ players_imgs, players_boxes = [], []
35
+ for box in result.boxes:
36
+ label = int(box.cls.cpu().numpy()[0])
37
+ if label == 2:
38
+ x1, y1, x2, y2 = map(int, box.xyxy[0].cpu().numpy())
39
+ crop = result.orig_img[y1:y2, x1:x2]
40
+ if crop.size > 0:
41
+ players_imgs.append(crop)
42
+ players_boxes.append((x1, y1, x2, y2))
43
+ return players_imgs, players_boxes
44
+
45
+ def get_kits_colors(players, grass_hsv=None, frame=None):
46
+ kits_colors = []
47
+ if grass_hsv is None:
48
+ grass_color = get_grass_color(frame)
49
+ grass_hsv = cv2.cvtColor(np.uint8([[list(grass_color)]]), cv2.COLOR_BGR2HSV)
50
+ for player_img in players:
51
+ hsv = cv2.cvtColor(player_img, cv2.COLOR_BGR2HSV)
52
+ lower_green = np.array([grass_hsv[0, 0, 0] - 10, 40, 40])
53
+ upper_green = np.array([grass_hsv[0, 0, 0] + 10, 255, 255])
54
+ mask = cv2.inRange(hsv, lower_green, upper_green)
55
+ mask = cv2.bitwise_not(mask)
56
+ upper_mask = np.zeros(player_img.shape[:2], np.uint8)
57
+ upper_mask[0:player_img.shape[0] // 2, :] = 255
58
+ mask = cv2.bitwise_and(mask, upper_mask)
59
+ kit_color = np.array(cv2.mean(player_img, mask=mask)[:3])
60
+ kits_colors.append(kit_color)
61
+ return kits_colors
62
+
63
+
64
+ # ── OSNet team classification (turbo_7 style) ────────────────
65
+
66
+ TEAM_1_ID = 6
67
+ TEAM_2_ID = 7
68
+ PLAYER_CLS_ID = 2
69
+ _OSNET_MODEL = None
70
+ osnet_weight_path = None
71
+
72
+ OSNET_IMAGE_SIZE = (64, 32) # (height, width)
73
+ OSNET_PREPROCESS = T.Compose([
74
+ T.Resize(OSNET_IMAGE_SIZE),
75
+ T.ToTensor(),
76
+ T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
77
+ ])
78
+
79
+
80
+ def _crop_upper_body(frame: ndarray, box: "BoundingBox") -> ndarray:
81
+ return frame[
82
+ max(0, box.y1):max(0, box.y2),
83
+ max(0, box.x1):max(0, box.x2)
84
+ ]
85
+
86
+
87
+ def _preprocess_osnet(crop: ndarray) -> torch.Tensor:
88
+ rgb = cv2.cvtColor(crop, cv2.COLOR_BGR2RGB)
89
+ pil = Image.fromarray(rgb)
90
+ return OSNET_PREPROCESS(pil)
91
+
92
+
93
+ def _filter_player_boxes(boxes: List["BoundingBox"]) -> List["BoundingBox"]:
94
+ return [b for b in boxes if b.cls_id == PLAYER_CLS_ID]
95
+
96
+
97
+ def _extract_osnet_embeddings(
98
+ frames: List[ndarray],
99
+ batch_boxes: Dict[int, List["BoundingBox"]],
100
+ device: str = "cuda",
101
+ ) -> Tuple[Optional[ndarray], Optional[List["BoundingBox"]]]:
102
+ global _OSNET_MODEL
103
+ crops = []
104
+ meta = []
105
+ sorted_frame_ids = sorted(batch_boxes.keys())
106
+ for idx, frame_idx in enumerate(sorted_frame_ids):
107
+ frame = frames[idx] if idx < len(frames) else None
108
+ if frame is None:
109
+ continue
110
+ boxes = batch_boxes[frame_idx]
111
+ players = _filter_player_boxes(boxes)
112
+ for box in players:
113
+ crop = _crop_upper_body(frame, box)
114
+ if crop.size == 0:
115
+ continue
116
+ crops.append(_preprocess_osnet(crop))
117
+ meta.append(box)
118
+ if not crops:
119
+ return None, None
120
+ batch = torch.stack(crops).to(device).float()
121
+ with torch.inference_mode():
122
+ embeddings = _OSNET_MODEL(batch)
123
+ del batch
124
+ embeddings = embeddings.cpu().numpy()
125
+ return embeddings, meta
126
+
127
+
128
+ def _aggregate_by_track(
129
+ embeddings: ndarray,
130
+ meta: List["BoundingBox"],
131
+ ) -> Tuple[ndarray, List["BoundingBox"]]:
132
+ track_map = defaultdict(list)
133
+ box_map = {}
134
+ for emb, box in zip(embeddings, meta):
135
+ key = box.track_id if box.track_id is not None else id(box)
136
+ track_map[key].append(emb)
137
+ box_map[key] = box
138
+ agg_embeddings = []
139
+ agg_boxes = []
140
+ for key, embs in track_map.items():
141
+ mean_emb = np.mean(embs, axis=0)
142
+ norm = np.linalg.norm(mean_emb)
143
+ if norm > 1e-12:
144
+ mean_emb /= norm
145
+ agg_embeddings.append(mean_emb)
146
+ agg_boxes.append(box_map[key])
147
+ return np.array(agg_embeddings), agg_boxes
148
+
149
+
150
+ def _update_team_ids(boxes: List["BoundingBox"], labels: ndarray) -> None:
151
+ for box, label in zip(boxes, labels):
152
+ box.cls_id = TEAM_1_ID if label == 0 else TEAM_2_ID
153
+
154
+
155
+ def _classify_teams_batch(
156
+ frames: List[ndarray],
157
+ batch_boxes: Dict[int, List["BoundingBox"]],
158
+ device: str = "cuda",
159
+ ) -> None:
160
+ embeddings, meta = _extract_osnet_embeddings(frames, batch_boxes, device)
161
+ if embeddings is None:
162
+ return
163
+ embeddings, agg_boxes = _aggregate_by_track(embeddings, meta)
164
+ n = len(embeddings)
165
+ if n == 0:
166
+ return
167
+ if n == 1:
168
+ agg_boxes[0].cls_id = TEAM_1_ID
169
+ return
170
+ kmeans = KMeans(n_clusters=2, n_init=2, random_state=42)
171
+ kmeans.fit(embeddings)
172
+ centroids = kmeans.cluster_centers_
173
+ c0, c1 = centroids[0], centroids[1]
174
+ norm_0 = np.linalg.norm(c0)
175
+ norm_1 = np.linalg.norm(c1)
176
+ similarity = np.dot(c0, c1) / (norm_0 * norm_1 + 1e-12)
177
+ if similarity > 0.95:
178
+ for b in agg_boxes:
179
+ b.cls_id = TEAM_1_ID
180
+ return
181
+ if norm_0 <= norm_1:
182
+ kmeans.labels_ = 1 - kmeans.labels_
183
+ _update_team_ids(agg_boxes, kmeans.labels_)
184
+
185
+
186
+ class ConvLayer(nn.Module):
187
+ def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=0, groups=1, IN=False):
188
+ super().__init__()
189
+ self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride=stride, padding=padding, bias=False, groups=groups)
190
+ self.bn = nn.InstanceNorm2d(out_channels, affine=True) if IN else nn.BatchNorm2d(out_channels)
191
+ self.relu = nn.ReLU()
192
+
193
+ def forward(self, x):
194
+ return self.relu(self.bn(self.conv(x)))
195
+
196
+
197
+ class Conv1x1(nn.Module):
198
+ def __init__(self, in_channels, out_channels, stride=1, groups=1):
199
+ super().__init__()
200
+ self.conv = nn.Conv2d(in_channels, out_channels, 1, stride=stride, padding=0, bias=False, groups=groups)
201
+ self.bn = nn.BatchNorm2d(out_channels)
202
+ self.relu = nn.ReLU()
203
+
204
+ def forward(self, x):
205
+ return self.relu(self.bn(self.conv(x)))
206
+
207
+
208
+ class Conv1x1Linear(nn.Module):
209
+ def __init__(self, in_channels, out_channels, stride=1, bn=True):
210
+ super().__init__()
211
+ self.conv = nn.Conv2d(in_channels, out_channels, 1, stride=stride, padding=0, bias=False)
212
+ self.bn = nn.BatchNorm2d(out_channels) if bn else None
213
+
214
+ def forward(self, x):
215
+ x = self.conv(x)
216
+ return self.bn(x) if self.bn is not None else x
217
+
218
+
219
+ class Conv3x3(nn.Module):
220
+ def __init__(self, in_channels, out_channels, stride=1, groups=1):
221
+ super().__init__()
222
+ self.conv = nn.Conv2d(in_channels, out_channels, 3, stride=stride, padding=1, bias=False, groups=groups)
223
+ self.bn = nn.BatchNorm2d(out_channels)
224
+ self.relu = nn.ReLU()
225
+
226
+ def forward(self, x):
227
+ return self.relu(self.bn(self.conv(x)))
228
+
229
+
230
+ class LightConv3x3(nn.Module):
231
+ def __init__(self, in_channels, out_channels):
232
+ super().__init__()
233
+ self.conv1 = nn.Conv2d(in_channels, out_channels, 1, stride=1, padding=0, bias=False)
234
+ self.conv2 = nn.Conv2d(out_channels, out_channels, 3, stride=1, padding=1, bias=False, groups=out_channels)
235
+ self.bn = nn.BatchNorm2d(out_channels)
236
+ self.relu = nn.ReLU()
237
+
238
+ def forward(self, x):
239
+ x = self.conv1(x)
240
+ x = self.conv2(x)
241
+ return self.relu(self.bn(x))
242
+
243
+
244
+ class LightConvStream(nn.Module):
245
+ def __init__(self, in_channels, out_channels, depth):
246
+ super().__init__()
247
+ layers = [LightConv3x3(in_channels, out_channels)]
248
+ for _ in range(depth - 1):
249
+ layers.append(LightConv3x3(out_channels, out_channels))
250
+ self.layers = nn.Sequential(*layers)
251
+
252
+ def forward(self, x):
253
+ return self.layers(x)
254
+
255
+
256
+ class ChannelGate(nn.Module):
257
+ def __init__(self, in_channels, num_gates=None, return_gates=False, gate_activation='sigmoid', reduction=16, layer_norm=False):
258
+ super().__init__()
259
+ if num_gates is None:
260
+ num_gates = in_channels
261
+ self.return_gates = return_gates
262
+ self.global_avgpool = nn.AdaptiveAvgPool2d(1)
263
+ self.fc1 = nn.Conv2d(in_channels, in_channels // reduction, kernel_size=1, bias=True, padding=0)
264
+ self.norm1 = nn.LayerNorm((in_channels // reduction, 1, 1)) if layer_norm else None
265
+ self.relu = nn.ReLU()
266
+ self.fc2 = nn.Conv2d(in_channels // reduction, num_gates, kernel_size=1, bias=True, padding=0)
267
+ self.gate_activation = nn.Sigmoid() if gate_activation == 'sigmoid' else nn.ReLU()
268
+
269
+ def forward(self, x):
270
+ input = x
271
+ x = self.global_avgpool(x)
272
+ x = self.fc1(x)
273
+ if self.norm1 is not None:
274
+ x = self.norm1(x)
275
+ x = self.relu(x)
276
+ x = self.fc2(x)
277
+ if self.gate_activation is not None:
278
+ x = self.gate_activation(x)
279
+ return x if self.return_gates else input * x
280
+
281
+
282
+ class OSBlockX1(nn.Module):
283
+ def __init__(self, in_channels, out_channels, IN=False, bottleneck_reduction=4):
284
+ super().__init__()
285
+ mid_channels = out_channels // bottleneck_reduction
286
+ self.conv1 = Conv1x1(in_channels, mid_channels)
287
+ self.conv2a = LightConv3x3(mid_channels, mid_channels)
288
+ self.conv2b = nn.Sequential(LightConv3x3(mid_channels, mid_channels), LightConv3x3(mid_channels, mid_channels))
289
+ self.conv2c = nn.Sequential(LightConv3x3(mid_channels, mid_channels), LightConv3x3(mid_channels, mid_channels), LightConv3x3(mid_channels, mid_channels))
290
+ self.conv2d = nn.Sequential(LightConv3x3(mid_channels, mid_channels), LightConv3x3(mid_channels, mid_channels), LightConv3x3(mid_channels, mid_channels), LightConv3x3(mid_channels, mid_channels))
291
+ self.gate = ChannelGate(mid_channels)
292
+ self.conv3 = Conv1x1Linear(mid_channels, out_channels)
293
+ self.downsample = Conv1x1Linear(in_channels, out_channels) if in_channels != out_channels else None
294
+ self.IN = nn.InstanceNorm2d(out_channels, affine=True) if IN else None
295
+
296
+ def forward(self, x):
297
+ identity = x
298
+ x1 = self.conv1(x)
299
+ x2 = self.gate(self.conv2a(x1)) + self.gate(self.conv2b(x1)) + self.gate(self.conv2c(x1)) + self.gate(self.conv2d(x1))
300
+ x3 = self.conv3(x2)
301
+ if self.downsample is not None:
302
+ identity = self.downsample(identity)
303
+ out = x3 + identity
304
+ if self.IN is not None:
305
+ out = self.IN(out)
306
+ return F.relu(out)
307
+
308
+
309
+ class OSNetX1(nn.Module):
310
+ def __init__(self, num_classes, blocks, layers, channels, feature_dim=512, loss='softmax', IN=False):
311
+ super().__init__()
312
+ self.loss = loss
313
+ self.feature_dim = feature_dim
314
+ self.conv1 = ConvLayer(3, channels[0], 7, stride=2, padding=3, IN=IN)
315
+ self.maxpool = nn.MaxPool2d(3, stride=2, padding=1)
316
+ self.conv2 = self._make_layer(blocks[0], layers[0], channels[0], channels[1], reduce_spatial_size=True, IN=IN)
317
+ self.conv3 = self._make_layer(blocks[1], layers[1], channels[1], channels[2], reduce_spatial_size=True)
318
+ self.conv4 = self._make_layer(blocks[2], layers[2], channels[2], channels[3], reduce_spatial_size=False)
319
+ self.conv5 = Conv1x1(channels[3], channels[3])
320
+ self.global_avgpool = nn.AdaptiveAvgPool2d(1)
321
+ self.fc = self._construct_fc_layer(feature_dim, channels[3], dropout_p=None)
322
+ self.classifier = nn.Linear(self.feature_dim, num_classes)
323
+ self._init_params()
324
+
325
+ def _make_layer(self, block, layer, in_channels, out_channels, reduce_spatial_size, IN=False):
326
+ layers_list = [block(in_channels, out_channels, IN=IN)]
327
+ for _ in range(1, layer):
328
+ layers_list.append(block(out_channels, out_channels, IN=IN))
329
+ if reduce_spatial_size:
330
+ layers_list.append(nn.Sequential(Conv1x1(out_channels, out_channels), nn.AvgPool2d(2, stride=2)))
331
+ return nn.Sequential(*layers_list)
332
+
333
+ def _construct_fc_layer(self, fc_dims, input_dim, dropout_p=None):
334
+ if fc_dims is None or fc_dims < 0:
335
+ self.feature_dim = input_dim
336
+ return None
337
+ if isinstance(fc_dims, int):
338
+ fc_dims = [fc_dims]
339
+ layers_list = []
340
+ for dim in fc_dims:
341
+ layers_list.append(nn.Linear(input_dim, dim))
342
+ layers_list.append(nn.BatchNorm1d(dim))
343
+ layers_list.append(nn.ReLU(inplace=True))
344
+ if dropout_p is not None:
345
+ layers_list.append(nn.Dropout(p=dropout_p))
346
+ input_dim = dim
347
+ self.feature_dim = fc_dims[-1]
348
+ return nn.Sequential(*layers_list)
349
+
350
+ def _init_params(self):
351
+ for m in self.modules():
352
+ if isinstance(m, nn.Conv2d):
353
+ nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
354
+ if m.bias is not None:
355
+ nn.init.constant_(m.bias, 0)
356
+ elif isinstance(m, nn.BatchNorm2d):
357
+ nn.init.constant_(m.weight, 1)
358
+ nn.init.constant_(m.bias, 0)
359
+ elif isinstance(m, nn.BatchNorm1d):
360
+ nn.init.constant_(m.weight, 1)
361
+ nn.init.constant_(m.bias, 0)
362
+ elif isinstance(m, nn.InstanceNorm2d):
363
+ nn.init.constant_(m.weight, 1)
364
+ nn.init.constant_(m.bias, 0)
365
+ elif isinstance(m, nn.Linear):
366
+ nn.init.normal_(m.weight, 0, 0.01)
367
+ if m.bias is not None:
368
+ nn.init.constant_(m.bias, 0)
369
+
370
+ def forward(self, x, return_featuremaps=False):
371
+ x = self.conv1(x)
372
+ x = self.maxpool(x)
373
+ x = self.conv2(x)
374
+ x = self.conv3(x)
375
+ x = self.conv4(x)
376
+ x = self.conv5(x)
377
+ if return_featuremaps:
378
+ return x
379
+ v = self.global_avgpool(x)
380
+ v = v.view(v.size(0), -1)
381
+ if self.fc is not None:
382
+ v = self.fc(v)
383
+ if not self.training:
384
+ return v
385
+ y = self.classifier(v)
386
+ if self.loss == 'softmax':
387
+ return y
388
+ elif self.loss == 'triplet':
389
+ return y, v
390
+ raise KeyError(f"Unsupported loss: {self.loss}")
391
+
392
+
393
+ def osnet_x1_0(num_classes=1000, pretrained=True, loss='softmax', **kwargs):
394
+ return OSNetX1(
395
+ num_classes,
396
+ blocks=[OSBlockX1, OSBlockX1, OSBlockX1],
397
+ layers=[2, 2, 2],
398
+ channels=[64, 256, 384, 512],
399
+ loss=loss,
400
+ **kwargs,
401
+ )
402
+
403
+
404
+ def load_checkpoint_osnet(fpath):
405
+ fpath = os.path.abspath(os.path.expanduser(fpath))
406
+ map_location = None if torch.cuda.is_available() else 'cpu'
407
+ checkpoint = torch.load(fpath, map_location=map_location, weights_only=False)
408
+ return checkpoint
409
+
410
+
411
+ def load_pretrained_weights_osnet(model, weight_path):
412
+ checkpoint = load_checkpoint_osnet(weight_path)
413
+ state_dict = checkpoint.get('state_dict', checkpoint)
414
+ model_dict = model.state_dict()
415
+ new_state_dict = OrderedDict()
416
+ for k, v in state_dict.items():
417
+ if k.startswith('module.'):
418
+ k = k[7:]
419
+ if k in model_dict and model_dict[k].size() == v.size():
420
+ new_state_dict[k] = v
421
+ model_dict.update(new_state_dict)
422
+ model.load_state_dict(model_dict)
423
+
424
+
425
+ def load_osnet(device="cuda", weight_path=None):
426
+ model = osnet_x1_0(num_classes=1, loss='softmax', pretrained=False)
427
+ weight_path = Path(weight_path) if weight_path else None
428
+ if weight_path and weight_path.exists():
429
+ load_pretrained_weights_osnet(model, str(weight_path))
430
+ model.eval()
431
+ model.to(device)
432
+ return model
433
+
434
+
435
+ def _resolve_player_cls_id(model: YOLO, fallback: int = PLAYER_CLS_ID) -> int:
436
+ names = getattr(model, "names", None)
437
+ if not names:
438
+ names = getattr(getattr(model, "model", None), "names", None)
439
+ if isinstance(names, dict):
440
+ for idx, name in names.items():
441
+ if str(name).lower() in ("player", "players"):
442
+ return int(idx)
443
+ if isinstance(names, list):
444
+ for idx, name in enumerate(names):
445
+ if str(name).lower() in ("player", "players"):
446
+ return int(idx)
447
+ return fallback
448
+
449
+
450
+ # ── HRNet architecture ───────────────────────────────────────────
451
+
452
+ BatchNorm2d = nn.BatchNorm2d
453
+ BN_MOMENTUM = 0.1
454
+
455
+ def conv3x3(in_planes, out_planes, stride=1):
456
+ return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride, padding=1, bias=False)
457
+
458
+ class BasicBlock(nn.Module):
459
+ expansion = 1
460
+ def __init__(self, inplanes, planes, stride=1, downsample=None):
461
+ super().__init__()
462
+ self.conv1 = conv3x3(inplanes, planes, stride)
463
+ self.bn1 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
464
+ self.relu = nn.ReLU(inplace=True)
465
+ self.conv2 = conv3x3(planes, planes)
466
+ self.bn2 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
467
+ self.downsample = downsample
468
+ self.stride = stride
469
+
470
+ def forward(self, x):
471
+ residual = x
472
+ out = self.relu(self.bn1(self.conv1(x)))
473
+ out = self.bn2(self.conv2(out))
474
+ if self.downsample is not None:
475
+ residual = self.downsample(x)
476
+ return self.relu(out + residual)
477
+
478
+ class Bottleneck(nn.Module):
479
+ expansion = 4
480
+ def __init__(self, inplanes, planes, stride=1, downsample=None):
481
+ super().__init__()
482
+ self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
483
+ self.bn1 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
484
+ self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)
485
+ self.bn2 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
486
+ self.conv3 = nn.Conv2d(planes, planes * self.expansion, kernel_size=1, bias=False)
487
+ self.bn3 = BatchNorm2d(planes * self.expansion, momentum=BN_MOMENTUM)
488
+ self.relu = nn.ReLU(inplace=True)
489
+ self.downsample = downsample
490
+ self.stride = stride
491
+
492
+ def forward(self, x):
493
+ residual = x
494
+ out = self.relu(self.bn1(self.conv1(x)))
495
+ out = self.relu(self.bn2(self.conv2(out)))
496
+ out = self.bn3(self.conv3(out))
497
+ if self.downsample is not None:
498
+ residual = self.downsample(x)
499
+ return self.relu(out + residual)
500
+
501
+ blocks_dict = {'BASIC': BasicBlock, 'BOTTLENECK': Bottleneck}
502
+
503
+ class HighResolutionModule(nn.Module):
504
+     def __init__(self, num_branches, blocks, num_blocks, num_inchannels,
+                  num_channels, fuse_method, multi_scale_output=True):
+         super().__init__()
+         self.num_inchannels = num_inchannels
+         self.fuse_method = fuse_method
+         self.num_branches = num_branches
+         self.multi_scale_output = multi_scale_output
+         self.branches = self._make_branches(num_branches, blocks, num_blocks, num_channels)
+         self.fuse_layers = self._make_fuse_layers()
+         self.relu = nn.ReLU(inplace=True)
+
+     def _make_one_branch(self, branch_index, block, num_blocks, num_channels, stride=1):
+         downsample = None
+         if stride != 1 or self.num_inchannels[branch_index] != num_channels[branch_index] * block.expansion:
+             downsample = nn.Sequential(
+                 nn.Conv2d(self.num_inchannels[branch_index], num_channels[branch_index] * block.expansion,
+                           kernel_size=1, stride=stride, bias=False),
+                 BatchNorm2d(num_channels[branch_index] * block.expansion, momentum=BN_MOMENTUM),
+             )
+         layers = [block(self.num_inchannels[branch_index], num_channels[branch_index], stride, downsample)]
+         self.num_inchannels[branch_index] = num_channels[branch_index] * block.expansion
+         for _ in range(1, num_blocks[branch_index]):
+             layers.append(block(self.num_inchannels[branch_index], num_channels[branch_index]))
+         return nn.Sequential(*layers)
+
+     def _make_branches(self, num_branches, block, num_blocks, num_channels):
+         return nn.ModuleList([self._make_one_branch(i, block, num_blocks, num_channels) for i in range(num_branches)])
+
+     def _make_fuse_layers(self):
+         if self.num_branches == 1:
+             return None
+         num_branches = self.num_branches
+         num_inchannels = self.num_inchannels
+         fuse_layers = []
+         for i in range(num_branches if self.multi_scale_output else 1):
+             fuse_layer = []
+             for j in range(num_branches):
+                 if j > i:
+                     # Coarser branch -> 1x1 conv to match channels; the spatial
+                     # upsampling happens in forward() via F.interpolate.
+                     fuse_layer.append(nn.Sequential(
+                         nn.Conv2d(num_inchannels[j], num_inchannels[i], 1, 1, 0, bias=False),
+                         BatchNorm2d(num_inchannels[i], momentum=BN_MOMENTUM)))
+                 elif j == i:
+                     fuse_layer.append(None)
+                 else:
+                     # Finer branch -> chain of stride-2 3x3 convs to downsample.
+                     conv3x3s = []
+                     for k in range(i - j):
+                         if k == i - j - 1:
+                             conv3x3s.append(nn.Sequential(
+                                 nn.Conv2d(num_inchannels[j], num_inchannels[i], 3, 2, 1, bias=False),
+                                 BatchNorm2d(num_inchannels[i], momentum=BN_MOMENTUM)))
+                         else:
+                             conv3x3s.append(nn.Sequential(
+                                 nn.Conv2d(num_inchannels[j], num_inchannels[j], 3, 2, 1, bias=False),
+                                 BatchNorm2d(num_inchannels[j], momentum=BN_MOMENTUM),
+                                 nn.ReLU(inplace=True)))
+                     fuse_layer.append(nn.Sequential(*conv3x3s))
+             fuse_layers.append(nn.ModuleList(fuse_layer))
+         return nn.ModuleList(fuse_layers)
+
+     def get_num_inchannels(self):
+         return self.num_inchannels
+
+     def forward(self, x):
+         if self.num_branches == 1:
+             return [self.branches[0](x[0])]
+         for i in range(self.num_branches):
+             x[i] = self.branches[i](x[i])
+         x_fuse = []
+         for i in range(len(self.fuse_layers)):
+             y = x[0] if i == 0 else self.fuse_layers[i][0](x[0])
+             for j in range(1, self.num_branches):
+                 if i == j:
+                     y = y + x[j]
+                 elif j > i:
+                     y = y + F.interpolate(self.fuse_layers[i][j](x[j]),
+                                           size=[x[i].shape[2], x[i].shape[3]],
+                                           mode='bilinear', align_corners=False)
+                 else:
+                     y = y + self.fuse_layers[i][j](x[j])
+             x_fuse.append(self.relu(y))
+         return x_fuse
+
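At the resolution level, the fuse rule above (channel-matching conv plus upsample for coarser branches, strided convs for finer ones) reduces to resize-then-add. A minimal NumPy sketch of the additive fusion with nearest-neighbour upsampling, purely illustrative (the real module learns the convolutions; `nearest_upsample` and `fuse_to_highres` are hypothetical helper names):

```python
import numpy as np

def nearest_upsample(x, factor):
    # Repeat rows and columns: (H, W) -> (H*factor, W*factor).
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def fuse_to_highres(branches):
    # branches[0] is the highest-resolution map; branch k is 2**k times smaller.
    fused = branches[0].copy()
    for k, b in enumerate(branches[1:], start=1):
        fused += nearest_upsample(b, 2 ** k)
    return fused

high = np.ones((8, 8))
mid = np.full((4, 4), 2.0)
low = np.full((2, 2), 3.0)
out = fuse_to_highres([high, mid, low])  # every cell receives 1 + 2 + 3
```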
+ class HighResolutionNet(nn.Module):
+     def __init__(self, config, lines=False, **kwargs):
+         super().__init__()
+         self.inplanes = 64
+         self.lines = lines
+         extra = config['MODEL']['EXTRA']
+         self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1, bias=False)
+         self.bn1 = BatchNorm2d(64, momentum=BN_MOMENTUM)
+         self.conv2 = nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1, bias=False)
+         self.bn2 = BatchNorm2d(64, momentum=BN_MOMENTUM)
+         self.relu = nn.ReLU(inplace=True)
+         self.layer1 = self._make_layer(Bottleneck, 64, 64, 4)
+
+         self.stage2_cfg = extra['STAGE2']
+         num_channels = self.stage2_cfg['NUM_CHANNELS']
+         block = blocks_dict[self.stage2_cfg['BLOCK']]
+         num_channels = [num_channels[i] * block.expansion for i in range(len(num_channels))]
+         self.transition1 = self._make_transition_layer([256], num_channels)
+         self.stage2, pre_stage_channels = self._make_stage(self.stage2_cfg, num_channels)
+
+         self.stage3_cfg = extra['STAGE3']
+         num_channels = self.stage3_cfg['NUM_CHANNELS']
+         block = blocks_dict[self.stage3_cfg['BLOCK']]
+         num_channels = [num_channels[i] * block.expansion for i in range(len(num_channels))]
+         self.transition2 = self._make_transition_layer(pre_stage_channels, num_channels)
+         self.stage3, pre_stage_channels = self._make_stage(self.stage3_cfg, num_channels)
+
+         self.stage4_cfg = extra['STAGE4']
+         num_channels = self.stage4_cfg['NUM_CHANNELS']
+         block = blocks_dict[self.stage4_cfg['BLOCK']]
+         num_channels = [num_channels[i] * block.expansion for i in range(len(num_channels))]
+         self.transition3 = self._make_transition_layer(pre_stage_channels, num_channels)
+         self.stage4, pre_stage_channels = self._make_stage(self.stage4_cfg, num_channels, multi_scale_output=True)
+
+         self.upsample = nn.Upsample(scale_factor=2, mode='nearest')
+         final_inp_channels = sum(pre_stage_channels) + self.inplanes
+         # The nested Sequential is intentional: flattening it would change the
+         # state_dict keys and break loading of the pretrained weights.
+         self.head = nn.Sequential(nn.Sequential(
+             nn.Conv2d(final_inp_channels, final_inp_channels, kernel_size=1),
+             BatchNorm2d(final_inp_channels, momentum=BN_MOMENTUM),
+             nn.ReLU(inplace=True),
+             nn.Conv2d(final_inp_channels, config['MODEL']['NUM_JOINTS'], kernel_size=extra['FINAL_CONV_KERNEL']),
+             nn.Softmax(dim=1) if not self.lines else nn.Sigmoid()))
+
+     def _make_head(self, x, x_skip):
+         x = self.upsample(x)
+         x = torch.cat([x, x_skip], dim=1)
+         return self.head(x)
+
+     def _make_transition_layer(self, num_channels_pre_layer, num_channels_cur_layer):
+         num_branches_cur = len(num_channels_cur_layer)
+         num_branches_pre = len(num_channels_pre_layer)
+         transition_layers = []
+         for i in range(num_branches_cur):
+             if i < num_branches_pre:
+                 if num_channels_cur_layer[i] != num_channels_pre_layer[i]:
+                     transition_layers.append(nn.Sequential(
+                         nn.Conv2d(num_channels_pre_layer[i], num_channels_cur_layer[i], 3, 1, 1, bias=False),
+                         BatchNorm2d(num_channels_cur_layer[i], momentum=BN_MOMENTUM),
+                         nn.ReLU(inplace=True)))
+                 else:
+                     transition_layers.append(None)
+             else:
+                 # New, lower-resolution branch: downsample from the last existing branch.
+                 conv3x3s = []
+                 for j in range(i + 1 - num_branches_pre):
+                     inchannels = num_channels_pre_layer[-1]
+                     outchannels = num_channels_cur_layer[i] if j == i - num_branches_pre else inchannels
+                     conv3x3s.append(nn.Sequential(
+                         nn.Conv2d(inchannels, outchannels, 3, 2, 1, bias=False),
+                         BatchNorm2d(outchannels, momentum=BN_MOMENTUM),
+                         nn.ReLU(inplace=True)))
+                 transition_layers.append(nn.Sequential(*conv3x3s))
+         return nn.ModuleList(transition_layers)
+
+     def _make_layer(self, block, inplanes, planes, blocks, stride=1):
+         downsample = None
+         if stride != 1 or inplanes != planes * block.expansion:
+             downsample = nn.Sequential(
+                 nn.Conv2d(inplanes, planes * block.expansion, kernel_size=1, stride=stride, bias=False),
+                 BatchNorm2d(planes * block.expansion, momentum=BN_MOMENTUM),
+             )
+         layers = [block(inplanes, planes, stride, downsample)]
+         inplanes = planes * block.expansion
+         for _ in range(1, blocks):
+             layers.append(block(inplanes, planes))
+         return nn.Sequential(*layers)
+
+     def _make_stage(self, layer_config, num_inchannels, multi_scale_output=True):
+         num_modules = layer_config['NUM_MODULES']
+         num_branches = layer_config['NUM_BRANCHES']
+         num_blocks = layer_config['NUM_BLOCKS']
+         num_channels = layer_config['NUM_CHANNELS']
+         block = blocks_dict[layer_config['BLOCK']]
+         fuse_method = layer_config['FUSE_METHOD']
+         modules = []
+         for i in range(num_modules):
+             # Only the last module of a single-scale stage drops multi-scale output.
+             reset_multi_scale_output = multi_scale_output or i < num_modules - 1
+             modules.append(HighResolutionModule(
+                 num_branches, block, num_blocks, num_inchannels,
+                 num_channels, fuse_method, reset_multi_scale_output))
+             num_inchannels = modules[-1].get_num_inchannels()
+         return nn.Sequential(*modules), num_inchannels
+
+     def forward(self, x):
+         x = self.conv1(x)
+         x_skip = x.clone()
+         x = self.relu(self.bn1(x))
+         x = self.relu(self.bn2(self.conv2(x)))
+         x = self.layer1(x)
+
+         x_list = []
+         for i in range(self.stage2_cfg['NUM_BRANCHES']):
+             x_list.append(self.transition1[i](x) if self.transition1[i] is not None else x)
+         y_list = self.stage2(x_list)
+
+         x_list = []
+         for i in range(self.stage3_cfg['NUM_BRANCHES']):
+             x_list.append(self.transition2[i](y_list[-1]) if self.transition2[i] is not None else y_list[i])
+         y_list = self.stage3(x_list)
+
+         x_list = []
+         for i in range(self.stage4_cfg['NUM_BRANCHES']):
+             x_list.append(self.transition3[i](y_list[-1]) if self.transition3[i] is not None else y_list[i])
+         x = self.stage4(x_list)
+
+         # Upsample all branches to the highest resolution and concatenate.
+         height, width = x[0].size(2), x[0].size(3)
+         x1 = F.interpolate(x[1], size=(height, width), mode='bilinear', align_corners=False)
+         x2 = F.interpolate(x[2], size=(height, width), mode='bilinear', align_corners=False)
+         x3 = F.interpolate(x[3], size=(height, width), mode='bilinear', align_corners=False)
+         x = torch.cat([x[0], x1, x2, x3], 1)
+         return self._make_head(x, x_skip)
+
+     def init_weights(self, pretrained=''):
+         for m in self.modules():
+             if isinstance(m, nn.Conv2d):
+                 nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
+             elif isinstance(m, nn.BatchNorm2d):
+                 nn.init.constant_(m.weight, 1)
+                 nn.init.constant_(m.bias, 0)
+         if pretrained:
+             if os.path.isfile(pretrained):
+                 pretrained_dict = torch.load(pretrained, map_location='cpu')
+                 model_dict = self.state_dict()
+                 pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict}
+                 model_dict.update(pretrained_dict)
+                 self.load_state_dict(model_dict)
+             else:
+                 sys.exit(f'Weights {pretrained} not found.')
+
+
+ def get_cls_net(config, pretrained='', **kwargs):
+     model = HighResolutionNet(config, **kwargs)
+     model.init_weights(pretrained)
+     return model
+
+
+ # ── Keypoint mapping & inference helpers ─────────────────────────
+
+ map_keypoints = {
+     1: 1, 2: 14, 3: 25, 4: 2, 5: 10, 6: 18, 7: 26, 8: 3, 9: 7, 10: 23,
+     11: 27, 20: 4, 21: 8, 22: 24, 23: 28, 24: 5, 25: 13, 26: 21, 27: 29,
+     28: 6, 29: 17, 30: 30, 31: 11, 32: 15, 33: 19, 34: 12, 35: 16, 36: 20,
+     45: 9, 50: 31, 52: 32, 57: 22
+ }
+
+ # Template keypoints for homography refinement (new-5 style)
+ TEMPLATE_F0: List[Tuple[float, float]] = [
+     (5, 5), (5, 140), (5, 250), (5, 430), (5, 540), (5, 675), (55, 250), (55, 430),
+     (110, 340), (165, 140), (165, 270), (165, 410), (165, 540), (527, 5), (527, 253),
+     (527, 433), (527, 675), (888, 140), (888, 270), (888, 410), (888, 540), (940, 340),
+     (998, 250), (998, 430), (1045, 5), (1045, 140), (1045, 250), (1045, 430), (1045, 540),
+     (1045, 675), (435, 340), (615, 340),
+ ]
+ TEMPLATE_F1: List[Tuple[float, float]] = [
+     (2.5, 2.5), (2.5, 139.5), (2.5, 249.5), (2.5, 430.5), (2.5, 540.5), (2.5, 678),
+     (54.5, 249.5), (54.5, 430.5), (110.5, 340.5), (164.5, 139.5), (164.5, 269), (164.5, 411),
+     (164.5, 540.5), (525, 2.5), (525, 249.5), (525, 430.5), (525, 678), (886.5, 139.5),
+     (886.5, 269), (886.5, 411), (886.5, 540.5), (940.5, 340.5), (998, 249.5), (998, 430.5),
+     (1048, 2.5), (1048, 139.5), (1048, 249.5), (1048, 430.5), (1048, 540.5), (1048, 678),
+     (434.5, 340), (615.5, 340),
+ ]
+ HOMOGRAPHY_FILL_ONLY_VALID = True
+ KP_THRESHOLD = 0.2  # new-5 style (was 0.3)
+ HRNET_BATCH_SIZE = 4  # larger batch = faster (if GPU mem allows)
+
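`map_keypoints` re-keys raw HRNet channel indices (1-based) into the 32-slot template ordering used by `TEMPLATE_F0`/`TEMPLATE_F1`; channels with no entry are dropped. A small self-contained illustration of that remap step (the dict below is a subset of the table above):

```python
# Subset of map_keypoints above: raw channel index -> template slot index.
map_subset = {1: 1, 2: 14, 45: 9, 57: 22}

def apply_keypoint_mapping(kp_dict, mapping):
    # Drop detections whose channel has no template slot; re-key the rest.
    return {mapping[k]: v for k, v in kp_dict.items() if k in mapping}

detected = {1: {'x': 0.1, 'y': 0.2, 'p': 0.9},
            45: {'x': 0.5, 'y': 0.5, 'p': 0.8},
            99: {'x': 0.0, 'y': 0.0, 'p': 0.3}}  # channel 99 has no mapping -> dropped
remapped = apply_keypoint_mapping(detected, map_subset)
```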
+ def _preprocess_batch(frames):
+     target_h, target_w = 540, 960
+     batch = []
+     for frame in frames:
+         img = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
+         img = cv2.resize(img, (target_w, target_h)).astype(np.float32) / 255.0
+         batch.append(np.transpose(img, (2, 0, 1)))
+     return torch.from_numpy(np.stack(batch)).float()
+
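`_preprocess_batch` is the standard BGR→RGB, resize, scale-to-[0, 1], HWC→CHW pipeline. The layout and normalisation part can be sketched without OpenCV or torch (resize omitted; shapes match the 960×540 inference resolution used above):

```python
import numpy as np

def to_chw_batch(frames):
    # frames: list of HxWx3 uint8 arrays -> float32 batch (N, 3, H, W) in [0, 1].
    batch = [np.transpose(f.astype(np.float32) / 255.0, (2, 0, 1)) for f in frames]
    return np.stack(batch)

frames = [np.full((540, 960, 3), 255, dtype=np.uint8) for _ in range(2)]
batch = to_chw_batch(frames)
```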
+ def _extract_keypoints(heatmap: torch.Tensor, scale: int = 2):
+     b, c, h, w = heatmap.shape
+     # Non-maximum suppression: a pixel survives only if it equals the max of
+     # its 3x3 neighbourhood; then take the best-scoring cell per channel.
+     max_pooled = F.max_pool2d(heatmap, 3, stride=1, padding=1)
+     local_maxima = (max_pooled == heatmap)
+     masked = heatmap * local_maxima
+     flat = masked.view(b, c, -1)
+     scores, indices = torch.topk(flat, 1, dim=-1, sorted=False)
+     y_coords = torch.div(indices, w, rounding_mode="floor") * scale
+     x_coords = (indices % w) * scale
+     return torch.stack([x_coords.float(), y_coords.float(), scores], dim=-1)
+
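The max-pool trick above keeps only pixels that equal the maximum of their 3×3 neighbourhood, i.e. local maxima, before picking the top-scoring cell per channel. The same idea in plain NumPy for a single-channel heatmap (illustrative; `argmax_peak` is a hypothetical name):

```python
import numpy as np

def argmax_peak(heatmap, scale=2):
    # Keep only 3x3 local maxima, then return the best (x, y, score) — the
    # NumPy analogue of the max_pool2d + topk combination above.
    h, w = heatmap.shape
    padded = np.pad(heatmap, 1, mode='constant', constant_values=-np.inf)
    windows = np.max(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)], axis=0)
    masked = np.where(windows == heatmap, heatmap, 0.0)
    idx = int(masked.argmax())
    y, x = divmod(idx, w)
    return x * scale, y * scale, float(masked[y, x])

hm = np.zeros((4, 6))
hm[1, 4] = 0.9  # single peak at row 1, column 4
x, y, score = argmax_peak(hm)  # coordinates are doubled by scale=2
```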
+ def _process_keypoints(kp_coords, threshold, w, h, batch_size):
+     kp_np = kp_coords.cpu().numpy()
+     results = []
+     for b_idx in range(batch_size):
+         kp_dict = {}
+         valid = np.where(kp_np[b_idx, :, 0, 2] > threshold)[0]
+         for ch_idx in valid:
+             kp_dict[ch_idx + 1] = {
+                 'x': float(kp_np[b_idx, ch_idx, 0, 0]) / w,
+                 'y': float(kp_np[b_idx, ch_idx, 0, 1]) / h,
+                 'p': float(kp_np[b_idx, ch_idx, 0, 2]),
+             }
+         results.append(kp_dict)
+     return results
+
+
+ def _run_hrnet_batch(frames, model, threshold, batch_size=8):
+     if not frames or model is None:
+         return []
+     device = next(model.parameters()).device
+     results = []
+     for i in range(0, len(frames), batch_size):
+         chunk = frames[i:i + batch_size]
+         batch = _preprocess_batch(chunk).to(device)
+         with torch.no_grad():
+             heatmaps = model(batch)
+         # Drop the final (non-keypoint) heatmap channel before peak extraction.
+         kp_coords = _extract_keypoints(heatmaps[:, :-1, :, :], scale=2)
+         batch_kps = _process_keypoints(kp_coords, threshold, 960, 540, len(chunk))
+         results.extend(batch_kps)
+         del heatmaps, kp_coords, batch
+         gc.collect()
+     return results
+
+
+ def _apply_keypoint_mapping(kp_dict):
+     return {map_keypoints[k]: v for k, v in kp_dict.items() if k in map_keypoints}
+
+
+ def _normalize_keypoints(kp_results, frames, n_keypoints):
+     keypoints = []
+     max_frames = min(len(kp_results), len(frames))
+     for i in range(max_frames):
+         kp_dict = kp_results[i]
+         h, w = frames[i].shape[:2]
+         frame_kps = []
+         for idx in range(n_keypoints):
+             kp_idx = idx + 1
+             x, y = 0, 0
+             if kp_idx in kp_dict:
+                 d = kp_dict[kp_idx]
+                 if isinstance(d, dict) and 'x' in d:
+                     x = int(d['x'] * w)
+                     y = int(d['y'] * h)
+             frame_kps.append((x, y))
+         keypoints.append(frame_kps)
+     return keypoints
+
+
+ def _fix_keypoints(kps: list, n: int) -> list:
+     # Pad or truncate to exactly n points, then apply heuristic swaps for
+     # template slots the detector commonly confuses; (0, 0) marks "missing".
+     if len(kps) < n:
+         kps += [(0, 0)] * (n - len(kps))
+     elif len(kps) > n:
+         kps = kps[:n]
+
+     if kps[2] != (0, 0) and kps[4] != (0, 0) and kps[3] == (0, 0):
+         kps[3] = kps[4]; kps[4] = (0, 0)
+     if kps[0] != (0, 0) and kps[4] != (0, 0) and kps[1] == (0, 0):
+         kps[1] = kps[4]; kps[4] = (0, 0)
+     if kps[2] != (0, 0) and kps[3] != (0, 0) and kps[1] == (0, 0) and kps[3][0] > kps[2][0]:
+         kps[1] = kps[3]; kps[3] = (0, 0)
+     if kps[28] != (0, 0) and kps[25] == (0, 0) and kps[26] != (0, 0) and kps[26][0] > kps[28][0]:
+         kps[25] = kps[28]; kps[28] = (0, 0)
+     if kps[24] != (0, 0) and kps[28] != (0, 0) and kps[25] == (0, 0):
+         kps[25] = kps[28]; kps[28] = (0, 0)
+     if kps[24] != (0, 0) and kps[27] != (0, 0) and kps[26] == (0, 0):
+         kps[26] = kps[27]; kps[27] = (0, 0)
+     if kps[28] != (0, 0) and kps[23] == (0, 0) and kps[20] != (0, 0) and kps[20][1] > kps[23][1]:
+         kps[23] = kps[20]; kps[20] = (0, 0)
+     return kps
+
+
+ def _keypoints_to_float(keypoints: list) -> List[List[float]]:
+     """Convert keypoints to [[x, y], ...] float format for homography."""
+     return [[float(x), float(y)] for x, y in keypoints]
+
+
+ def _keypoints_to_int(keypoints: list) -> List[Tuple[int, int]]:
+     """Convert keypoints to [(x, y), ...] integer format."""
+     return [(int(round(float(kp[0]))), int(round(float(kp[1])))) for kp in keypoints]
+
+
+ def _apply_homography_refinement(
+     keypoints: List[List[float]],
+     frame: np.ndarray,
+     n_keypoints: int,
+ ) -> List[List[float]]:
+     """Refine keypoints using homography from template to frame (new-5 style)."""
+     if n_keypoints != 32 or len(TEMPLATE_F0) != 32 or len(TEMPLATE_F1) != 32:
+         return keypoints
+     frame_height, frame_width = frame.shape[:2]
+     valid_src: List[Tuple[float, float]] = []
+     valid_dst: List[Tuple[float, float]] = []
+     valid_indices: List[int] = []
+     for kp_idx, kp in enumerate(keypoints):
+         if kp and len(kp) >= 2:
+             x, y = float(kp[0]), float(kp[1])
+             if not (abs(x) < 1e-6 and abs(y) < 1e-6) and 0 <= x < frame_width and 0 <= y < frame_height:
+                 valid_src.append(TEMPLATE_F1[kp_idx])
+                 valid_dst.append((x, y))
+                 valid_indices.append(kp_idx)
+     if len(valid_src) < 4:
+         # A homography needs at least four correspondences.
+         return keypoints
+     src_pts = np.array(valid_src, dtype=np.float32)
+     dst_pts = np.array(valid_dst, dtype=np.float32)
+     H, _ = cv2.findHomography(src_pts, dst_pts)
+     if H is None:
+         return keypoints
+     all_template_points = np.array(TEMPLATE_F0, dtype=np.float32).reshape(-1, 1, 2)
+     adjusted_points = cv2.perspectiveTransform(all_template_points, H)
+     adjusted_points = adjusted_points.reshape(-1, 2)
+     adj_x = adjusted_points[:32, 0]
+     adj_y = adjusted_points[:32, 1]
+     valid_mask = (adj_x >= 0) & (adj_y >= 0) & (adj_x < frame_width) & (adj_y < frame_height)
+     valid_indices_set = set(valid_indices)
+     adjusted_kps: List[List[float]] = [[0.0, 0.0] for _ in range(32)]
+     for i in np.where(valid_mask)[0]:
+         if not HOMOGRAPHY_FILL_ONLY_VALID or i in valid_indices_set:
+             adjusted_kps[i] = [float(adj_x[i]), float(adj_y[i])]
+     return adjusted_kps
+
+
+ # ── Pydantic models ───────────────────────────────────────────────────────────
+
+ # Team assignment: 6 = team 1, 7 = team 2
+ TEAM_1_ID = 6
+ TEAM_2_ID = 7
+ PLAYER_CLS_ID = 2
+
+
+ class BoundingBox(BaseModel):
+     x1: int
+     y1: int
+     x2: int
+     y2: int
+     cls_id: int
+     conf: float
+     track_id: Optional[int] = None
+
+
+ class TVFrameResult(BaseModel):
+     frame_id: int
+     boxes: list[BoundingBox]
+     keypoints: List[Tuple[float, float]]  # [(x, y), ...] float coordinates
+
+
+ # ── Miner ─────────────────────────────────────────────────────────────────────
+
+ class Miner:
+     def __init__(self, path_hf_repo: Path) -> None:
+         self.path_hf_repo = path_hf_repo
+         self.is_start = False
+         self._executor = ThreadPoolExecutor(max_workers=2)
+
+         global _OSNET_MODEL, osnet_weight_path
+         device = "cuda" if torch.cuda.is_available() else "cpu"
+         self.device = device
+
+         # BBox model
+         bbox_file = "player_detect.pt"
+         self.bbox_model = YOLO(Path(bbox_file) if Path(bbox_file).exists() else path_hf_repo / bbox_file)
+         print("✅ BBox Model Loaded")
+         global PLAYER_CLS_ID
+         PLAYER_CLS_ID = _resolve_player_cls_id(self.bbox_model, PLAYER_CLS_ID)
+
+         # OSNet team classifier
+         osnet_weight_path = path_hf_repo / "osnet_model.pth.tar-100"
+         if osnet_weight_path.exists():
+             _OSNET_MODEL = load_osnet(device, osnet_weight_path)
+             print("✅ Team Classifier Loaded (OSNet)")
+         else:
+             _OSNET_MODEL = None
+             print(f"⚠️ OSNet weights not found at {osnet_weight_path}. Using HSV fallback.")
+
+         # Keypoints model: HRNet
+         kp_config_file = "hrnetv2_w48.yaml"
+         kp_weights_file = "keypoint_detect.pt"
+         config_path = Path(kp_config_file) if Path(kp_config_file).exists() else path_hf_repo / kp_config_file
+         weights_path = Path(kp_weights_file) if Path(kp_weights_file).exists() else path_hf_repo / kp_weights_file
+         with open(config_path, 'r') as f:
+             cfg = yaml.safe_load(f)
+         hrnet = get_cls_net(cfg)
+         state = torch.load(weights_path, map_location=device, weights_only=False)
+         hrnet.load_state_dict(state)
+         hrnet.to(device).eval()
+         self.keypoints_model = hrnet
+         print("✅ HRNet Keypoints Model Loaded")
+
+     def __repr__(self) -> str:
+         return (
+             f"BBox Model: {type(self.bbox_model).__name__}\n"
+             f"Keypoints Model: {type(self.keypoints_model).__name__}\n"
+             f"Team Clustering: OSNet + KMeans"
+         )
+
+     def _bbox_task(self, images: list[ndarray]) -> list[list[BoundingBox]]:
+         """Batch YOLO inference + team assignment."""
+         if not images:
+             return []
+         if self.bbox_model is None:
+             return [[] for _ in images]
+         try:
+             bbox_results = self.bbox_model(images, conf=0.2, iou=0.5, agnostic_nms=True, verbose=False)
+         except Exception:
+             return [[] for _ in images]
+         bboxes_by_frame: Dict[int, List[BoundingBox]] = {}
+         track_id = 0
+         for frame_idx, bbox_result in enumerate(bbox_results):
+             boxes = []
+             if bbox_result and bbox_result.boxes is not None and len(bbox_result.boxes) > 0:
+                 for box in bbox_result.boxes:
+                     x1, y1, x2, y2 = map(int, box.xyxy[0].cpu().numpy())
+                     conf = float(box.conf.cpu().numpy()[0])
+                     cls_id = int(box.cls.cpu().numpy()[0])
+                     tid = None
+                     if cls_id == PLAYER_CLS_ID:
+                         track_id += 1
+                         tid = track_id
+                     boxes.append(BoundingBox(x1=x1, y1=y1, x2=x2, y2=y2, cls_id=cls_id, conf=conf, track_id=tid))
+             bboxes_by_frame[frame_idx] = boxes
+
+         try:
+             _classify_teams_batch(images, bboxes_by_frame, self.device)
+         except Exception:
+             pass
+         return [bboxes_by_frame[i] for i in range(len(images))]
+
+     def _keypoint_task(self, images: list[ndarray], n_keypoints: int) -> list[list]:
+         """HRNet keypoints + homography refinement."""
+         if not images:
+             return []
+         if self.keypoints_model is None:
+             return [[(0, 0)] * n_keypoints for _ in images]
+         try:
+             raw_kps = _run_hrnet_batch(images, self.keypoints_model, KP_THRESHOLD, batch_size=HRNET_BATCH_SIZE)
+         except Exception:
+             return [[(0, 0)] * n_keypoints for _ in images]
+         raw_kps = [_apply_keypoint_mapping(kp) for kp in raw_kps] if raw_kps else []
+         keypoints = _normalize_keypoints(raw_kps, images, n_keypoints) if raw_kps else [[(0, 0)] * n_keypoints for _ in images]
+         keypoints = [_fix_keypoints(kps, n_keypoints) for kps in keypoints]
+         keypoints = [_keypoints_to_float(kps) for kps in keypoints]
+         if n_keypoints == 32 and len(TEMPLATE_F0) == 32 and len(TEMPLATE_F1) == 32:
+             for idx in range(len(images)):
+                 try:
+                     keypoints[idx] = _apply_homography_refinement(keypoints[idx], images[idx], n_keypoints)
+                 except Exception:
+                     pass
+         # keypoints = [_keypoints_to_int(kps) for kps in keypoints]
+         return keypoints
+
+     def predict_batch(
+         self,
+         batch_images: list[ndarray],
+         offset: int,
+         n_keypoints: int,
+     ) -> list[TVFrameResult]:
+         if not self.is_start:
+             self.is_start = True
+
+         images = list(batch_images)
+         if offset == 0:
+             gc.collect()
+             if torch.cuda.is_available():
+                 torch.cuda.empty_cache()
+
+         # Run bbox (batched YOLO) and keypoints in parallel
+         future_bbox = self._executor.submit(self._bbox_task, images)
+         future_kp = self._executor.submit(self._keypoint_task, images, n_keypoints)
+         bbox_per_frame = future_bbox.result()
+         keypoints = future_kp.result()
+
+         return [
+             TVFrameResult(frame_id=offset + i, boxes=bbox_per_frame[i], keypoints=keypoints[i])
+             for i in range(len(images))
+         ]
osnet_model.pth.tar-100 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:45e1de9d329b534c16f450d99a898c516f8b237dcea471053242c2d4c76b4ace
+ size 26846063
player_detect.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:934be460f78c594cc98078027f280c23385c9897e3e761e438559b3193233b46
+ size 19209626