thxplz committed · Commit df041e6 (verified) · 1 Parent(s): 6247608

Update README.md

Files changed (1)
  1. README.md +186 -0
README.md CHANGED
@@ -1,3 +1,189 @@
1
  ---
2
  license: apache-2.0
3
  ---
4
+
5
+ # Hybrid-SOV
6
+
7
+ Bridging Detection Architectures With Foundation Models: A Unified Framework for Human-Object Interaction Detection.
8
+
9
+ [[Paper]](https://ieeexplore.ieee.org/document/11367687)
10
+
11
+ ![Hybrid-SOV-VLA](https://cdn-uploads.huggingface.co/production/uploads/63119ce2fb65b9a3e2f75e3c/86c5_6jsowjvzTHFr67KM.jpeg)
12
+
13
+ ## Requirements
14
+
15
+ ```bash
16
+ conda create -n hybrid-sov python=3.11 -y
17
+ conda activate hybrid-sov
18
+ pip install uv
19
+
20
+ # Install PyTorch/torchvision for your CUDA version first.
21
+ uv pip install torch torchvision
22
+
23
+ uv pip install numpy scipy pillow tqdm pyyaml pycocotools tabulate addict yapf loguru
24
+ uv pip install timm transformers fairscale omegaconf wandb
25
+ uv pip install git+https://github.com/openai/CLIP.git
26
+ uv pip install -U huggingface_hub
27
+ ```
28
+
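Before moving on, it can help to confirm that the packages installed above are visible to the active interpreter. The following is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and the module list only maps the pip package names above to their usual import names (`PIL` for pillow, `yaml` for pyyaml, `clip` for the openai/CLIP repository).

```python
import importlib.util

# Import names for the packages installed above. Assumption: the usual
# import names apply (pillow -> PIL, pyyaml -> yaml, openai/CLIP -> clip).
REQUIRED_MODULES = [
    "torch", "torchvision", "numpy", "scipy", "PIL", "yaml",
    "pycocotools", "timm", "transformers", "clip", "huggingface_hub",
]


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only looks up module specs and does not import anything, so it will not detect a PyTorch build that mismatches the local CUDA version.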
29
+ ## Dataset Preparation
30
+
31
+ ### HICO-DET
32
+
33
+ Follow the HICO-DET dataset preparation steps in [GGNet](https://github.com/SherlockHolmes221/GGNet). See also the [QAHOI](https://github.com/cjw2021/QAHOI) README.
34
+
35
+ After preparation, the `data/hico_det` folder should look like:
36
+
37
+ ```bash
38
+ data
39
+  +-- hico_det
40
+  |   +-- images
41
+  |   |   +-- test2015
42
+  |   |   +-- train2015
43
+  |   +-- annotations
44
+  |       +-- anno_list.json
45
+  |       +-- corre_hico.npy
46
+  |       +-- file_name_to_obj_cat.json
47
+  |       +-- hoi_id_to_num.json
48
+  |       +-- hoi_list_new.json
49
+  |       +-- test_hico.json
50
+  |       +-- trainval_hico.json
51
+ ```
52
+
53
+ ### V-COCO
54
+
55
+ Follow the installation instructions in the [V-COCO](https://github.com/s-gupta/v-coco) repository.
56
+
57
+ For evaluation, please put `vcoco_test.ids` and `vcoco_test.json` into the `data/v-coco/data` folder.
58
+
59
+ After preparation, the `data/v-coco` folder should look like:
60
+
61
+ ```bash
62
+ data
63
+  +-- v-coco
64
+  |   +-- prior.pickle
65
+  |   +-- images
66
+  |   |   +-- train2014
67
+  |   |   +-- val2014
68
+  |   +-- data
69
+  |   |   +-- instances_vcoco_all_2014.json
70
+  |   |   +-- vcoco_test.ids
71
+  |   |   +-- vcoco_test.json
72
+  |   +-- annotations
73
+  |       +-- corre_vcoco.npy
74
+  |       +-- test_vcoco.json
75
+  |       +-- trainval_vcoco.json
76
+ ```
77
+
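The two directory layouts above can be checked mechanically before launching training or evaluation. This is a rough sketch rather than a script shipped with the repository: `missing_entries` is a hypothetical helper, and the path lists assume that the annotation files sit inside each dataset's `annotations` folder.

```python
from pathlib import Path

# Relative paths taken from the directory trees above. Assumption: the
# annotation files are nested inside each dataset's "annotations" folder.
HICO_DET_LAYOUT = [
    "hico_det/images/test2015",
    "hico_det/images/train2015",
    "hico_det/annotations/anno_list.json",
    "hico_det/annotations/corre_hico.npy",
    "hico_det/annotations/file_name_to_obj_cat.json",
    "hico_det/annotations/hoi_id_to_num.json",
    "hico_det/annotations/hoi_list_new.json",
    "hico_det/annotations/test_hico.json",
    "hico_det/annotations/trainval_hico.json",
]
VCOCO_LAYOUT = [
    "v-coco/prior.pickle",
    "v-coco/images/train2014",
    "v-coco/images/val2014",
    "v-coco/data/instances_vcoco_all_2014.json",
    "v-coco/data/vcoco_test.ids",
    "v-coco/data/vcoco_test.json",
    "v-coco/annotations/corre_vcoco.npy",
    "v-coco/annotations/test_vcoco.json",
    "v-coco/annotations/trainval_vcoco.json",
]


def missing_entries(data_root, layout):
    """Return the entries of `layout` that do not exist under `data_root`."""
    root = Path(data_root)
    return [rel for rel in layout if not (root / rel).exists()]


if __name__ == "__main__":
    for name, layout in [("HICO-DET", HICO_DET_LAYOUT), ("V-COCO", VCOCO_LAYOUT)]:
        missing = missing_entries("data", layout)
        status = "OK" if not missing else "missing " + ", ".join(missing)
        print(f"{name}: {status}")
```

The check only tests for existence; it does not validate the contents of the annotation files or count the images.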
78
+ ## Model Weights
79
+
80
+ The checkpoint files are under the `params/` folder in the Hugging Face repository.
81
+
82
+ Download the released weights into the local `params` folder:
83
+
84
+ ```bash
85
+ huggingface-cli download thxplz/Hybrid-SOV \
86
+ --include "params/*" \
87
+ --local-dir .
88
+ ```
89
+
90
+ Expected files include:
91
+
92
+ ```bash
93
+ params
94
+ +-- hico_det_hybrid-sov-r50.pth
95
+ +-- hico_det_hybrid-sov-vla-r50.pth
96
+ +-- rtdetr_r50vd_6x_coco_from_paddle_converted_hico.pth
97
+ +-- rtdetr_r50vd_6x_coco_from_paddle_converted_vcoco.pth
98
+ +-- vcoco-hybrid-sov-vla-r50.pth
99
+ ```
100
+
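The download step and the file list above can also be reproduced from Python. The sketch below uses `snapshot_download` from `huggingface_hub` with the same repository and include pattern as the CLI command; the function names `download_params` and `missing_checkpoints` are hypothetical helpers, not part of the repository.

```python
from pathlib import Path

REPO_ID = "thxplz/Hybrid-SOV"

# File names copied from the "Expected files" list above.
EXPECTED_CHECKPOINTS = [
    "hico_det_hybrid-sov-r50.pth",
    "hico_det_hybrid-sov-vla-r50.pth",
    "rtdetr_r50vd_6x_coco_from_paddle_converted_hico.pth",
    "rtdetr_r50vd_6x_coco_from_paddle_converted_vcoco.pth",
    "vcoco-hybrid-sov-vla-r50.pth",
]


def download_params(local_dir="."):
    """Download everything under params/ from the Hub into `local_dir`."""
    # Imported lazily so the file check below works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=["params/*"],
        local_dir=local_dir,
    )


def missing_checkpoints(local_dir="."):
    """Return the expected checkpoint files not found in `local_dir`/params."""
    params_dir = Path(local_dir) / "params"
    return [name for name in EXPECTED_CHECKPOINTS if not (params_dir / name).is_file()]
```

Calling `download_params()` and then `missing_checkpoints()` from the repository root should return an empty list once the download has finished. Nothing runs on import, so the download only starts when requested.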
101
+ ## Evaluation
102
+
103
+ ### HICO-DET
104
+
105
+ | Model | Full (Default) | Rare (Default) | Non-Rare (Default) | Full (Known Object) | Rare (Known Object) | Non-Rare (Known Object) | ckpt |
106
+ |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
107
+ | Hybrid-SOV-R50 | 35.58 | 31.65 | 36.76 | 39.04 | 35.36 | 40.13 | [checkpoint](https://huggingface.co/thxplz/Hybrid-SOV/blob/main/params/hico_det_hybrid-sov-r50.pth) |
108
+ | Hybrid-SOV-VLA-R50 | 43.10 | 43.04 | 43.12 | 46.02 | 46.14 | 45.98 | [checkpoint](https://huggingface.co/thxplz/Hybrid-SOV/blob/main/params/hico_det_hybrid-sov-vla-r50.pth) |
109
+ | Hybrid-SOV-VLA-DINOv3-CNX-L | 46.89 | 47.67 | 46.66 | 49.16 | 50.17 | 48.86 | [checkpoint](https://huggingface.co/thxplz/Hybrid-SOV/blob/main/params/hico_det_hybrid-sov-vla-dinov3-cnx-l.pth) |
110
+
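For readers unfamiliar with the Full / Rare / Non-Rare columns: in the standard HICO-DET protocol, an interaction category counts as rare when it has fewer than 10 training instances, and each column is the mean AP over the corresponding subset. The sketch below illustrates that split on made-up numbers. It is not the evaluation code of this repository, and `summarize_map` and the toy inputs are hypothetical.

```python
# HICO-DET convention: categories with fewer than 10 training instances are rare.
RARE_THRESHOLD = 10


def summarize_map(ap_by_hoi, train_count_by_hoi):
    """Average per-category AP over the full, rare, and non-rare subsets."""

    def mean_ap(ids):
        ids = list(ids)
        if not ids:
            return float("nan")
        return sum(ap_by_hoi[i] for i in ids) / len(ids)

    rare = [i for i in ap_by_hoi if train_count_by_hoi[i] < RARE_THRESHOLD]
    non_rare = [i for i in ap_by_hoi if train_count_by_hoi[i] >= RARE_THRESHOLD]
    return {
        "full": mean_ap(ap_by_hoi),
        "rare": mean_ap(rare),
        "non_rare": mean_ap(non_rare),
    }


# Toy values for illustration only; they are not results from the paper.
toy_ap = {0: 0.50, 1: 0.30, 2: 0.10}
toy_train_counts = {0: 250, 1: 40, 2: 3}
summary = summarize_map(toy_ap, toy_train_counts)
print({name: round(value, 2) for name, value in summary.items()})
```

In the toy example only category 2 falls below the threshold, so it alone determines the rare score while categories 0 and 1 make up the non-rare score.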
111
+
112
+ Evaluate the released checkpoints by running:
113
+
114
+ ```bash
115
+ # Hybrid-SOV-R50 (HICO-DET)
116
+ sh run/hybrid-sov-r50_eval.sh
117
+
118
+ # Hybrid-SOV-VLA-R50 (HICO-DET)
119
+ sh run/hybrid-sov-vla-r50_eval.sh
120
+ ```
121
+
122
+ ### V-COCO
123
+
124
+ | Model | AP (Scenario 1) | AP (Scenario 2) | ckpt |
125
+ |:---:|:---:|:---:|:---:|
126
+ | Hybrid-SOV-VLA-R50 | 67.9 | 70.1 | [checkpoint](https://huggingface.co/thxplz/Hybrid-SOV/blob/main/params/vcoco-hybrid-sov-vla-r50.pth) |
127
+
128
+ Evaluate the released checkpoint by running:
129
+
130
+ ```bash
131
+ # Hybrid-SOV-VLA-R50 (V-COCO)
132
+ sh run/vcoco-hybrid-sov-vla-r50_eval.sh
133
+ ```
134
+
135
+ ## Training
136
+
137
+ ### HICO-DET
138
+
139
+ Download the RT-DETR R50 pre-trained weights for HICO-DET from [thxplz/Hybrid-SOV](https://huggingface.co/thxplz/Hybrid-SOV/tree/main/params) into `params`, so that the following file exists:
140
+
141
+ ```bash
142
+ params/rtdetr_r50vd_6x_coco_from_paddle_converted_hico.pth
143
+ ```
144
+
145
+ Train Hybrid-SOV-R50:
146
+
147
+ ```bash
148
+ sh run/hybrid-sov-r50.sh
149
+ ```
150
+
151
+ Train Hybrid-SOV-VLA-R50:
152
+
153
+ ```bash
154
+ sh run/hybrid-sov-vla-r50.sh
155
+ ```
156
+
157
+ Train Hybrid-SOV-VLA with DINOv3/ConvNeXt-L:
158
+
159
+ ```bash
160
+ sh run/hybrid-sov-vla-dinov3-cnx-l.sh
161
+ ```
162
+
163
+ ### V-COCO
164
+
165
+ Download the RT-DETR R50 pre-trained weights for V-COCO from [thxplz/Hybrid-SOV](https://huggingface.co/thxplz/Hybrid-SOV/tree/main/params) into `params`, so that the following file exists:
166
+
167
+ ```bash
168
+ params/rtdetr_r50vd_6x_coco_from_paddle_converted_vcoco.pth
169
+ ```
170
+
171
+ Train Hybrid-SOV-VLA-R50 on V-COCO:
172
+
173
+ ```bash
174
+ sh run/vcoco-hybrid-sov-vla-r50_hoi.sh
175
+ ```
176
+
177
+ ## References
178
+
179
+ ```txt
180
+ @ARTICLE{chen2026hybridsov,
181
+ author={Chen, Junwen and Yanai, Keiji},
182
+ journal={IEEE Access},
183
+ title={Bridging Detection Architectures With Foundation Models: A Unified Framework for Human-Object Interaction Detection},
184
+ year={2026},
185
+ volume={14},
186
+ pages={23299-23310},
187
+ url={https://ieeexplore.ieee.org/document/11367687}
188
+ }
189
+ ```