maelic committed on
Commit
12aa8da
·
verified ·
1 Parent(s): 7a52bd2

Update model card

  1. README.md +146 -98
README.md CHANGED
@@ -19,7 +19,37 @@ model-index:
19
  dataset:
20
  name: PSG
21
  type: psg
22
- metrics: []
23
  - name: REACT++ yolo12s
24
  results:
25
  - task:
@@ -30,35 +60,35 @@ model-index:
30
  type: psg
31
  metrics:
32
  - type: mR@20
33
- value: 2.91
34
  name: mR@20
35
  - type: R@20
36
- value: 6.71
37
  name: R@20
38
- - type: zsR@20
39
- value: 1.82
40
- name: zsR@20
41
  - type: mR@50
42
- value: 3.93
43
  name: mR@50
44
  - type: R@50
45
- value: 9.28
46
  name: R@50
47
- - type: zsR@50
48
- value: 2.66
49
- name: zsR@50
50
  - type: mR@100
51
- value: 4.62
52
  name: mR@100
53
  - type: R@100
54
- value: 11.21
55
  name: R@100
56
- - type: zsR@100
57
- value: 3.22
58
- name: zsR@100
59
- - type: mean_recall
60
- value: 24.71
61
- name: mean_recall
62
  - name: REACT++ yolo12m
63
  results:
64
  - task:
@@ -69,35 +99,35 @@ model-index:
69
  type: psg
70
  metrics:
71
  - type: mR@20
72
- value: 22.73
73
  name: mR@20
74
  - type: R@20
75
- value: 31.11
76
  name: R@20
77
- - type: zsR@20
78
- value: 1.81
79
- name: zsR@20
80
  - type: mR@50
81
- value: 25.75
82
  name: mR@50
83
  - type: R@50
84
- value: 36.29
85
  name: R@50
86
- - type: zsR@50
87
- value: 2.8
88
- name: zsR@50
89
  - type: mR@100
90
- value: 27.55
91
  name: mR@100
92
  - type: R@100
93
- value: 39.44
94
  name: R@100
95
- - type: zsR@100
96
- value: 3.77
97
- name: zsR@100
98
- - type: mean_recall
99
- value: 26.32
100
- name: mean_recall
101
  - name: REACT++ yolo12l
102
  results:
103
  - task:
@@ -108,35 +138,35 @@ model-index:
108
  type: psg
109
  metrics:
110
  - type: mR@20
111
- value: 23.34
112
  name: mR@20
113
  - type: R@20
114
- value: 29.72
115
  name: R@20
116
- - type: zsR@20
117
- value: 1.74
118
- name: zsR@20
119
  - type: mR@50
120
- value: 25.82
121
  name: mR@50
122
  - type: R@50
123
- value: 35.12
124
  name: R@50
125
- - type: zsR@50
126
- value: 2.77
127
- name: zsR@50
128
  - type: mR@100
129
- value: 27.47
130
  name: mR@100
131
  - type: R@100
132
- value: 37.99
133
  name: R@100
134
- - type: zsR@100
135
- value: 3.53
136
- name: zsR@100
137
- - type: mean_recall
138
- value: 33.16
139
- name: mean_recall
140
  - name: REACT++ yolov8m
141
  results:
142
  - task:
@@ -147,35 +177,35 @@ model-index:
147
  type: psg
148
  metrics:
149
  - type: mR@20
150
- value: 2.82
151
  name: mR@20
152
  - type: R@20
153
- value: 10.02
154
  name: R@20
155
- - type: zsR@20
156
- value: 1.97
157
- name: zsR@20
158
  - type: mR@50
159
- value: 4.57
160
  name: mR@50
161
  - type: R@50
162
- value: 13.75
163
  name: R@50
164
- - type: zsR@50
165
- value: 2.8
166
- name: zsR@50
167
  - type: mR@100
168
- value: 5.98
169
  name: mR@100
170
  - type: R@100
171
- value: 16.24
172
  name: R@100
173
- - type: zsR@100
174
- value: 3.49
175
- name: zsR@100
176
- - type: mean_recall
177
- value: 21.42
178
- name: mean_recall
179
  ---
180
 
181
  # REACT++ Scene Graph Generation — PSG (yolo12n, yolo12s, yolo12m, yolo12l, yolov8m)
@@ -186,44 +216,62 @@ on the **PSG** benchmark, across 5 backbone sizes.
186
  REACT++ is a parameter-efficient, attention-augmented relation predictor built on top of
187
  a YOLO12 backbone. It uses:
188
 
 
189
  - **SwiGLU gated MLP** for all feed-forward blocks (½ the params of ReLU-MLP at equal capacity)
190
- - **Visual × Semantic cross-attention** — visual tokens attend to GloVe prototype embeddings
191
  - **Geometry RoPE** — box-position encoded as a rotary frequency bias on the Q matrix
192
- - **Prototype Momentum Buffer** — per-class EMA prototype bank (MoCo/DINO-style)
193
  - **P5 Scene Context** — AIFI-enhanced P5 tokens provide global context via cross-attention
194
 
195
  The models were trained with the
196
  [SGG-Benchmark](https://github.com/Maelic/SGG-Benchmark) framework and described in the
197
- [REACT paper (Neau et al., BMVC 2025)](https://arxiv.org/abs/2405.16116).
198
 
199
  ---
200
 
201
- ## Results — SGDet on PSG test split
202
 
203
- | Backbone | Params (backbone) | mR@20 | mR@50 | mR@100 | R@20 | R@50 | R@100 |
204
- |----------|:-----------------:|------:|------:|-------:|-----:|-----:|------:|
205
- | yolo12n | ~2.6M | - | - | - | - | - | - |
206
- | yolo12s | ~9.2M | 2.91 | 3.93 | 4.62 | 6.71 | 9.28 | 11.21 |
207
- | yolo12m | ~20.2M | 22.73 | 25.75 | 27.55 | 31.11 | 36.29 | 39.44 |
208
- | yolo12l | ~26.5M | 23.34 | 25.82 | 27.47 | 29.72 | 35.12 | 37.99 |
209
- | yolov8m | ~25.9M | 2.82 | 4.57 | 5.98 | 10.02 | 13.75 | 16.24 |
210
 
211
  ---
212
 
213
  ## Checkpoints
214
 
215
- | Variant | Sub-folder | Checkpoint file |
216
  |---------|------------|-----------------|
217
- | yolo12n | `yolo12n/` | `yolo12n/best_model_epoch_5.pth` |
218
- | yolo12s | `yolo12s/` | `yolo12s/best_model_epoch_6.pth` |
219
- | yolo12m | `yolo12m/` | `yolo12m/best_model_epoch_9.pth` |
220
- | yolo12l | `yolo12l/` | `yolo12l/best_model_epoch_9.pth` |
221
- | yolov8m | `yolov8m/` | `yolov8m/best_model_epoch_6.pth` |
222
 
223
  ---
224
 
225
  ## Usage
226
 
227
  ```python
228
  # 1. Clone the repository
229
  # git clone https://github.com/Maelic/SGG-Benchmark
@@ -231,7 +279,7 @@ The models were trained with the
231
  # 2. Install dependencies
232
  # pip install -e .
233
 
234
- # 3. Download a checkpoint
235
  from huggingface_hub import hf_hub_download
236
 
237
  ckpt_path = hf_hub_download(
@@ -248,7 +296,7 @@ cfg_path = hf_hub_download(
248
  # 4. Run evaluation
249
  import subprocess
250
  subprocess.run([
251
- "python", "tools/relation_train_net_hydra.py",
252
  "--config-path", str(cfg_path),
253
  "--task", "sgdet",
254
  "--eval-only",
@@ -261,11 +309,11 @@ subprocess.run([
261
  ## Citation
262
 
263
  ```bibtex
264
- @inproceedings{neau2025react,
265
- title = {REACT: Relation Extraction through Attention-guided Contrastive Training},
266
- author = {Neau, Maëlic and others},
267
- booktitle = {BMVC},
268
- year = {2025},
269
- url = {https://arxiv.org/abs/2405.16116},
270
  }
271
  ```
 
19
  dataset:
20
  name: PSG
21
  type: psg
22
+ metrics:
23
+ - type: mR@20
24
+ value: 16.88
25
+ name: mR@20
26
+ - type: R@20
27
+ value: 26.88
28
+ name: R@20
29
+ - type: F1@20
30
+ value: 20.74
31
+ name: F1@20
32
+ - type: mR@50
33
+ value: 18.65
34
+ name: mR@50
35
+ - type: R@50
36
+ value: 30.61
37
+ name: R@50
38
+ - type: F1@50
39
+ value: 23.17
40
+ name: F1@50
41
+ - type: mR@100
42
+ value: 19.5
43
+ name: mR@100
44
+ - type: R@100
45
+ value: 31.8
46
+ name: R@100
47
+ - type: F1@100
48
+ value: 24.17
49
+ name: F1@100
50
+ - type: e2e_latency_ms
51
+ value: 11.4
52
+ name: e2e_latency_ms
53
  - name: REACT++ yolo12s
54
  results:
55
  - task:
 
60
  type: psg
61
  metrics:
62
  - type: mR@20
63
+ value: 21.12
64
  name: mR@20
65
  - type: R@20
66
+ value: 29.28
67
  name: R@20
68
+ - type: F1@20
69
+ value: 24.54
70
+ name: F1@20
71
  - type: mR@50
72
+ value: 23.21
73
  name: mR@50
74
  - type: R@50
75
+ value: 33.48
76
  name: R@50
77
+ - type: F1@50
78
+ value: 27.41
79
+ name: F1@50
80
  - type: mR@100
81
+ value: 23.77
82
  name: mR@100
83
  - type: R@100
84
+ value: 34.74
85
  name: R@100
86
+ - type: F1@100
87
+ value: 28.23
88
+ name: F1@100
89
+ - type: e2e_latency_ms
90
+ value: 12.2
91
+ name: e2e_latency_ms
92
  - name: REACT++ yolo12m
93
  results:
94
  - task:
 
99
  type: psg
100
  metrics:
101
  - type: mR@20
102
+ value: 22.74
103
  name: mR@20
104
  - type: R@20
105
+ value: 32.69
106
  name: R@20
107
+ - type: F1@20
108
+ value: 26.82
109
+ name: F1@20
110
  - type: mR@50
111
+ value: 25.21
112
  name: mR@50
113
  - type: R@50
114
+ value: 37.2
115
  name: R@50
116
+ - type: F1@50
117
+ value: 30.05
118
+ name: F1@50
119
  - type: mR@100
120
+ value: 26.08
121
  name: mR@100
122
  - type: R@100
123
+ value: 38.58
124
  name: R@100
125
+ - type: F1@100
126
+ value: 31.12
127
+ name: F1@100
128
+ - type: e2e_latency_ms
129
+ value: 15.7
130
+ name: e2e_latency_ms
131
  - name: REACT++ yolo12l
132
  results:
133
  - task:
 
138
  type: psg
139
  metrics:
140
  - type: mR@20
141
+ value: 23.2
142
  name: mR@20
143
  - type: R@20
144
+ value: 30.99
145
  name: R@20
146
+ - type: F1@20
147
+ value: 26.53
148
+ name: F1@20
149
  - type: mR@50
150
+ value: 25.49
151
  name: mR@50
152
  - type: R@50
153
+ value: 35.3
154
  name: R@50
155
+ - type: F1@50
156
+ value: 29.6
157
+ name: F1@50
158
  - type: mR@100
159
+ value: 26.45
160
  name: mR@100
161
  - type: R@100
162
+ value: 36.68
163
  name: R@100
164
+ - type: F1@100
165
+ value: 30.74
166
+ name: F1@100
167
+ - type: e2e_latency_ms
168
+ value: 19.6
169
+ name: e2e_latency_ms
170
  - name: REACT++ yolov8m
171
  results:
172
  - task:
 
177
  type: psg
178
  metrics:
179
  - type: mR@20
180
+ value: 22.75
181
  name: mR@20
182
  - type: R@20
183
+ value: 30.69
184
  name: R@20
185
+ - type: F1@20
186
+ value: 26.13
187
+ name: F1@20
188
  - type: mR@50
189
+ value: 25.46
190
  name: mR@50
191
  - type: R@50
192
+ value: 35.68
193
  name: R@50
194
+ - type: F1@50
195
+ value: 29.72
196
+ name: F1@50
197
  - type: mR@100
198
+ value: 26.4
199
  name: mR@100
200
  - type: R@100
201
+ value: 37.43
202
  name: R@100
203
+ - type: F1@100
204
+ value: 30.96
205
+ name: F1@100
206
+ - type: e2e_latency_ms
207
+ value: 15.3
208
+ name: e2e_latency_ms
209
  ---
210
 
211
  # REACT++ Scene Graph Generation — PSG (yolo12n, yolo12s, yolo12m, yolo12l, yolov8m)
 
216
  REACT++ is a parameter-efficient, attention-augmented relation predictor built on top of
217
  a YOLO12 backbone. It uses:
218
 
219
+ - **DAMP** (Detection-Anchored Multi-Scale Pooling), a new simple pooling algorithm for one-stage object detectors such as YOLO
220
  - **SwiGLU gated MLP** for all feed-forward blocks (½ the params of ReLU-MLP at equal capacity)
221
+ - **Visual x Semantic cross-attention** — visual tokens attend to GloVe prototype embeddings
222
  - **Geometry RoPE** — box-position encoded as a rotary frequency bias on the Q matrix
223
+ - **Prototype Momentum Buffer** — per-class EMA prototype bank
224
  - **P5 Scene Context** — AIFI-enhanced P5 tokens provide global context via cross-attention
225
 
226
  The models were trained with the
227
  [SGG-Benchmark](https://github.com/Maelic/SGG-Benchmark) framework and described in the
228
+ [REACT++ paper (Neau et al., 2026)](https://arxiv.org/abs/2603.06386).
229
 
230
  ---
231
 
232
+ ## Results — SGDet on PSG test split (ONNX, CUDA)
233
 
234
+ > Metrics from end-to-end ONNX evaluation (`tools/eval_onnx_psg.py`). E2E Latency = image load + pre-process + ONNX forward.
235
+
236
+ | Backbone | Params | R@20 | R@50 | R@100 | mR@20 | mR@50 | mR@100 | F1@20 | F1@50 | F1@100 | E2E Lat. (ms) |
237
+ |----------|:------:|-----:|-----:|------:|------:|------:|-------:|------:|------:|-------:|--------------:|
238
+ | yolo12n | ~2.6M | 26.88 | 30.61 | 31.8 | 16.88 | 18.65 | 19.5 | 20.74 | 23.17 | 24.17 | 11.4 |
239
+ | yolo12s | ~9.2M | 29.28 | 33.48 | 34.74 | 21.12 | 23.21 | 23.77 | 24.54 | 27.41 | 28.23 | 12.2 |
240
+ | yolo12m | ~20.2M | 32.69 | 37.2 | 38.58 | 22.74 | 25.21 | 26.08 | 26.82 | 30.05 | 31.12 | 15.7 |
241
+ | yolo12l | ~26.5M | 30.99 | 35.3 | 36.68 | 23.2 | 25.49 | 26.45 | 26.53 | 29.6 | 30.74 | 19.6 |
242
+ | yolov8m | ~25.9M | 30.69 | 35.68 | 37.43 | 22.75 | 25.46 | 26.4 | 26.13 | 29.72 | 30.96 | 15.3 |
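
The F1@K columns in the table above are consistent with the harmonic mean of R@K and mR@K; for yolo12n at K=20, 2 × 26.88 × 16.88 / (26.88 + 16.88) ≈ 20.74. The sketch below assumes that definition, which is inferred from the reported numbers rather than stated in the model card. The function name `f1_at_k` is illustrative and not part of SGG-Benchmark.

```python
def f1_at_k(recall: float, mean_recall: float) -> float:
    """Harmonic mean of R@K and mR@K, both given in percent."""
    if recall + mean_recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * recall * mean_recall / (recall + mean_recall)


# (R@20, mR@20) pairs copied from the results table.
rows = {
    "yolo12n": (26.88, 16.88),
    "yolo12m": (32.69, 22.74),
}

for name, (recall, mean_recall) in rows.items():
    print(f"{name}: F1@20 = {f1_at_k(recall, mean_recall):.2f}")
```

Because the harmonic mean is pulled towards the smaller of its two inputs, a backbone cannot score a high F1@K by doing well on frequent predicates (R@K) alone; it also needs a competitive mR@K.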
243
 
244
  ---
245
 
246
  ## Checkpoints
247
 
248
+ | Variant | Sub-folder | Checkpoint files |
249
  |---------|------------|-----------------|
250
+ | yolo12n | `yolo12n/` | `yolo12n/model.onnx` (ONNX) · `yolo12n/best_model_epoch_5.pth` (PyTorch) |
251
+ | yolo12s | `yolo12s/` | `yolo12s/model.onnx` (ONNX) · `yolo12s/best_model_epoch_6.pth` (PyTorch) |
252
+ | yolo12m | `yolo12m/` | `yolo12m/model.onnx` (ONNX) · `yolo12m/best_model_epoch_9.pth` (PyTorch) |
253
+ | yolo12l | `yolo12l/` | `yolo12l/model.onnx` (ONNX) · `yolo12l/best_model_epoch_9.pth` (PyTorch) |
254
+ | yolov8m | `yolov8m/` | `yolov8m/model.onnx` (ONNX) · `yolov8m/best_model_epoch_6.pth` (PyTorch) |
255
 
256
  ---
257
 
258
  ## Usage
259
 
260
+ ### ONNX (recommended — no Python dependencies beyond onnxruntime)
261
+
262
+ ```python
263
+ from huggingface_hub import hf_hub_download
264
+
265
+ onnx_path = hf_hub_download(
266
+ repo_id="maelic/REACTPlusPlus_PSG",
267
+ filename="yolo12n/model.onnx",
268
+ repo_type="model",
269
+ )
270
+ # Run with tools/eval_onnx_psg.py or load directly via onnxruntime
271
+ ```
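
Loading the file directly with onnxruntime might look like the sketch below. The model card does not document the exported graph's input and output tensor names, shapes, or pre-processing, so the sketch only opens a session and prints what the model itself declares instead of assuming them. `pick_providers` and `describe_model` are illustrative helpers, not part of SGG-Benchmark.

```python
from typing import List


def pick_providers(available: List[str]) -> List[str]:
    """Prefer CUDA when onnxruntime reports it; fall back to CPU otherwise."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    chosen = [p for p in preferred if p in available]
    return chosen or ["CPUExecutionProvider"]


def describe_model(onnx_path: str) -> None:
    """Open the ONNX file and print its declared inputs and outputs."""
    # Imported lazily so pick_providers can be used without onnxruntime installed.
    import onnxruntime as ort

    session = ort.InferenceSession(
        onnx_path,
        providers=pick_providers(ort.get_available_providers()),
    )
    for node in session.get_inputs():
        print("input ", node.name, node.shape, node.type)
    for node in session.get_outputs():
        print("output", node.name, node.shape, node.type)


# Example call, using the path returned by hf_hub_download:
# describe_model(onnx_path)
```

Printing the declared inputs first is a cautious way to find out the expected image size and dtype before building a pre-processing step; `tools/eval_onnx_psg.py` remains the reference for how the reported metrics were produced.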
272
+
273
+ ### PyTorch
274
+
275
  ```python
276
  # 1. Clone the repository
277
  # git clone https://github.com/Maelic/SGG-Benchmark
 
279
  # 2. Install dependencies
280
  # pip install -e .
281
 
282
+ # 3. Download checkpoint + config
283
  from huggingface_hub import hf_hub_download
284
 
285
  ckpt_path = hf_hub_download(
 
296
  # 4. Run evaluation
297
  import subprocess
298
  subprocess.run([
299
+ "python", "tools/relation_eval_hydra.py",
300
  "--config-path", str(cfg_path),
301
  "--task", "sgdet",
302
  "--eval-only",
 
309
  ## Citation
310
 
311
  ```bibtex
312
+ @article{neau2026reactpp,
313
+ title = {REACT++: Efficient Cross-Attention for Real-Time Scene Graph Generation},
315
+ author = {Neau, Maëlic and Falomir, Zoe},
316
+ year = {2026},
317
+ url = {https://arxiv.org/abs/2603.06386},
318
  }
319
  ```