---
license: apache-2.0
language:
- en
tags:
- scene-graph-generation
- object-detection
- visual-relationship-detection
- pytorch
- yolo
pipeline_tag: object-detection
library_name: sgg-benchmark
model-index:
- name: REACT++ yolo12l
  results:
  - task:
      type: object-detection
      name: Scene Graph Detection
    dataset:
      name: PSG
      type: psg
    metrics:
    - type: mR@20
      value: 23.2
      name: mR@20
    - type: R@20
      value: 30.99
      name: R@20
    - type: F1@20
      value: 26.53
      name: F1@20
    - type: mR@50
      value: 25.49
      name: mR@50
    - type: R@50
      value: 35.3
      name: R@50
    - type: F1@50
      value: 29.6
      name: F1@50
    - type: mR@100
      value: 26.45
      name: mR@100
    - type: R@100
      value: 36.68
      name: R@100
    - type: F1@100
      value: 30.74
      name: F1@100
    - type: e2e_latency_ms
      value: 19.6
      name: e2e_latency_ms
- name: REACT++ yolo12m
  results:
  - task:
      type: object-detection
      name: Scene Graph Detection
    dataset:
      name: PSG
      type: psg
    metrics:
    - type: mR@20
      value: 22.74
      name: mR@20
    - type: R@20
      value: 32.69
      name: R@20
    - type: F1@20
      value: 26.82
      name: F1@20
    - type: mR@50
      value: 25.21
      name: mR@50
    - type: R@50
      value: 37.2
      name: R@50
    - type: F1@50
      value: 30.05
      name: F1@50
    - type: mR@100
      value: 26.08
      name: mR@100
    - type: R@100
      value: 38.58
      name: R@100
    - type: F1@100
      value: 31.12
      name: F1@100
    - type: e2e_latency_ms
      value: 15.7
      name: e2e_latency_ms
- name: REACT++ yolo12s
  results:
  - task:
      type: object-detection
      name: Scene Graph Detection
    dataset:
      name: PSG
      type: psg
    metrics:
    - type: mR@20
      value: 21.12
      name: mR@20
    - type: R@20
      value: 29.28
      name: R@20
    - type: F1@20
      value: 24.54
      name: F1@20
    - type: mR@50
      value: 23.21
      name: mR@50
    - type: R@50
      value: 33.48
      name: R@50
    - type: F1@50
      value: 27.41
      name: F1@50
    - type: mR@100
      value: 23.77
      name: mR@100
    - type: R@100
      value: 34.74
      name: R@100
    - type: F1@100
      value: 28.23
      name: F1@100
    - type: e2e_latency_ms
      value: 12.2
      name: e2e_latency_ms
- name: REACT++ yolo12n
  results:
  - task:
      type: object-detection
      name: Scene Graph Detection
    dataset:
      name: PSG
      type: psg
    metrics:
    - type: mR@20
      value: 16.88
      name: mR@20
    - type: R@20
      value: 26.88
      name: R@20
    - type: F1@20
      value: 20.74
      name: F1@20
    - type: mR@50
      value: 18.65
      name: mR@50
    - type: R@50
      value: 30.61
      name: R@50
    - type: F1@50
      value: 23.17
      name: F1@50
    - type: mR@100
      value: 19.5
      name: mR@100
    - type: R@100
      value: 31.8
      name: R@100
    - type: F1@100
      value: 24.17
      name: F1@100
    - type: e2e_latency_ms
      value: 11.4
      name: e2e_latency_ms
- name: REACT++ yolov8m
  results:
  - task:
      type: object-detection
      name: Scene Graph Detection
    dataset:
      name: PSG
      type: psg
    metrics:
    - type: mR@20
      value: 22.75
      name: mR@20
    - type: R@20
      value: 30.69
      name: R@20
    - type: F1@20
      value: 26.13
      name: F1@20
    - type: mR@50
      value: 25.46
      name: mR@50
    - type: R@50
      value: 35.68
      name: R@50
    - type: F1@50
      value: 29.72
      name: F1@50
    - type: mR@100
      value: 26.4
      name: mR@100
    - type: R@100
      value: 37.43
      name: R@100
    - type: F1@100
      value: 30.96
      name: F1@100
    - type: e2e_latency_ms
      value: 15.3
      name: e2e_latency_ms
---
# REACT++ Scene Graph Generation — PSG (yolo12l, yolo12m, yolo12s, yolo12n, yolov8m)
This repository contains **REACT++** model checkpoints for scene graph generation (SGG)
on the **PSG** benchmark, for five backbone variants (four YOLO12 sizes and YOLOv8m).
REACT++ is a parameter-efficient, attention-augmented relation predictor built on top of
a YOLO backbone (YOLO12 or YOLOv8 in this repository). It uses:
- **DAMP** (Detection-Anchored Multi-Scale Pooling): a simple new pooling algorithm for one-stage object detectors such as YOLO
- **SwiGLU gated MLP** for all feed-forward blocks (½ the params of ReLU-MLP at equal capacity)
- **Visual × Semantic cross-attention**: visual tokens attend to GloVe prototype embeddings
- **Geometry RoPE**: box position encoded as a rotary frequency bias on the Q matrix
- **Prototype Momentum Buffer**: per-class exponential moving average (EMA) prototype bank
- **P5 Scene Context**: AIFI-enhanced P5 tokens provide global context via cross-attention
The models were trained with the
[SGG-Benchmark](https://github.com/Maelic/SGG-Benchmark) framework and are described in the
[REACT++ paper (Neau et al., 2026)](https://arxiv.org/abs/2603.06386).
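
To make the Prototype Momentum Buffer item above more concrete, here is a minimal sketch of a per-class EMA prototype bank in plain Python. The class name `PrototypeBank`, the momentum value `0.9`, and the choice to initialise a prototype from the first feature seen are all assumptions for illustration. The actual update rule used by REACT++ is defined in the SGG-Benchmark code and may differ.

```python
class PrototypeBank:
    """Hypothetical per-class EMA prototype bank (illustrative, not the REACT++ code)."""

    def __init__(self, momentum: float = 0.9):  # momentum value is an assumption
        self.momentum = momentum
        self.prototypes = {}  # class_id -> prototype vector (list of floats)

    def update(self, class_id: int, feature: list) -> list:
        """Blend a new feature vector into the stored prototype for class_id."""
        old = self.prototypes.get(class_id)
        if old is None:
            # First observation of this class: start from the feature itself.
            new = list(feature)
        else:
            m = self.momentum
            # Standard EMA: keep a fraction m of the old value, add (1 - m) of the new one.
            new = [m * o + (1.0 - m) * f for o, f in zip(old, feature)]
        self.prototypes[class_id] = new
        return new


bank = PrototypeBank(momentum=0.9)
bank.update(3, [1.0, 0.0])
print(bank.update(3, [0.0, 1.0]))  # prototype moves slightly towards the new feature
```

A higher momentum makes each prototype change more slowly, which smooths out noisy per-batch features at the cost of slower adaptation.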
---
## Results — SGDet on PSG test split (ONNX, CUDA)
> Metrics from end-to-end ONNX evaluation (`tools/eval_onnx_psg.py`). E2E Latency = image load + pre-process + ONNX forward.
| Backbone | Params | R@20 | R@50 | R@100 | mR@20 | mR@50 | mR@100 | F1@20 | F1@50 | F1@100 | E2E Lat. (ms) |
|----------|:------:|-----:|-----:|------:|------:|------:|-------:|------:|------:|-------:|--------------:|
| yolo12l | ~26.5M | 30.99 | 35.3 | 36.68 | 23.2 | 25.49 | 26.45 | 26.53 | 29.6 | 30.74 | 19.6 |
| yolo12m | ~20.2M | 32.69 | 37.2 | 38.58 | 22.74 | 25.21 | 26.08 | 26.82 | 30.05 | 31.12 | 15.7 |
| yolo12s | ~9.2M | 29.28 | 33.48 | 34.74 | 21.12 | 23.21 | 23.77 | 24.54 | 27.41 | 28.23 | 12.2 |
| yolo12n | ~2.6M | 26.88 | 30.61 | 31.8 | 16.88 | 18.65 | 19.5 | 20.74 | 23.17 | 24.17 | 11.4 |
| yolov8m | ~25.9M | 30.69 | 35.68 | 37.43 | 22.75 | 25.46 | 26.4 | 26.13 | 29.72 | 30.96 | 15.3 |
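
The F1@K columns above are consistent with the harmonic mean of R@K and mR@K. The short check below reads them that way; the helper name `f1_at_k` is introduced here for illustration only. Differences in the last digit are expected, because the R and mR values in the table are already rounded.

```python
def f1_at_k(recall: float, mean_recall: float) -> float:
    """Harmonic mean of R@K and mR@K (both given in percent)."""
    if recall + mean_recall == 0:
        return 0.0
    return 2.0 * recall * mean_recall / (recall + mean_recall)


# (R@K, mR@K, reported F1@K) for yolo12l at K = 20, 50, 100, copied from the table
rows = [(30.99, 23.2, 26.53), (35.3, 25.49, 29.6), (36.68, 26.45, 30.74)]
for recall, mean_recall, reported in rows:
    computed = f1_at_k(recall, mean_recall)
    print(f"computed {computed:.2f} vs reported {reported}")
```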
---
## Checkpoints
| Variant | Sub-folder | Checkpoint files |
|---------|------------|-----------------|
| yolo12l | `yolo12l/` | `yolo12l/model.onnx` (ONNX) · `yolo12l/best_model_epoch_9.pth` (PyTorch) |
| yolo12m | `yolo12m/` | `yolo12m/model.onnx` (ONNX) · `yolo12m/best_model_epoch_9.pth` (PyTorch) |
| yolo12s | `yolo12s/` | `yolo12s/model.onnx` (ONNX) · `yolo12s/best_model_epoch_6.pth` (PyTorch) |
| yolo12n | `yolo12n/` | `yolo12n/model.onnx` (ONNX) · `yolo12n/best_model_epoch_5.pth` (PyTorch) |
| yolov8m | `yolov8m/` | `yolov8m/model.onnx` (ONNX) · `yolov8m/best_model_epoch_6.pth` (PyTorch) |
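
Since the PyTorch checkpoint name depends on the best epoch of each variant, a small lookup helper can avoid typos when downloading. This is a sketch: the names `BEST_EPOCH`, `checkpoint_filename`, and `download_checkpoint` are hypothetical, and the paths simply mirror the table above.

```python
REPO_ID = "maelic/REACTPlusPlus_PSG"

# Best-epoch number per variant, copied from the Checkpoints table
BEST_EPOCH = {"yolo12l": 9, "yolo12m": 9, "yolo12s": 6, "yolo12n": 5, "yolov8m": 6}


def checkpoint_filename(variant: str, fmt: str = "onnx") -> str:
    """Return the in-repo path of a checkpoint; fmt is "onnx" or "pth"."""
    if variant not in BEST_EPOCH:
        raise ValueError(f"unknown variant: {variant!r}")
    if fmt == "onnx":
        return f"{variant}/model.onnx"
    if fmt == "pth":
        return f"{variant}/best_model_epoch_{BEST_EPOCH[variant]}.pth"
    raise ValueError(f"unknown format: {fmt!r}")


def download_checkpoint(variant: str, fmt: str = "onnx") -> str:
    """Download one checkpoint from the Hub and return its local path."""
    # Imported lazily so checkpoint_filename works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=REPO_ID,
        filename=checkpoint_filename(variant, fmt),
        repo_type="model",
    )


print(checkpoint_filename("yolo12s", "pth"))  # yolo12s/best_model_epoch_6.pth
```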
---
## Usage
### ONNX (recommended — no Python dependencies beyond onnxruntime)
```python
from huggingface_hub import hf_hub_download

# Download the ONNX export of the yolo12l variant (path as listed under Checkpoints)
onnx_path = hf_hub_download(
    repo_id="maelic/REACTPlusPlus_PSG",
    filename="yolo12l/model.onnx",
    repo_type="model",
)
# Run with tools/eval_onnx_psg.py or load directly via onnxruntime
```
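
For loading the model directly via onnxruntime, the sketch below opens a session and lists the model's inputs and outputs. This card does not document the input names, shapes, or pre-processing, so the code inspects them and makes no assumptions about them. The helper names `load_session` and `describe_io` are hypothetical. The provider list prefers CUDA and falls back to CPU when CUDA is not available.

```python
def load_session(onnx_path: str):
    """Create an ONNX Runtime session, preferring CUDA and falling back to CPU."""
    import onnxruntime as ort  # imported lazily; describe_io itself needs no extras

    wanted = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    available = ort.get_available_providers()
    providers = [p for p in wanted if p in available]
    return ort.InferenceSession(onnx_path, providers=providers)


def describe_io(session) -> dict:
    """Return (name, shape, dtype) tuples for every model input and output."""

    def fmt(nodes):
        return [(node.name, node.shape, node.type) for node in nodes]

    return {"inputs": fmt(session.get_inputs()), "outputs": fmt(session.get_outputs())}


# Usage, once the model file has been downloaded:
# session = load_session(onnx_path)
# print(describe_io(session))
```

Checking the reported input shape and dtype first is a cheap way to confirm what pre-processing the exported graph expects before calling `session.run`.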
### PyTorch
```python
# 1. Clone the repository
#    git clone https://github.com/Maelic/SGG-Benchmark
# 2. Install dependencies (from the repository root)
#    pip install -e .

# 3. Download checkpoint + config
import subprocess

from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="maelic/REACTPlusPlus_PSG",
    filename="yolo12l/best_model_epoch_9.pth",  # name as listed under Checkpoints
    repo_type="model",
)
cfg_path = hf_hub_download(
    repo_id="maelic/REACTPlusPlus_PSG",
    filename="yolo12l/config.yml",
    repo_type="model",
)

# 4. Run evaluation (from the SGG-Benchmark repository root)
subprocess.run(
    [
        "python", "tools/relation_eval_hydra.py",
        "--config-path", str(cfg_path),
        "--task", "sgdet",
        "--eval-only",
        "--checkpoint", str(ckpt_path),
    ],
    check=True,  # raise if the evaluation script exits with an error
)
```
---
## Citation
```bibtex
@article{neau2026reactpp,
  title  = {REACT++: Efficient Cross-Attention for Real-Time Scene Graph Generation},
  author = {Neau, Maëlic and Falomir, Zoe},
  year   = {2026},
  url    = {https://arxiv.org/abs/2603.06386},
}
```