---
license: apache-2.0
language:
- en
tags:
- scene-graph-generation
- object-detection
- visual-relationship-detection
- pytorch
- yolo
pipeline_tag: object-detection
library_name: sgg-benchmark
model-index:
- name: REACT++ yolo12n
  results:
  - task:
      type: object-detection
      name: Scene Graph Detection
    dataset:
      name: PSG
      type: psg
    metrics: []
- name: REACT++ yolo12s
  results:
  - task:
      type: object-detection
      name: Scene Graph Detection
    dataset:
      name: PSG
      type: psg
    metrics:
    - type: mR@20
      value: 2.91
      name: mR@20
    - type: R@20
      value: 6.71
      name: R@20
    - type: zsR@20
      value: 1.82
      name: zsR@20
    - type: mR@50
      value: 3.93
      name: mR@50
    - type: R@50
      value: 9.28
      name: R@50
    - type: zsR@50
      value: 2.66
      name: zsR@50
    - type: mR@100
      value: 4.62
      name: mR@100
    - type: R@100
      value: 11.21
      name: R@100
    - type: zsR@100
      value: 3.22
      name: zsR@100
    - type: mean_recall
      value: 24.71
      name: mean_recall
- name: REACT++ yolo12m
  results:
  - task:
      type: object-detection
      name: Scene Graph Detection
    dataset:
      name: PSG
      type: psg
    metrics:
    - type: mR@20
      value: 22.73
      name: mR@20
    - type: R@20
      value: 31.11
      name: R@20
    - type: zsR@20
      value: 1.81
      name: zsR@20
    - type: mR@50
      value: 25.75
      name: mR@50
    - type: R@50
      value: 36.29
      name: R@50
    - type: zsR@50
      value: 2.8
      name: zsR@50
    - type: mR@100
      value: 27.55
      name: mR@100
    - type: R@100
      value: 39.44
      name: R@100
    - type: zsR@100
      value: 3.77
      name: zsR@100
    - type: mean_recall
      value: 26.32
      name: mean_recall
- name: REACT++ yolo12l
  results:
  - task:
      type: object-detection
      name: Scene Graph Detection
    dataset:
      name: PSG
      type: psg
    metrics:
    - type: mR@20
      value: 23.34
      name: mR@20
    - type: R@20
      value: 29.72
      name: R@20
    - type: zsR@20
      value: 1.74
      name: zsR@20
    - type: mR@50
      value: 25.82
      name: mR@50
    - type: R@50
      value: 35.12
      name: R@50
    - type: zsR@50
      value: 2.77
      name: zsR@50
    - type: mR@100
      value: 27.47
      name: mR@100
    - type: R@100
      value: 37.99
      name: R@100
    - type: zsR@100
      value: 3.53
      name: zsR@100
    - type: mean_recall
      value: 33.16
      name: mean_recall
- name: REACT++ yolov8m
  results:
  - task:
      type: object-detection
      name: Scene Graph Detection
    dataset:
      name: PSG
      type: psg
    metrics:
    - type: mR@20
      value: 2.82
      name: mR@20
    - type: R@20
      value: 10.02
      name: R@20
    - type: zsR@20
      value: 1.97
      name: zsR@20
    - type: mR@50
      value: 4.57
      name: mR@50
    - type: R@50
      value: 13.75
      name: R@50
    - type: zsR@50
      value: 2.8
      name: zsR@50
    - type: mR@100
      value: 5.98
      name: mR@100
    - type: R@100
      value: 16.24
      name: R@100
    - type: zsR@100
      value: 3.49
      name: zsR@100
    - type: mean_recall
      value: 21.42
      name: mean_recall
---
REACT++ Scene Graph Generation — PSG (yolo12n, yolo12s, yolo12m, yolo12l, yolov8m)
This repository contains REACT++ model checkpoints for scene graph generation (SGG) on the PSG benchmark, for five backbones: four YOLO12 sizes (n, s, m, l) and YOLOv8m.
REACT++ is a parameter-efficient, attention-augmented relation predictor built on top of a YOLO backbone (YOLO12 or YOLOv8m in this repository). It uses:
- SwiGLU gated MLP for all feed-forward blocks (½ the params of ReLU-MLP at equal capacity)
- Visual × Semantic cross-attention — visual tokens attend to GloVe prototype embeddings
- Geometry RoPE — box-position encoded as a rotary frequency bias on the Q matrix
- Prototype Momentum Buffer — per-class EMA prototype bank (MoCo/DINO-style)
- P5 Scene Context — AIFI-enhanced P5 tokens provide global context via cross-attention
The models were trained with the SGG-Benchmark framework and are described in the REACT paper (Neau et al., BMVC 2025).
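The Prototype Momentum Buffer listed above can be pictured as a per-class exponential moving average (EMA) over relation features. The sketch below is a minimal plain-Python illustration of that idea, not the SGG-Benchmark implementation: the class name `PrototypeBank`, the momentum value of 0.9, and the choice to initialise a prototype from the first batch mean are all assumptions made for the example.

```python
from collections import defaultdict


class PrototypeBank:
    """Hypothetical per-class EMA prototype bank (illustrative only)."""

    def __init__(self, momentum=0.9):
        self.momentum = momentum  # assumed value, not taken from the paper
        self.prototypes = {}      # class id -> list of floats

    def update(self, features, labels):
        """Blend the batch mean of each class into its stored prototype."""
        grouped = defaultdict(list)
        for feat, label in zip(features, labels):
            grouped[label].append(feat)

        for label, feats in grouped.items():
            # Column-wise mean of all feature vectors seen for this class.
            mean = [sum(col) / len(feats) for col in zip(*feats)]
            if label not in self.prototypes:
                # First sighting of a class: start from the batch mean.
                self.prototypes[label] = mean
            else:
                m = self.momentum
                old = self.prototypes[label]
                self.prototypes[label] = [
                    m * o + (1 - m) * n for o, n in zip(old, mean)
                ]


bank = PrototypeBank()
bank.update([[1.0, 0.0], [3.0, 0.0]], labels=[0, 0])  # prototype 0 starts at the mean [2.0, 0.0]
bank.update([[0.0, 10.0]], labels=[0])                # moves 10% of the way toward [0.0, 10.0]
```

With a momentum close to 1, each prototype changes slowly, so a single noisy batch has little effect on the stored class representation.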
Results — SGDet on PSG test split
| Backbone | Params (backbone) | mR@20 | mR@50 | mR@100 | R@20 | R@50 | R@100 |
|---|---|---|---|---|---|---|---|
| yolo12n | ~2.6M | - | - | - | - | - | - |
| yolo12s | ~9.2M | 2.91 | 3.93 | 4.62 | 6.71 | 9.28 | 11.21 |
| yolo12m | ~20.2M | 22.73 | 25.75 | 27.55 | 31.11 | 36.29 | 39.44 |
| yolo12l | ~26.5M | 23.34 | 25.82 | 27.47 | 29.72 | 35.12 | 37.99 |
| yolov8m | ~25.9M | 2.82 | 4.57 | 5.98 | 10.02 | 13.75 | 16.24 |
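For readers unfamiliar with the column names: R@K measures how many ground-truth triplets appear among the top-K predicted triplets, while mR@K computes that recall separately for each predicate class and then averages over classes. The sketch below illustrates the difference on a single toy image. It is deliberately simplified and is not the evaluation code used for the table: entities are identified by label strings instead of matched boxes, the function names are hypothetical, and the data is made up.

```python
from collections import defaultdict


def recall_at_k(ranked_preds, gt_triplets, k):
    """Fraction of ground-truth triplets found among the top-k predictions."""
    top_k = set(ranked_preds[:k])
    hits = sum(1 for t in gt_triplets if t in top_k)
    return hits / len(gt_triplets)


def mean_recall_at_k(ranked_preds, gt_triplets, k):
    """Average of per-predicate recalls, so rare predicates count as much as frequent ones."""
    top_k = set(ranked_preds[:k])
    hits, totals = defaultdict(int), defaultdict(int)
    for subj, pred, obj in gt_triplets:
        totals[pred] += 1
        hits[pred] += (subj, pred, obj) in top_k
    return sum(hits[p] / totals[p] for p in totals) / len(totals)


# Toy image: three "on" relations and one rarer "riding" relation (made-up data).
gt = [
    ("cup", "on", "table"),
    ("book", "on", "table"),
    ("lamp", "on", "desk"),
    ("person", "riding", "horse"),
]
# Predictions sorted by confidence; the last one uses the wrong predicate.
preds = [
    ("cup", "on", "table"),
    ("book", "on", "table"),
    ("lamp", "on", "desk"),
    ("person", "on", "horse"),
]

r = recall_at_k(preds, gt, k=4)        # 3 of 4 ground-truth triplets recovered
mr = mean_recall_at_k(preds, gt, k=4)  # "on": 3/3, "riding": 0/1
```

In this toy case the model misses only the rare predicate, which costs a quarter of R@K but half of mR@K. That asymmetry is one reason mR@K is usually reported alongside R@K on long-tailed datasets.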
Checkpoints
| Variant | Sub-folder | Checkpoint file |
|---|---|---|
| yolo12n | yolo12n/ | yolo12n/best_model_epoch_5.pth |
| yolo12s | yolo12s/ | yolo12s/best_model_epoch_6.pth |
| yolo12m | yolo12m/ | yolo12m/best_model_epoch_9.pth |
| yolo12l | yolo12l/ | yolo12l/best_model_epoch_9.pth |
| yolov8m | yolov8m/ | yolov8m/best_model_epoch_6.pth |
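Because the checkpoint file name differs per variant (the epoch suffix changes), it can help to keep the mapping in one place. The following is a small convenience sketch, assuming the file names in the table above are current; `CHECKPOINTS` and `checkpoint_filename` are hypothetical names introduced here and are not part of SGG-Benchmark.

```python
# File names copied from the Checkpoints table; update them if the repository changes.
CHECKPOINTS = {
    "yolo12n": "yolo12n/best_model_epoch_5.pth",
    "yolo12s": "yolo12s/best_model_epoch_6.pth",
    "yolo12m": "yolo12m/best_model_epoch_9.pth",
    "yolo12l": "yolo12l/best_model_epoch_9.pth",
    "yolov8m": "yolov8m/best_model_epoch_6.pth",
}


def checkpoint_filename(variant):
    """Return the in-repo checkpoint path for a backbone variant (hypothetical helper)."""
    try:
        return CHECKPOINTS[variant]
    except KeyError:
        raise ValueError(
            f"unknown variant {variant!r}; expected one of {sorted(CHECKPOINTS)}"
        ) from None


filename = checkpoint_filename("yolo12m")
```

The returned string can be passed as the `filename` argument of `hf_hub_download`, which avoids hard-coding an epoch suffix that does not exist for the chosen variant.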
Usage
# 1. Clone the repository
#    git clone https://github.com/Maelic/SGG-Benchmark
# 2. Install dependencies (from the repository root)
#    pip install -e .

# 3. Download a checkpoint and its config
import subprocess

from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="maelic/REACTPlusPlus_PSG",
    filename="yolo12n/best_model_epoch_5.pth",  # file name as listed under Checkpoints
    repo_type="model",
)
cfg_path = hf_hub_download(
    repo_id="maelic/REACTPlusPlus_PSG",
    filename="yolo12n/hydra_config.yaml",
    repo_type="model",
)

# 4. Run evaluation (the script path is relative, so run this from the
#    root of the cloned SGG-Benchmark repository)
subprocess.run(
    [
        "python", "tools/relation_train_net_hydra.py",
        "--config-path", str(cfg_path),
        "--task", "sgdet",
        "--eval-only",
        "--checkpoint", str(ckpt_path),
    ],
    check=True,  # raise if the evaluation script exits with an error
)
Citation
@inproceedings{neau2025react,
  title     = {REACT: Real-time Efficiency and Accuracy Compromise for Tradeoffs in Scene Graph Generation},
  author    = {Neau, Maëlic and others},
  booktitle = {BMVC},
  year      = {2025},
  url       = {https://arxiv.org/abs/2405.16116},
}