metadata

license: apache-2.0
language:
  - en
tags:
  - scene-graph-generation
  - object-detection
  - visual-relationship-detection
  - pytorch
  - yolo
pipeline_tag: object-detection
library_name: sgg-benchmark
model-index:
  - name: REACT++ yolo12m
    results:
      - task:
          type: object-detection
          name: Scene Graph Detection
        dataset:
          name: VG150
          type: vg150
        metrics:
          - type: mR@20
            value: 10.52
            name: mR@20
          - type: R@20
            value: 18.32
            name: R@20
          - type: F1@20
            value: 13.36
            name: F1@20
          - type: mR@50
            value: 13.22
            name: mR@50
          - type: R@50
            value: 22.54
            name: R@50
          - type: F1@50
            value: 16.67
            name: F1@50
          - type: mR@100
            value: 13.96
            name: mR@100
          - type: R@100
            value: 23.77
            name: R@100
          - type: F1@100
            value: 17.59
            name: F1@100
          - type: e2e_latency_ms
            value: 19.4
            name: e2e_latency_ms
  - name: REACT++ yolo26m
    results:
      - task:
          type: object-detection
          name: Scene Graph Detection
        dataset:
          name: VG150
          type: vg150
        metrics:
          - type: mR@20
            value: 10.32
            name: mR@20
          - type: R@20
            value: 20
            name: R@20
          - type: mR@50
            value: 13.94
            name: mR@50
          - type: R@50
            value: 26.9
            name: R@50
          - type: mR@100
            value: 16.48
            name: mR@100
          - type: R@100
            value: 32.08
            name: R@100
          - type: mean_recall
            value: 21.87
            name: mean_recall
  - name: REACT++ yolov8m
    results:
      - task:
          type: object-detection
          name: Scene Graph Detection
        dataset:
          name: VG150
          type: vg150
        metrics:
          - type: mR@20
            value: 12.05
            name: mR@20
          - type: R@20
            value: 22.78
            name: R@20
          - type: F1@20
            value: 15.76
            name: F1@20
          - type: mR@50
            value: 15.42
            name: mR@50
          - type: R@50
            value: 28.73
            name: R@50
          - type: F1@50
            value: 20.07
            name: F1@50
          - type: mR@100
            value: 16.51
            name: mR@100
          - type: R@100
            value: 30.84
            name: R@100
          - type: F1@100
            value: 21.51
            name: F1@100
          - type: e2e_latency_ms
            value: 17.8
            name: e2e_latency_ms

REACT++ Scene Graph Generation — VG150 (yolo12m, yolo26m, yolov8m)

This repository contains REACT++ model checkpoints for scene graph generation (SGG) on the VG150 benchmark, one for each of three YOLO backbones (yolo12m, yolo26m, yolov8m).

REACT++ is a parameter-efficient, attention-augmented relation predictor built on top of a YOLO backbone. It uses:

- DAMP (Detection-Anchored Multi-Scale Pooling): a simple new pooling algorithm for one-stage object detectors such as YOLO
- SwiGLU gated MLP for all feed-forward blocks (½ the params of ReLU-MLP at equal capacity)
- Visual x Semantic cross-attention: visual tokens attend to GloVe prototype embeddings
- Geometry RoPE: box position encoded as a rotary frequency bias on the Q matrix
- Prototype Momentum Buffer: per-class EMA prototype bank
- P5 Scene Context: AIFI-enhanced P5 tokens provide global context via cross-attention
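
Two of the components listed above, the SwiGLU gated MLP and the per-class EMA prototype bank, are generic enough to sketch in a few lines. The snippet below is a minimal pure-Python illustration of the general techniques, not the REACT++ implementation: the names `swiglu_ffn` and `PrototypeBank`, the absence of bias terms, and the default momentum value are all assumptions made for this example.

```python
import math


def silu(z):
    """SiLU / Swish activation, z * sigmoid(z), written to avoid overflow."""
    if z >= 0:
        return z / (1.0 + math.exp(-z))
    e = math.exp(z)
    return z * e / (1.0 + e)


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def swiglu_ffn(x, W_gate, W_up, W_down):
    """Gated feed-forward block: W_down @ (silu(W_gate @ x) * (W_up @ x))."""
    gate = [silu(g) for g in matvec(W_gate, x)]
    up = matvec(W_up, x)
    hidden = [g * u for g, u in zip(gate, up)]  # element-wise gating
    return matvec(W_down, hidden)


class PrototypeBank:
    """Per-class prototype vectors updated by an exponential moving average."""

    def __init__(self, momentum=0.9):
        self.momentum = momentum  # assumed value, not taken from the paper
        self.prototypes = {}      # class id -> prototype vector

    def update(self, class_id, feature):
        old = self.prototypes.get(class_id)
        if old is None:
            # The first sample of a class initialises its prototype.
            self.prototypes[class_id] = list(feature)
        else:
            m = self.momentum
            self.prototypes[class_id] = [
                m * o + (1.0 - m) * f for o, f in zip(old, feature)
            ]
        return self.prototypes[class_id]
```

A higher momentum makes each prototype change more slowly as new features arrive; the value actually used by REACT++ is not stated in this model card.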

The models were trained with the SGG-Benchmark framework and are described in the REACT++ paper (Neau et al., 2026).

Results — SGDet on VG150 test split (ONNX, CUDA)

Metrics come from an end-to-end ONNX evaluation (tools/eval_onnx_psg.py). E2E latency is the sum of image loading, pre-processing, and the ONNX forward pass. A dash marks a value that is not reported.

Backbone	Params	R@20	R@50	R@100	mR@20	mR@50	mR@100	F1@20	F1@50	F1@100	E2E Lat. (ms)
yolo12m	~20.2M	18.32	22.54	23.77	10.52	13.22	13.96	13.36	16.67	17.59	19.4
yolo26m	~20.2M	20.0	26.9	32.08	10.32	13.94	16.48	-	-	-	-
yolov8m	~25.9M	22.78	28.73	30.84	12.05	15.42	16.51	15.76	20.07	21.51	17.8
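
The F1@K columns in the table above are consistent, to within rounding, with the harmonic mean of R@K and mR@K. That reading is inferred from the reported numbers rather than stated in this model card, so the sketch below should be taken as an assumption about how F1@K is defined; `f1_at_k` is a hypothetical helper name.

```python
def f1_at_k(recall, mean_recall):
    """Harmonic mean of R@K and mR@K (both given in percent)."""
    if recall + mean_recall == 0:
        return 0.0
    return 2.0 * recall * mean_recall / (recall + mean_recall)


# (R@K, mR@K) pairs copied from the results table.
rows = {
    "yolo12m": {20: (18.32, 10.52), 50: (22.54, 13.22), 100: (23.77, 13.96)},
    "yolov8m": {20: (22.78, 12.05), 50: (28.73, 15.42), 100: (30.84, 16.51)},
}

for backbone, by_k in rows.items():
    for k, (r, mr) in by_k.items():
        print(f"{backbone} F1@{k} = {f1_at_k(r, mr):.2f}")
```

Because the inputs are already rounded to two decimals, a recomputed value can differ from the table by 0.01. The same function could be applied to the yolo26m row, but the result would be a derived figure, not a reported one.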

Checkpoints

Variant	Sub-folder	Checkpoint files
yolo12m	`yolo12m/`	`yolo12m/model.onnx` (ONNX) · `yolo12m/best_model_epoch_19.pth` (PyTorch)
yolo26m	`yolo26m/`	`yolo26m/react_pp_yolo26m.onnx` (ONNX) · `yolo26m/best_model_epoch_18.pth` (PyTorch)
yolov8m	`yolov8m/`	`yolov8m/model.onnx` (ONNX) · `yolov8m/best_model_epoch_6.pth` (PyTorch)
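
Since the file names differ between variants, a small lookup keyed on the table above avoids hard-coding paths. This is only a convenience sketch: `CHECKPOINTS`, `checkpoint_filename`, and `download_checkpoint` are hypothetical helpers, not part of SGG-Benchmark, and the paths are copied verbatim from the table.

```python
# File names copied from the Checkpoints table.
CHECKPOINTS = {
    "yolo12m": {
        "onnx": "yolo12m/model.onnx",
        "pth": "yolo12m/best_model_epoch_19.pth",
    },
    "yolo26m": {
        "onnx": "yolo26m/react_pp_yolo26m.onnx",
        "pth": "yolo26m/best_model_epoch_18.pth",
    },
    "yolov8m": {
        "onnx": "yolov8m/model.onnx",
        "pth": "yolov8m/best_model_epoch_6.pth",
    },
}


def checkpoint_filename(variant, fmt="onnx"):
    """Return the in-repo path for a variant ('yolo12m', ...) and format."""
    try:
        return CHECKPOINTS[variant][fmt]
    except KeyError:
        raise ValueError(f"unknown variant/format: {variant!r}/{fmt!r}") from None


def download_checkpoint(variant, fmt="onnx", repo_id="maelic/REACT-pp-VG150"):
    """Download one checkpoint file and return its local path."""
    # Imported lazily so the lookup works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=repo_id,
        filename=checkpoint_filename(variant, fmt),
        repo_type="model",
    )
```

For example, `download_checkpoint("yolov8m", "pth")` would fetch the PyTorch checkpoint of the yolov8m variant.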

Usage

ONNX (recommended: inference needs no Python dependencies beyond onnxruntime; the download snippet below also uses huggingface_hub)

from huggingface_hub import hf_hub_download

onnx_path = hf_hub_download(
    repo_id="maelic/REACT-pp-VG150",
    filename="yolo12m/model.onnx",  # file name as listed in the Checkpoints table
    repo_type="model",
)
# Run with tools/eval_onnx_psg.py or load directly via onnxruntime
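
For loading the model directly with onnxruntime, a reasonable first step is to inspect the input and output tensors the exported graph declares, since this model card does not document them. The sketch below does only that. `resolve_shape` and `describe_model` are hypothetical helper names, and the 640-pixel default for dynamic dimensions is an assumption (a common YOLO input size), not a value confirmed here.

```python
def resolve_shape(shape, default=640):
    """Replace symbolic or unknown dimensions with concrete sizes.

    The first dimension is treated as the batch and set to 1; any other
    dynamic dimension falls back to `default` (an assumed image size).
    """
    resolved = []
    for i, dim in enumerate(shape):
        if isinstance(dim, int) and dim > 0:
            resolved.append(dim)
        else:
            resolved.append(1 if i == 0 else default)
    return resolved


def describe_model(onnx_path):
    """Open an ONNX file and print its declared inputs and outputs."""
    # Imported lazily so resolve_shape can be used without onnxruntime.
    import onnxruntime as ort

    available = ort.get_available_providers()
    # Prefer CUDA when the installed onnxruntime build offers it.
    providers = [
        p for p in ("CUDAExecutionProvider", "CPUExecutionProvider")
        if p in available
    ]
    session = ort.InferenceSession(onnx_path, providers=providers)
    for inp in session.get_inputs():
        print("input ", inp.name, inp.type, inp.shape,
              "->", resolve_shape(inp.shape))
    for out in session.get_outputs():
        print("output", out.name, out.type, out.shape)
    return session


# Usage, with onnx_path from the download step:
# session = describe_model(onnx_path)
```

The printed names, dtypes, and shapes indicate what `session.run` expects. The pre-processing the model needs (resizing, normalisation, channel order) is defined by the evaluation script and is not reproduced here.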

PyTorch

# 1. Clone the repository
#    git clone https://github.com/Maelic/SGG-Benchmark

# 2. Install dependencies
#    pip install -e .

# 3. Download checkpoint + config
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="maelic/REACT-pp-VG150",
    filename="yolo12m/best_model_epoch_19.pth",  # file name as listed in the Checkpoints table
    repo_type="model",
)
cfg_path = hf_hub_download(
    repo_id="maelic/REACT-pp-VG150",
    filename="yolo12m/config.yml",
    repo_type="model",
)

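# 4. Run evaluation (from the root of the cloned SGG-Benchmark repository)
import subprocess
import sys

subprocess.run(
    [
        sys.executable, "tools/relation_eval_hydra.py",  # same interpreter as this script
        "--config-path", str(cfg_path),
        "--task", "sgdet",
        "--eval-only",
        "--checkpoint", str(ckpt_path),
    ],
    check=True,  # raise CalledProcessError if the evaluation script fails
)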

Citation

@article{neau2026reactpp,
  title   = {REACT++: Efficient Cross-Attention for Real-Time Scene Graph Generation},
  author  = {Neau, Maëlic and Falomir, Zoe},
  year    = {2026},
  url     = {https://arxiv.org/abs/2603.06386},
}