---
license: apache-2.0
language:
- en
tags:
- scene-graph-generation
- object-detection
- visual-relationship-detection
- pytorch
- yolo
pipeline_tag: object-detection
library_name: sgg-benchmark
model-index:
- name: REACT++ yolov8m
  results:
  - task:
      type: object-detection
      name: Scene Graph Detection
    dataset:
      name: IndoorVG
      type: indoorvg
    metrics: []
---
# REACT++ Scene Graph Generation — IndoorVG (yolov8m)
This repository contains **REACT++** model checkpoints for scene graph generation (SGG)
on the **IndoorVG** benchmark, for a single backbone size (yolov8m).

REACT++ is a parameter-efficient, attention-augmented relation predictor built on top of
a YOLO backbone. It uses:
- **DAMP** (Detection-Anchored Multi-Scale Pooling), a simple new pooling algorithm for one-stage object detectors such as YOLO
- **SwiGLU gated MLP** for all feed-forward blocks (½ the params of ReLU-MLP at equal capacity)
- **Visual x Semantic cross-attention** — visual tokens attend to GloVe prototype embeddings
- **Geometry RoPE** — box-position encoded as a rotary frequency bias on the Q matrix
- **Prototype Momentum Buffer** — per-class EMA prototype bank
- **P5 Scene Context** — AIFI-enhanced P5 tokens provide global context via cross-attention
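
As a rough illustration of the per-class EMA idea behind the Prototype Momentum Buffer, the sketch below keeps one running-average vector per class. It is a minimal, framework-free sketch and not the REACT++ implementation: the class name `PrototypeBank`, the `momentum` value, and the choice to seed a prototype from the first feature seen are all assumptions made for the example.

```python
class PrototypeBank:
    """Per-class exponential moving average (EMA) of feature vectors.

    Hypothetical helper for illustration; not part of SGG-Benchmark.
    """

    def __init__(self, momentum=0.9):
        if not 0.0 <= momentum < 1.0:
            raise ValueError("momentum must be in [0, 1)")
        self.momentum = momentum
        self.prototypes = {}  # class id -> list of floats

    def update(self, class_id, feature):
        """Blend `feature` into the stored prototype for `class_id`."""
        feature = [float(x) for x in feature]
        old = self.prototypes.get(class_id)
        if old is None:
            # Assumption: the first observation seeds the prototype.
            self.prototypes[class_id] = feature
        else:
            if len(old) != len(feature):
                raise ValueError("feature dimension changed")
            m = self.momentum
            # new = m * old + (1 - m) * feature, element-wise
            self.prototypes[class_id] = [
                m * o + (1.0 - m) * f for o, f in zip(old, feature)
            ]
        return self.prototypes[class_id]


bank = PrototypeBank(momentum=0.9)
bank.update(3, [1.0, 0.0])          # first feature for class 3 seeds the prototype
proto = bank.update(3, [0.0, 1.0])  # prototype drifts slowly toward the new feature
```

With a momentum close to 1 the prototype changes slowly, which smooths out noisy per-batch features; a lower value tracks recent features more closely.
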

The models were trained with the
[SGG-Benchmark](https://github.com/Maelic/SGG-Benchmark) framework and are described in the
[REACT++ paper (Neau et al., 2026)](https://arxiv.org/abs/2603.06386).

---
## Results — SGDet on IndoorVG test split (ONNX, CUDA)
> Metrics from end-to-end ONNX evaluation (`tools/eval_onnx_psg.py`). E2E latency = image load + pre-process + ONNX forward.

| Backbone | Params | R@20 | R@50 | R@100 | mR@20 | mR@50 | mR@100 | F1@20 | F1@50 | F1@100 | E2E latency (ms) |
|----------|:------:|-----:|-----:|------:|------:|------:|-------:|------:|------:|-------:|--------------:|
| yolov8m | ~25.9M | - | - | - | - | - | - | - | - | - | - |
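
For readers unfamiliar with the column headers, the sketch below shows one common way such metrics are computed in SGG work: R@K is the fraction of ground-truth triplets recovered among the top-K predictions, mR@K averages that recall over predicate classes, and F1@K is taken here as the harmonic mean of R@K and mR@K. This is a toy version under stated assumptions (a single image, exact label matching with no box IoU check), not the evaluation code in `tools/eval_onnx_psg.py`; all function names are hypothetical.

```python
from collections import defaultdict


def recall_at_k(gt_triplets, ranked_preds, k):
    """Fraction of ground-truth triplets found in the top-k predictions."""
    if not gt_triplets:
        return 0.0
    top_k = set(ranked_preds[:k])
    hits = sum(1 for t in gt_triplets if t in top_k)
    return hits / len(gt_triplets)


def mean_recall_at_k(gt_triplets, ranked_preds, k):
    """Recall computed per predicate class, then averaged over classes."""
    top_k = set(ranked_preds[:k])
    hits, totals = defaultdict(int), defaultdict(int)
    for subj, pred, obj in gt_triplets:
        totals[pred] += 1
        if (subj, pred, obj) in top_k:
            hits[pred] += 1
    if not totals:
        return 0.0
    return sum(hits[p] / totals[p] for p in totals) / len(totals)


def f1_at_k(r, mr):
    """Harmonic mean of R@K and mR@K (assumed definition of F1@K)."""
    return 0.0 if r + mr == 0 else 2 * r * mr / (r + mr)


# Toy data: (subject, predicate, object) triplets; predictions sorted by score.
gt = [("cup", "on", "table"), ("book", "on", "shelf"),
      ("plate", "on", "table"), ("lamp", "near", "bed")]
preds = [("cup", "on", "table"), ("lamp", "on", "bed"),
         ("book", "on", "shelf"), ("lamp", "near", "bed")]

r = recall_at_k(gt, preds, k=3)        # 2 of 4 triplets recovered
mr = mean_recall_at_k(gt, preds, k=3)  # "on": 2/3, "near": 0/1
f1 = f1_at_k(r, mr)
```

In this toy case the frequent predicate "on" dominates R@K, while mR@K is pulled down by the missed "near" triplet, which is why the two are usually reported together.
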
---
## Checkpoints
| Variant | Sub-folder | Checkpoint files |
|---------|------------|-----------------|
| yolov8m | `yolov8m/` | `yolov8m/model.onnx` (ONNX) · `yolov8m/best_model_epoch_8.pth` (PyTorch) |
---
## Usage
### ONNX (recommended; inference needs no Python dependencies beyond onnxruntime)
```python
from huggingface_hub import hf_hub_download

onnx_path = hf_hub_download(
    repo_id="maelic/REACTPlusPlus_IndoorVG",
    filename="yolov8m/model.onnx",  # file name as listed in the Checkpoints table
    repo_type="model",
)
# Run with tools/eval_onnx_psg.py or load directly via onnxruntime
```
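
To load the downloaded file directly with onnxruntime, a minimal sketch could look like the following. This card does not document the exported graph's input and output names or shapes, so the sketch only opens a session and lists them through the standard `get_inputs()` / `get_outputs()` accessors. The helper names `open_session` and `describe_io` are hypothetical, and the preference for `CUDAExecutionProvider` is an assumption that only takes effect when a GPU build of onnxruntime is installed.

```python
def open_session(onnx_path):
    """Open an ONNX Runtime session, preferring CUDA when it is available."""
    import onnxruntime as ort  # imported lazily so describe_io works without it

    wanted = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    available = ort.get_available_providers()
    providers = [p for p in wanted if p in available]
    return ort.InferenceSession(onnx_path, providers=providers)


def describe_io(session):
    """Return (kind, name, shape, type) for every graph input and output."""
    rows = []
    for kind, nodes in (("input", session.get_inputs()),
                        ("output", session.get_outputs())):
        for node in nodes:
            rows.append((kind, node.name, node.shape, node.type))
    return rows


# Usage, with `onnx_path` taken from the hf_hub_download call above:
#   session = open_session(onnx_path)
#   for row in describe_io(session):
#       print(row)
```

Inspecting the graph first makes it possible to build correctly shaped input arrays before calling `session.run`.
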
### PyTorch
```python
# 1. Clone the repository
#    git clone https://github.com/Maelic/SGG-Benchmark
# 2. Install dependencies
#    pip install -e .

# 3. Download checkpoint + config
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="maelic/REACTPlusPlus_IndoorVG",
    filename="yolov8m/best_model_epoch_8.pth",  # file name as listed in the Checkpoints table
    repo_type="model",
)
cfg_path = hf_hub_download(
    repo_id="maelic/REACTPlusPlus_IndoorVG",
    filename="yolov8m/config.yml",
    repo_type="model",
)

# 4. Run evaluation (from the root of the cloned SGG-Benchmark repository)
import subprocess

subprocess.run([
    "python", "tools/relation_eval_hydra.py",
    "--config-path", str(cfg_path),
    "--task", "sgdet",
    "--eval-only",
    "--checkpoint", str(ckpt_path),
])
```
---
## Citation
```bibtex
@article{neau2026reactpp,
  title  = {REACT++: Efficient Cross-Attention for Real-Time Scene Graph Generation},
  author = {Neau, Maëlic and Falomir, Zoe},
  year   = {2026},
  url    = {https://arxiv.org/abs/2603.06386},
}
```