---
license: mit
---
# LEGION-8B-replicate
## Overview
The authors of [LEGION: Learning to Ground and Explain for Synthetic Image Detection](https://arxiv.org/abs/2503.15264) open-sourced their code repository but did not release pre-trained weights. We replicated the model from the open-source code and the paper, and are releasing the resulting weights here.
> [!NOTE]
> Because our replication may differ from the original training process, the released weights may score lower than the officially reported results on some benchmarks.
### Training Details
We conducted training on 4x A100 40G GPUs.
For the first training stage, the official configuration uses 8 GPUs with a global batch size of 16 (batch size per device = 2). To maintain the same global batch size, we used 4 GPUs with a per-device batch size of 4.
For the second training stage, the official configuration uses 8 GPUs with a global batch size of 512 (batch size per device = 64). We used 4 GPUs with a per-device batch size of 8 and 16 gradient accumulation steps. This gives an effective per-device batch size of 8 × 16 = 128 and keeps the global batch size at 4 × 128 = 512.
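
The batch-size bookkeeping above can be checked with a few lines of Python. This is only an arithmetic sketch: the helper name `global_batch_size` is hypothetical and is not part of the LEGION code base.

```python
def global_batch_size(num_gpus, per_device_batch, grad_accum_steps=1):
    """Number of samples contributing to one optimizer step across all GPUs."""
    return num_gpus * per_device_batch * grad_accum_steps


# Stage 1: official (8 GPUs x 2) vs. ours (4 GPUs x 4).
stage1_official = global_batch_size(num_gpus=8, per_device_batch=2)
stage1_ours = global_batch_size(num_gpus=4, per_device_batch=4)

# Stage 2: official (8 GPUs x 64) vs. ours (4 GPUs x 8, 16 accumulation steps).
stage2_official = global_batch_size(num_gpus=8, per_device_batch=64)
stage2_ours = global_batch_size(num_gpus=4, per_device_batch=8, grad_accum_steps=16)

print(stage1_official, stage1_ours)  # both stage-1 global batch sizes
print(stage2_official, stage2_ours)  # both stage-2 global batch sizes
```

Gradient accumulation matches the number of samples per optimizer step, but it does not necessarily reproduce every aspect of training with a larger per-device batch (for example, layers that compute per-batch statistics).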
### Inference Usage
A simple inference script is provided at [infer.py](./infer.py).
Usage instructions are as follows:
```bash
cp infer.py /path/to/LEGION
python infer.py --model_path /path/to/LEGION-8B-replicate --image_root /path/to/images --save_root /path/to/results
```
### Examples
Model output for an example image:
> Upon examining the image. I have found: A cat sits on a rooftop at sunset, with its right front paw missing and the left front paw appearing deformed. To elaborate, I have found the following artifacts. Cat's right front paw :The cat's right front paw is missing. Cat's left front paw :The cat's left front paw is deformed.
## Performance
> [!NOTE]
> Because the official evaluation and metric code has not been open-sourced, our test results may be inaccurate.
> The IoU evaluation metric for masks may be affected by mask processing during inference, resulting in lower scores.
### Localization
| Method | SynthScars mIoU | SynthScars F1 | LOKI mIoU | LOKI F1 | RichHF-18K mIoU | RichHF-18K F1 |
| --- | --- | --- | --- | --- | --- | --- |
| HiFi-Net | 45.65 | 0.57 | 39.60 | 2.41 | 44.96 | 0.39 |
| TruFor | 48.60 | 15.29 | 46.55 | 16.70 | 48.41 | 18.03 |
| PAL4VST | 56.10 | 29.21 | 47.34 | 11.58 | 49.88 | 14.78 |
| Ferret | 27.09 | 15.24 | 24.50 | 18.88 | 26.52 | 16.22 |
| Griffon | 27.68 | 16.67 | 21.96 | 20.41 | 28.13 | 18.19 |
| LISA-v1-7B | 34.51 | 18.77 | 31.10 | 9.29 | 35.90 | 21.94 |
| InternVL2-8B | 41.25 | 6.39 | 42.03 | 10.06 | 39.90 | 9.58 |
| Qwen2-VL-72B | 30.20 | 17.50 | 26.62 | 20.99 | 27.58 | 19.02 |
| LEGION (Official) | 58.13 | 34.54 | 48.66 | 16.71 | 50.07 | 17.41 |
| LEGION (Replicate) | 23.92 | 33.47 | - | - | - | - |
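
For readers unfamiliar with mask IoU, here is a minimal sketch of how it can be computed for binary masks. The official evaluation code is not released, so this is not the implementation behind the numbers above. The function names `binary_iou` and `mean_iou`, the empty-union convention, and the averaging over foreground and background are all assumptions made for illustration.

```python
def binary_iou(pred, target, positive=1):
    """IoU of the `positive` class between two equally sized 2D masks.

    Masks are nested sequences of 0/1 labels (rows of pixels).
    """
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p_hit = p == positive
            t_hit = t == positive
            inter += p_hit and t_hit
            union += p_hit or t_hit
    # Assumed convention: if neither mask contains the class, count it as a match.
    return 1.0 if union == 0 else inter / union


def mean_iou(pred, target):
    """Average of foreground (label 1) and background (label 0) IoU."""
    return (binary_iou(pred, target, 1) + binary_iou(pred, target, 0)) / 2


# Toy 2x4 masks: the prediction marks one extra pixel as an artifact.
pred = [[1, 1, 0, 0],
        [0, 0, 0, 0]]
target = [[1, 0, 0, 0],
          [0, 0, 0, 0]]

print(binary_iou(pred, target))  # foreground-only IoU
print(mean_iou(pred, target))    # mean over foreground and background
```

The two functions can give very different values on the same prediction, because background pixels usually dominate an image. Choices like this, along with thresholding or resizing of masks during inference, are the kind of processing detail that can shift a reported IoU.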
### Explanation
| Method | Params | SynthScars ROUGE-L ↑ | SynthScars CSS ↑ | LOKI ROUGE-L ↑ | LOKI CSS ↑ |
| --- | --- | --- | --- | --- | --- |
| Qwen2-VL | 72B | 25.84 | 58.15 | 11.80 | 37.64 |
| LLaVA-v1.6 | 7B | 29.61 | 61.75 | 16.07 | 41.07 |
| InternVL2 | 8B | 25.93 | 56.89 | 10.10 | 39.62 |
| Deepseek-VL2 | 27B | 25.50 | 47.77 | 6.70 | 28.76 |
| GPT-4o | - | 22.43 | 53.55 | 9.61 | 38.98 |
| LEGION (Official) | 8B | 39.50 | 72.60 | 18.55 | 45.96 |
| LEGION (Replicate) | 8B | 50.57 | - | - | - |
### Detection
| Method | GANs | Deepfakes | Perceptual Loss (CRN) | Perceptual Loss (IMLE) | Low Level Vision (SITD) | Low Level Vision (SAN) | Diffusion |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Co-occurrence | 75.17 | 59.14 | 73.06 | 87.21 | 68.98 | 60.42 | 85.53 |
| Freq-spec | 75.28 | 45.18 | 53.61 | 50.98 | 47.46 | 57.12 | 69.00 |
| CNNSpot | 85.29 | 53.47 | 86.31 | 86.26 | 66.67 | 48.69 | 58.63 |
| Patchfor | 69.97 | 75.54 | 72.33 | 55.30 | 75.14 | 75.28 | 72.54 |
| UniFD | 95.25 | 66.60 | 59.50 | 72.00 | 63.00 | 57.50 | 82.02 |
| LDGard | 89.17 | 58.00 | 50.74 | 50.78 | 62.50 | 50.00 | 89.79 |
| FreqNet | 94.23 | 97.40 | 71.92 | 67.35 | 88.92 | 59.04 | 83.34 |
| NPR | 94.16 | 76.89 | 50.00 | 50.00 | 66.94 | 98.63 | 94.54 |
| LEGION (Official) | 97.01 | 63.37 | 90.78 | 98.93 | 79.44 | 57.76 | 83.10 |
| LEGION (Replicate) | 91.48 | 79.16 | 84.73 | 96.71 | 78.06 | 53.70 | - |
## Acknowledgements
Thanks to [Gennadiyev](https://github.com/Gennadiyev) for providing computational resources and moral support, and for helping me complete the reproduction.
Thanks to [draw-your-dream/LEGION](https://github.com/draw-your-dream/LEGION/tree/main) for fixing bugs in the first-stage training.