metadata
base_model: facebook/dinov3-vitl16-pretrain-lvd1689m
datasets:
  - DocTamperV1
  - vankey/RealText-V2
  - Jason37437/RealText-V2-Syn25k
language: en
library_name: pytorch
license: mit
metrics:
  - f1
pipeline_tag: image-segmentation
tags:
  - document-forgery-detection
  - tampering-detection
  - image-manipulation
  - vision-transformer
  - lora

SEED Detector

This repository contains the official detector model for SEED, presented in the paper "SEED: Simple ViT and Evolving Harness for Explainable Text Forgery Detection".

SEED Detector is a lightweight vision transformer model for document forgery detection. It localizes tampered regions in document images and classifies images as real or forged.

Architecture

| Component | Detail |
|---|---|
| Backbone | DINOv3 ViT-L/16 |
| Finetuning | LoRA (rank=1, attention + MLP) |
| Queries | 1 mask query |
| Decoder blocks | 4 |
| Input size | 512 × 512 |
| Parameters | ~304M (only ~1M trainable with LoRA) |
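
As a rough sanity check on the table above, the arithmetic below estimates the number of patch tokens and the number of LoRA weights. It assumes the standard ViT-L layout (24 blocks, hidden size 1024, MLP width 4096, four attention projections and two MLP linear layers per block) and that a rank-1 adapter is attached to each of those six matrices. None of these details are stated in this card, so treat the result as an estimate, not the authors' accounting.

```python
# Assumed ViT-L/16 dimensions (standard ViT-L; not confirmed by this card).
PATCH = 16        # patch size implied by "ViT-L/16"
INPUT = 512       # input size from the architecture table
HIDDEN = 1024     # assumed hidden size
MLP_WIDTH = 4096  # assumed MLP width
BLOCKS = 24       # assumed number of transformer blocks
RANK = 1          # LoRA rank from the architecture table


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """A rank-r LoRA adapter adds A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# A 512 x 512 image cut into 16 x 16 patches.
patch_tokens = (INPUT // PATCH) ** 2

# Assumption: adapters on q, k, v, and output projections, plus both MLP layers.
attn = 4 * lora_params(HIDDEN, HIDDEN, RANK)
mlp = lora_params(HIDDEN, MLP_WIDTH, RANK) + lora_params(MLP_WIDTH, HIDDEN, RANK)
lora_total = BLOCKS * (attn + mlp)

print(patch_tokens)  # 1024
print(lora_total)    # 442368
```

Under these assumptions the adapters contribute roughly 0.44M weights, which is consistent with the ~1M trainable figure if the remaining trainable parameters sit outside the backbone.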

Usage

Repository: GitHub | Checkpoint: Jason37437/SEED / Google Drive

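```python
# The import path suggests this module ships with the SEED GitHub repository,
# so the repository likely needs to be on the Python path.
from model.hf_wrapper import EoMTForTamperingDetection

# Load the pretrained checkpoint from the Hugging Face Hub and switch to inference mode.
model = EoMTForTamperingDetection.from_pretrained("Jason37437/SEED")
model.eval()

# The model outputs:
#   - mask_logits: per-query segmentation masks
#   - class_logits: per-query foreground/background scores
#   - image_logits: image-level real vs forged classification
```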
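
The three outputs listed above still need to be turned into a mask and a decision. The sketch below shows one plausible post-processing step on plain Python lists. The helper names, the 0.5 threshold, and the `[real, forged]` ordering of `image_logits` are assumptions for illustration; the card does not specify output shapes or class order, so check the repository before relying on them.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binarize_mask(mask_logits, threshold=0.5):
    """Turn a 2-D grid of logits into a 0/1 tamper mask (hypothetical helper)."""
    return [[1 if sigmoid(v) >= threshold else 0 for v in row] for row in mask_logits]


def forged_probability(image_logits):
    """Softmax over assumed [real, forged] logits; returns P(forged)."""
    m = max(image_logits)
    exps = [math.exp(v - m) for v in image_logits]
    return exps[1] / sum(exps)


# Toy values standing in for one image's outputs (not real model outputs).
toy_mask_logits = [[-3.0, 0.2], [4.1, -0.5]]
toy_image_logits = [0.3, 1.9]

mask = binarize_mask(toy_mask_logits)
p_forged = forged_probability(toy_image_logits)

print(mask)  # [[0, 1], [1, 0]]
print("forged" if p_forged >= 0.5 else "real")  # forged
```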

Performance

Localization (pixel-level F1)

| Dataset | F1 |
|---|---|
| T-SROIE | 0.782 |
| OSTF | 0.718 |
| TPIC-13 | 0.798 |
| RTM | 0.178 |
| Avg | 0.619 |
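
For readers unfamiliar with the metric, pixel-level F1 is the harmonic mean of precision and recall computed over pixels, treating tampered pixels as the positive class. The function below is a minimal sketch of that definition on binary masks. The paper's exact protocol (thresholding, per-image versus dataset-level aggregation) is not described in this card, and the handling of empty masks here is a convention chosen for the example.

```python
def pixel_f1(pred, target):
    """F1 over pixels for two same-sized binary masks (1 = tampered)."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1      # tampered pixel correctly flagged
            elif p and not t:
                fp += 1      # clean pixel wrongly flagged
            elif t and not p:
                fn += 1      # tampered pixel missed
    denom = 2 * tp + fp + fn
    # Convention chosen here: two empty masks count as a perfect match.
    return 2 * tp / denom if denom else 1.0


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

# tp = 2, fp = 1, fn = 1, so F1 = 4 / 6
print(round(pixel_f1(pred, target), 3))  # 0.667
```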

Detection (image-level F1)

| Dataset | F1 |
|---|---|
| T-SROIE | 0.738 |
| OSTF | 0.832 |
| TPIC-13 | 0.930 |
| RTM | 0.207 |
| Avg | 0.677 |
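
The Avg rows in both tables appear to be the unweighted mean over the four datasets. The short check below reproduces them from the per-dataset numbers reported above.

```python
# Per-dataset F1 scores copied from the two tables in this section.
localization = {"T-SROIE": 0.782, "OSTF": 0.718, "TPIC-13": 0.798, "RTM": 0.178}
detection = {"T-SROIE": 0.738, "OSTF": 0.832, "TPIC-13": 0.930, "RTM": 0.207}


def mean(values):
    """Unweighted arithmetic mean."""
    values = list(values)
    return sum(values) / len(values)


loc_avg = mean(localization.values())
det_avg = mean(detection.values())

print(f"{loc_avg:.3f}")  # 0.619
print(f"{det_avg:.3f}")  # 0.677
```

Because the mean is unweighted, the low RTM scores pull both averages down noticeably relative to the other three datasets.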

Citation

@article{wong2026seed,
  title={SEED: Simple ViT and Evolving Harness for Explainable Text Forgery Detection},
  author={Wong, Kahim and others},
  journal={arXiv preprint arXiv:2606.21138},
  year={2026}
}

License

MIT License.