SEED / README.md

metadata

base_model: facebook/dinov3-vitl16-pretrain-lvd1689m
datasets:
  - DocTamperV1
  - vankey/RealText-V2
  - Jason37437/RealText-V2-Syn25k
language: en
library_name: pytorch
license: mit
metrics:
  - f1
pipeline_tag: image-segmentation
tags:
  - document-forgery-detection
  - tampering-detection
  - image-manipulation
  - vision-transformer
  - lora

SEED Detector

This repository contains the official detector model for SEED, presented in the paper SEED: Simple ViT and Evolving Harness for Explainable Text Forgery Detection.

SEED Detector is a vision transformer model for document forgery detection. It is lightweight to fine-tune: only about 1M of its roughly 304M parameters are trainable with LoRA. It localizes tampered regions in document images and classifies each image as real or forged.

Architecture

| Component | Detail |
|---|---|
| Backbone | DINOv3 ViT-L/16 |
| Finetuning | LoRA (rank=1, attention + MLP) |
| Queries | 1 mask query |
| Decoder blocks | 4 |
| Input size | 512 × 512 |
| Parameters | ~304M (only ~1M trainable with LoRA) |
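
To give a feel for why rank-1 LoRA keeps the trainable budget small, the sketch below counts adapter parameters. It is an illustration, not the repository's code: the hidden size (1024), MLP width (4096), block count (24), and the exact set of adapted linear layers are assumptions about a standard ViT-L and are not stated in this card.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Assumed ViT-L dimensions (not taken from this card).
HIDDEN, MLP_HIDDEN, BLOCKS, RANK = 1024, 4096, 24, 1

# Assumed adapted layers per block: q, k, v, and output projections, plus both MLP linears.
attention = 4 * lora_params(HIDDEN, HIDDEN, RANK)
mlp = lora_params(HIDDEN, MLP_HIDDEN, RANK) + lora_params(MLP_HIDDEN, HIDDEN, RANK)
backbone_lora = BLOCKS * (attention + mlp)

print(backbone_lora)  # 442368
```

Under these assumptions the adapters add roughly 0.44M parameters, the same order of magnitude as the ~1M trainable figure above. The sketch does not model any other trainable components, such as the mask query or the prediction heads.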

Usage

Repository: GitHub | Checkpoint: Jason37437/SEED / Google Drive

# Assumes the SEED GitHub repository is on the Python path; it provides model.hf_wrapper.
from model.hf_wrapper import EoMTForTamperingDetection

# Load the released checkpoint and switch to inference mode.
model = EoMTForTamperingDetection.from_pretrained("Jason37437/SEED")
model.eval()

# The model outputs:
#   - mask_logits: per-query segmentation masks
#   - class_logits: per-query foreground/background scores
#   - image_logits: image-level real vs forged classification
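
The card does not document output shapes or class ordering, so the following is only a hedged sketch of how such logits are commonly post-processed. The helper names, the 0.5 threshold, and `forged_index=1` are hypothetical, and the toy values stand in for real model outputs.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    # Subtract the max for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def binarize_mask(mask_logits, threshold=0.5):
    """Turn a 2D grid of mask logits into a 0/1 tamper mask (threshold is an assumption)."""
    return [[1 if sigmoid(v) > threshold else 0 for v in row] for row in mask_logits]


def forged_probability(image_logits, forged_index=1):
    """Probability of the 'forged' class; which index means 'forged' is an assumption."""
    return softmax(image_logits)[forged_index]


# Toy values standing in for one query's mask_logits and the image_logits.
mask_logits = [[-3.0, 2.0], [0.5, -1.0]]
image_logits = [0.0, 2.0]

mask = binarize_mask(mask_logits)
p_forged = forged_probability(image_logits)
print(mask)                # [[0, 1], [1, 0]]
print(round(p_forged, 3))  # 0.881
```

Check the repository for the actual tensor layout and label order before relying on either helper.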

Performance

Localization (pixel-level F1)

| Dataset | F1 |
|---|---|
| T-SROIE | 0.782 |
| OSTF | 0.718 |
| TPIC-13 | 0.798 |
| RTM | 0.178 |
| Avg | 0.619 |
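
For reference, pixel-level F1 on a single image can be computed from a predicted and a ground-truth binary mask as in the sketch below. The card does not say how scores are aggregated across images (per-image averaging versus pooling all pixels), so this shows only the per-mask formula. The function name and the empty-mask convention are assumptions.

```python
def pixel_f1(pred, target):
    """F1 between two binary masks given as equally sized 2D lists of 0/1."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1  # tampered pixel correctly flagged
            elif p and not t:
                fp += 1  # clean pixel wrongly flagged
            elif t and not p:
                fn += 1  # tampered pixel missed
    denom = 2 * tp + fp + fn
    # Convention chosen here: two empty masks count as a perfect match.
    return 2 * tp / denom if denom else 1.0


pred = [[1, 1, 0], [0, 0, 0]]
target = [[1, 0, 0], [1, 0, 0]]
print(pixel_f1(pred, target))  # 0.5
```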

Detection (image-level F1)

| Dataset | F1 |
|---|---|
| T-SROIE | 0.738 |
| OSTF | 0.832 |
| TPIC-13 | 0.930 |
| RTM | 0.207 |
| Avg | 0.677 |
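
The Avg rows in both tables are consistent with an unweighted mean over the four datasets. A quick check using the numbers reported above:

```python
localization = {"T-SROIE": 0.782, "OSTF": 0.718, "TPIC-13": 0.798, "RTM": 0.178}
detection = {"T-SROIE": 0.738, "OSTF": 0.832, "TPIC-13": 0.930, "RTM": 0.207}


def mean_f1(scores):
    """Unweighted mean of per-dataset F1 scores."""
    return sum(scores.values()) / len(scores)


print(f"{mean_f1(localization):.3f}")  # 0.619
print(f"{mean_f1(detection):.3f}")     # 0.677
```

Because the mean is unweighted, the low RTM scores pull both averages down considerably regardless of dataset size.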

Citation

@article{wong2026seed,
  title={SEED: Simple ViT and Evolving Harness for Explainable Text Forgery Detection},
  author={Wong, Kahim and others},
  journal={arXiv preprint arXiv:2606.21138},
  year={2026}
}

License

MIT License.