
	---
	base_model: facebook/dinov3-vitl16-pretrain-lvd1689m
	datasets:
	- DocTamperV1
	- vankey/RealText-V2
	- Jason37437/RealText-V2-Syn25k
	language: en
	library_name: pytorch
	license: mit
	metrics:
	- f1
	pipeline_tag: image-segmentation
	tags:
	- document-forgery-detection
	- tampering-detection
	- image-manipulation
	- vision-transformer
	- lora
	---

	# SEED Detector

	This repository contains the official detector model for SEED, presented in the paper [SEED: Simple ViT and Evolving Harness for Explainable Text Forgery Detection](https://huggingface.co/papers/2606.21138).

	SEED Detector is a lightweight vision transformer model for document forgery detection. It localizes tampered regions in document images and classifies images as real or forged.

	## Architecture

	| Component | Detail |
	|-----------|--------|
	| Backbone | [DINOv3 ViT-L/16](https://huggingface.co/facebook/dinov3-vitl16-pretrain-lvd1689m) |
	| Finetuning | LoRA (rank=1, attention + MLP) |
	| Queries | 1 mask query |
	| Decoder blocks | 4 |
	| Input size | 512 × 512 |
	| Parameters | ~304M (only ~1M trainable with LoRA) |
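As a rough illustration of why rank-1 LoRA keeps the trainable footprint small, the sketch below counts the parameters the adapters would add. The dimensions (hidden size 1024, MLP size 4096, 24 blocks) and the choice of adapted layers (separate query, key, value, and output projections plus two MLP linears) are assumptions about a standard ViT-L layout, not details confirmed by this model card. The result is an estimate of the adapter weights only, so it need not match the ~1M figure above, which may include other trainable components.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Assumed ViT-L dimensions (hypothetical for this checkpoint).
HIDDEN = 1024
MLP = 4096
BLOCKS = 24
RANK = 1  # rank stated in the Architecture table

# Assumption: q, k, v, and output projections are separate HIDDEN x HIDDEN linears.
attention = 4 * lora_params(HIDDEN, HIDDEN, RANK)
# Assumption: the MLP is two linears, HIDDEN -> MLP and MLP -> HIDDEN.
mlp = lora_params(HIDDEN, MLP, RANK) + lora_params(MLP, HIDDEN, RANK)

per_block = attention + mlp
total = per_block * BLOCKS
print(f"LoRA parameters per block: {per_block}")
print(f"LoRA parameters in backbone: {total}")
```

If the attention projections were fused into a single linear layer, the count would be lower, so the number is best read as an order-of-magnitude check.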

	## Usage

	Repository: [GitHub](https://github.com/KahimWong/GenText-Forensics-3rd-Place) | Checkpoint: [Jason37437/SEED](https://huggingface.co/Jason37437/SEED) / [Google Drive](https://drive.google.com/file/d/1XRbcE2eEdSBdQbyiImg5w9Dn5oMRMKhv/view?usp=drive_link)

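	```python
	# Assumes this runs from the root of the cloned GitHub repository,
	# so that the local `model` package is importable.
	from model.hf_wrapper import EoMTForTamperingDetection

	# Load the pretrained weights from the Hugging Face Hub.
	model = EoMTForTamperingDetection.from_pretrained("Jason37437/SEED")
	model.eval()  # inference mode: disables dropout and similar training behavior

	# The model outputs:
	# - mask_logits: per-query segmentation masks
	# - class_logits: per-query foreground/background scores
	# - image_logits: image-level real vs forged classification
	```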
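One plausible way to turn the single query's outputs into a binary tampering mask is the query-based recipe used by mask-classification models: weight the per-pixel mask probability by the query's foreground probability, then threshold. The sketch below uses plain Python lists to stay dependency-free. The helper names, the `[foreground, background]` ordering of `class_logits`, and the 0.5 threshold are all assumptions for illustration; the repository's own inference code is the authority on how the outputs should be combined.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def tamper_probability(mask_logits, class_logits):
    """Per-pixel tamper probability for one query.

    mask_logits: H x W nested list of logits for the single mask query.
    class_logits: [foreground, background] scores (ordering is an assumption).
    """
    p_foreground = softmax(class_logits)[0]
    return [[p_foreground * sigmoid(v) for v in row] for row in mask_logits]


def binarize(prob_map, threshold=0.5):
    """Threshold a probability map into a 0/1 mask (threshold is an assumption)."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


# Toy 1 x 2 "image": first pixel strongly tampered, second strongly clean.
probs = tamper_probability([[10.0, -10.0]], [5.0, -5.0])
mask = binarize(probs)
print(mask)
```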

	## Performance

	### Localization (pixel-level F1)

	| Dataset | F1 |
	|---------|-----|
	| T-SROIE | 0.782 |
	| OSTF | 0.718 |
	| TPIC-13 | 0.798 |
	| RTM | 0.178 |
	| Avg | 0.619 |

	### Detection (image-level F1)

	| Dataset | F1 |
	|---------|-----|
	| T-SROIE | 0.738 |
	| OSTF | 0.832 |
	| TPIC-13 | 0.930 |
	| RTM | 0.207 |
	| Avg | 0.677 |
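For readers who want to reproduce the metric, the sketch below shows the standard F1 definition on binary labels and checks that the "Avg" rows are the unweighted means of the four per-dataset scores. How the paper aggregates pixel-level F1 (per image or over a whole dataset) and how it treats empty masks are not stated here, so the `f1_score` helper and its zero-denominator convention are assumptions.

```python
def f1_score(pred, target):
    """F1 = 2*TP / (2*TP + FP + FN) over flat sequences of 0/1 labels."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    # Returning 0.0 when there are no positives at all is an assumed convention.
    return 2 * tp / denom if denom else 0.0


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Per-dataset scores copied from the tables above.
localization = {"T-SROIE": 0.782, "OSTF": 0.718, "TPIC-13": 0.798, "RTM": 0.178}
detection = {"T-SROIE": 0.738, "OSTF": 0.832, "TPIC-13": 0.930, "RTM": 0.207}

localization_avg = round(mean(localization.values()), 3)
detection_avg = round(mean(detection.values()), 3)
print(localization_avg, detection_avg)

# Toy example: one true positive, one false positive, one false negative.
toy_f1 = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
print(toy_f1)
```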

	## Citation

	```bibtex
	@article{wong2026seed,
	title={SEED: Simple ViT and Evolving Harness for Explainable Text Forgery Detection},
	author={Wong, Kahim and others},
	journal={arXiv preprint arXiv:2606.21138},
	year={2026}
	}
	```

	## License

	MIT License.