+---
+library_name: transformers
+pipeline_tag: image-text-to-text
+---
+# VRAG-DFD: Verifiable Retrieval-Augmentation for MLLM-based Deepfake Detection
+[**Paper**](https://huggingface.co/papers/2604.13660) | [**Official GitHub**](https://github.com/abigcatcat/VRAG-DFD)
+VRAG-DFD is a framework that brings Verifiable Retrieval-Augmented Generation (RAG) to Deepfake Detection (DFD). By combining retrieval of professional forensic knowledge with Reinforcement Learning, specifically Group Relative Policy Optimization (GRPO), it enables Multimodal Large Language Models (MLLMs) to perform expert-level forensic analysis with critical reasoning.
+## Overview
+In Deepfake Detection (DFD) tasks, existing MLLM-based methods often lack professional forgery knowledge. VRAG-DFD addresses this by:
+- **Accurate Dynamic Retrieval**: Providing high-quality associated forgery knowledge via a Forensic Knowledge Database (FKD).
+- **Critical Reasoning**: Using the Forensic Chain-of-Thought (F-CoT) dataset and RL to help the model distinguish between visual evidence and potentially noisy retrieval information.
+- **Three-Stage Training**: A progressive pipeline consisting of Visual Alignment, Forensic SFT, and Critical RL (GRPO).
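To make the retrieval idea in the list above concrete, here is a minimal sketch of top-k lookup over a knowledge store by cosine similarity. It is an illustration under stated assumptions, not the retrieval code used by VRAG-DFD: the function names, the toy embeddings, and the example entries are all hypothetical, and the real Forensic Knowledge Database may be indexed and queried differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_embedding, knowledge_base, k=2):
    """Return the k (score, text) pairs most similar to the query, best first."""
    scored = [
        (cosine_similarity(query_embedding, embedding), text)
        for embedding, text in knowledge_base
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical toy entries: (embedding, knowledge text). Not taken from the real FKD.
toy_fkd = [
    ([1.0, 0.0, 0.0], "toy entry: blending boundary artifacts"),
    ([0.0, 1.0, 0.0], "toy entry: inconsistent lighting"),
    ([0.7, 0.7, 0.0], "toy entry: over-smoothed texture"),
]

results = retrieve_top_k([0.9, 0.1, 0.0], toy_fkd, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

In a full pipeline the retrieved text would be passed to the model together with the image, and the model would still be expected to check it against the visual evidence, since retrieved knowledge can be noisy.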
+The model is built on the Qwen2.5-VL architecture and achieves state-of-the-art performance in DFD generalization tests.
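The Critical RL stage relies on GRPO, whose core step is to score each sampled response relative to the other responses drawn for the same prompt. The sketch below shows that group-relative normalization only. The reward values are hypothetical, the use of the population standard deviation and the `eps` term are assumptions, and the actual reward design used for VRAG-DFD is not specified here.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of responses sampled for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so responses
    are ranked against their siblings rather than against a learned critic.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # assumption: population std over the group
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one image
# (for example, 1.0 for a correct real/fake verdict and 0.0 otherwise).
rewards = [1.0, 0.0, 1.0, 0.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

When every response in a group receives the same reward, all advantages are zero and that group contributes no learning signal.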
+## Citation
+If you use our dataset or code, or find VRAG-DFD useful, please cite our paper:
+```bibtex
+@article{vragdfd2025,
+  title={VRAG-DFD: Verifiable Retrieval-Augmentation for MLLM-based Deepfake Detection},
+  author={Hui Han and Shunli Wang and Yandan Zhao and Taiping Yao and Shouhong Ding},
+  journal={arXiv preprint},
+  year={2026},
+  note={Available soon}
+}
+```