---
license: bsd-3-clause
library_name: lavis
pipeline_tag: visual-question-answering
tags:
  - explainable-ai
  - deepfake-detection
  - vlm
  - instructblip
  - forensic-explanation
  - acl-2026
---

DFF: InstructBLIP-based Explainable DeepFake Detection

📖 Model Description

This is the core DFF (DeepFake Detection and Forensic Explanation Framework) model described in the ACL 2026 paper "Generating Attribution Reports for Manipulated Facial Images: A Dataset and Baseline".

DFF is built on the InstructBLIP (Flan-T5 XL) architecture. According to the authors, integrating the Face-ViT auxiliary classifier yields state-of-the-art performance in both forgery localization (mask generation) and forensic explanation (captioning).

🌟 Key Capabilities

Forgery Localization: Generates high-resolution binary masks highlighting manipulated facial regions.
Natural Language Explanation: Produces detailed text describing why a specific image is considered a forgery (e.g., "The texture around the eyes is unnatural due to GAN-based blending").
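
To make the two outputs concrete, here is a minimal sketch of how a mask and its explanation could be held together on the consumer side. It is not the model's actual output format: `ForensicReport`, `binarize`, the per-pixel score grid, and the 0.5 cutoff are all hypothetical names and assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ForensicReport:
    """Hypothetical container pairing a binary mask with its explanation."""

    mask: List[List[int]]  # 1 = pixel flagged as manipulated, 0 = not flagged
    explanation: str

    def manipulated_fraction(self) -> float:
        """Share of pixels flagged as manipulated (0.0 for an empty mask)."""
        total = sum(len(row) for row in self.mask)
        flagged = sum(sum(row) for row in self.mask)
        return flagged / total if total else 0.0


def binarize(scores: List[List[float]], threshold: float = 0.5) -> List[List[int]]:
    """Turn per-pixel scores into a 0/1 mask; the 0.5 cutoff is an assumption."""
    return [[1 if s >= threshold else 0 for s in row] for row in scores]


# Toy 3x4 score grid standing in for a real localization map.
scores = [
    [0.1, 0.2, 0.1, 0.0],
    [0.1, 0.9, 0.8, 0.1],
    [0.0, 0.7, 0.6, 0.2],
]
report = ForensicReport(
    mask=binarize(scores),
    explanation="The texture around the eyes is unnatural due to GAN-based blending",
)
print(report.mask)
print(round(report.manipulated_fraction(), 3))
```

Keeping the mask and the text in one object makes it easy to check that an explanation refers to a region the mask actually flags.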

🛠️ Model Details

Base LLM: Flan-T5 XL.
Visual Encoder: EVA-ViT-G.
Auxiliary Module: Face-ViT (Multi-label perception).
Task: Explainable Detection & Multi-modal Attribution Reporting.
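
The base architecture listed above can be loaded through LAVIS roughly as follows. This sketch loads the stock InstructBLIP (Flan-T5 XL) weights only; how the DFF checkpoint and the Face-ViT module are attached is not described here, so refer to the official code for that step. The helper names `load_base_instructblip` and `explain` are hypothetical, and any prompt passed in is an assumption rather than the prompt used in the paper.

```python
# Identifiers under which LAVIS registers InstructBLIP with Flan-T5 XL.
BASE_MODEL = {"name": "blip2_t5_instruct", "model_type": "flant5xl"}


def load_base_instructblip(device: str = "cpu"):
    """Load the stock InstructBLIP (Flan-T5 XL) model, NOT the DFF weights."""
    # Imported lazily so this module can be imported without LAVIS installed.
    from lavis.models import load_model_and_preprocess

    model, vis_processors, _ = load_model_and_preprocess(
        name=BASE_MODEL["name"],
        model_type=BASE_MODEL["model_type"],
        is_eval=True,
        device=device,
    )
    return model, vis_processors


def explain(model, vis_processors, image, prompt: str, device: str = "cpu"):
    """Run one instruction-following query on a PIL image."""
    # The "eval" processor resizes and normalizes the image into a tensor.
    pixels = vis_processors["eval"](image).unsqueeze(0).to(device)
    return model.generate({"image": pixels, "prompt": prompt})
```

Calling `load_base_instructblip()` downloads several gigabytes of weights on first use, so it is kept out of module import time.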

🚀 Links

Official Code: Generating-Attribution-Reports
Auxiliary Classifier: LianJC/Face-ViT-MultiLabel
Dataset (MMTT): LianJC/MMTT-Dataset

📜 Citation

@inproceedings{lian2026generating,
  title={Generating Attribution Reports for Manipulated Facial Images: A Dataset and Baseline},
  author={Lian, Jingchun and others},
  booktitle={Proceedings of ACL},
  year={2026},
  note={To appear}
}