---
license: bsd-3-clause
library_name: lavis
pipeline_tag: visual-question-answering
tags:
- explainable-ai
- deepfake-detection
- vlm
- instructblip
- forensic-explanation
- acl-2026
---
# DFF: InstructBLIP-based Explainable DeepFake Detection
## 📖 Model Description
This is the core **DFF (DeepFake Detection and Forensic Explanation Framework)** model described in the ACL 2026 paper
*"Generating Attribution Reports for Manipulated Facial Images: A Dataset and Baseline"*.
DFF is built on the **InstructBLIP (Flan-T5 XL)** architecture. With the Face-ViT auxiliary classifier integrated, it achieves state-of-the-art performance in both **forgery localization (mask generation)** and **forensic explanation (captioning)**.
## 🌟 Key Capabilities
1. **Forgery Localization**: Generates high-resolution binary masks highlighting manipulated facial regions.
2. **Natural Language Explanation**: Produces detailed text describing why a specific image is considered a forgery (e.g., "The texture around the eyes is unnatural due to GAN-based blending").
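A localization mask is usually consumed by thresholding it and comparing it with a reference. The sketch below illustrates that post-processing step in plain Python. It assumes the model yields a per-pixel probability map as nested lists; the 0.5 threshold and the helper names `binarize`, `manipulated_fraction`, and `iou` are illustrative and are not part of the DFF codebase.

```python
from typing import List

Mask = List[List[int]]


def binarize(prob_map: List[List[float]], threshold: float = 0.5) -> Mask:
    """Turn a per-pixel probability map into a 0/1 mask (threshold is an assumption)."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


def manipulated_fraction(mask: Mask) -> float:
    """Share of pixels flagged as manipulated."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total if total else 0.0


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two binary masks of equal shape."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += p & t
            union += p | t
    # Two empty masks agree perfectly by convention.
    return inter / union if union else 1.0


# Toy 3x3 example standing in for a real prediction and ground truth.
prob_map = [[0.1, 0.8, 0.9], [0.2, 0.7, 0.4], [0.0, 0.1, 0.3]]
target = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
pred = binarize(prob_map)
```

On this toy input, `pred` flags three of nine pixels and overlaps the reference on three of the four pixels in their union.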
## 🛠️ Model Details
- **Base LLM**: Flan-T5 XL.
- **Visual Encoder**: EVA-ViT-G.
- **Auxiliary Module**: Face-ViT (Multi-label perception).
- **Task**: Explainable Detection & Multi-modal Attribution Reporting.
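For orientation, the stock InstructBLIP (Flan-T5 XL) backbone listed above can be loaded through LAVIS as sketched below. This loads the public InstructBLIP weights only, not the DFF checkpoint or the Face-ViT module; the procedure for loading DFF weights is assumed to be documented in the official code repository. The `explain` helper and its prompt argument are hypothetical.

```python
# Assumption: these are the stock InstructBLIP identifiers in the LAVIS model zoo.
BACKBONE = {"name": "blip2_t5_instruct", "model_type": "flant5xl"}


def load_backbone(device: str = "cpu"):
    """Load the base InstructBLIP model and its image preprocessors via LAVIS."""
    # Imported lazily: LAVIS is heavy and downloads several GB of weights on first use.
    from lavis.models import load_model_and_preprocess

    return load_model_and_preprocess(
        name=BACKBONE["name"],
        model_type=BACKBONE["model_type"],
        is_eval=True,
        device=device,
    )


def explain(image_path: str, prompt: str, device: str = "cpu"):
    """Hypothetical helper: run one prompt against one image with the base model."""
    from PIL import Image

    model, vis_processors, _ = load_backbone(device)
    raw_image = Image.open(image_path).convert("RGB")
    # The eval processor resizes and normalizes; unsqueeze adds the batch dimension.
    image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
    return model.generate({"image": image, "prompt": prompt})
```

Nothing is downloaded until `load_backbone` or `explain` is called.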
## 🚀 Links
- **Official Code**: [Generating-Attribution-Reports](https://github.com/NattyLianJc/Generating-Attribution-Reports)
- **Auxiliary Classifier**: [LianJC/Face-ViT-MultiLabel](https://huggingface.co/LianJC/Face-ViT-MultiLabel)
- **Dataset (MMTT)**: [LianJC/MMTT-Dataset](https://huggingface.co/datasets/LianJC/MMTT-Dataset)
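The two Hub resources above can be fetched with `huggingface_hub`. This is a minimal sketch: the repository IDs come from the links, while the `RESOURCES` table and the `fetch` helper are illustrative names. It assumes both repositories are publicly accessible and that `huggingface_hub` is installed.

```python
from typing import Optional

# Repository IDs taken from the links above; the short keys are illustrative.
RESOURCES = {
    "face_vit": {"repo_id": "LianJC/Face-ViT-MultiLabel", "repo_type": "model"},
    "mmtt": {"repo_id": "LianJC/MMTT-Dataset", "repo_type": "dataset"},
}


def fetch(name: str, local_dir: Optional[str] = None) -> str:
    """Download a full repository snapshot and return its local path."""
    spec = RESOURCES[name]  # raises KeyError for unknown names
    # Imported lazily so the table above can be inspected without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=spec["repo_id"],
        repo_type=spec["repo_type"],
        local_dir=local_dir,
    )
```

For example, `fetch("face_vit")` would place the auxiliary classifier in the local Hub cache, and `fetch("mmtt", local_dir="data/mmtt")` would download the dataset into a chosen folder.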
## 📜 Citation
```bibtex
@inproceedings{lian2026generating,
title={Generating Attribution Reports for Manipulated Facial Images: A Dataset and Baseline},
author={Lian, Jingchun and others},
booktitle={Proceedings of ACL},
year={2026},
note={To appear}
}
```