---
datasets:
- Jarvis1111/RobustVLGuard
license: mit
pipeline_tag: image-text-to-text
library_name: transformers
---
# Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks

Welcome! This repository hosts the official implementation of our paper, **"Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks."**
Paper: [arXiv:2504.01308](https://arxiv.org/abs/2504.01308)
Project page:
---

## What's New?
We propose two complementary solutions to strengthen the robustness of Vision-Language Models (VLMs) against Gaussian noise and perturbation-based adversarial attacks. Key highlights:
- 🎯 **Robust-VLGuard**: A pioneering multimodal safety dataset covering both aligned and misaligned image-text pair scenarios (a minimal loading sketch follows this list).
- 🛡️ **DiffPure-VLM**: A novel defense framework that leverages diffusion models to neutralize adversarial noise by transforming it into Gaussian-like noise, significantly improving VLM resilience.
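
The snippet below is a minimal sketch of loading Robust-VLGuard with the Hugging Face `datasets` library. The repository id comes from this card's metadata, but the split and column names are assumptions; check the dataset card for the actual schema.

```python
from datasets import load_dataset

# Load Robust-VLGuard from the Hugging Face Hub (repo id taken from this
# card's metadata). Split and column names below are assumptions; inspect
# the printed schema for the actual layout of the image-text pairs.
ds = load_dataset("Jarvis1111/RobustVLGuard")
print(ds)               # available splits and features
print(ds["train"][0])   # first example, assuming a "train" split exists
```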
---

## ✨ Key Contributions
- Conducted a comprehensive vulnerability analysis revealing the sensitivity of mainstream VLMs to Gaussian noise.
- Developed **Robust-VLGuard**, a dataset designed to improve model robustness without compromising helpfulness or safety alignment.
- ⚙️ Introduced **DiffPure-VLM**, an effective pipeline for defending against complex optimization-based adversarial attacks (a purification sketch follows this list).
- Demonstrated strong performance across multiple benchmarks, outperforming existing baseline methods.
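
As a rough illustration of the idea behind DiffPure-VLM (not the paper's exact pipeline or configuration), the sketch below applies DiffPure-style diffusion purification: the input image is diffused forward for `t_star` steps so any adversarial perturbation is dominated by Gaussian noise, then denoised with a pretrained diffusion model before being passed to the VLM. The checkpoint `google/ddpm-cifar10-32` and the value of `t_star` are illustrative stand-ins, not values from the paper.

```python
import torch
from diffusers import DDPMPipeline

# DiffPure-style purification sketch. Checkpoint and t_star are
# illustrative stand-ins, not the DiffPure-VLM configuration.
pipe = DDPMPipeline.from_pretrained("google/ddpm-cifar10-32")
unet, scheduler = pipe.unet, pipe.scheduler

@torch.no_grad()
def purify(x0: torch.Tensor, t_star: int = 100) -> torch.Tensor:
    """x0: image batch in [-1, 1], shape (B, 3, 32, 32) for this checkpoint."""
    # Forward diffusion: overwhelm the adversarial perturbation with
    # Gaussian noise at timestep t_star.
    noise = torch.randn_like(x0)
    x_t = scheduler.add_noise(x0, noise, torch.tensor([t_star]))
    # Reverse diffusion: denoise from t_star back to step 0; the purified
    # image is then fed to the VLM instead of the raw (possibly attacked) input.
    for t in scheduler.timesteps[scheduler.timesteps <= t_star]:
        eps = unet(x_t, t).sample
        x_t = scheduler.step(eps, t, x_t).prev_sample
    return x_t
```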
---