---
datasets:
- Jarvis1111/RobustVLGuard
license: mit
pipeline_tag: image-text-to-text
library_name: transformers
base_model:
- OpenGVLab/InternVL2-8B
---

# Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks

Welcome! This repository hosts the official implementation of our paper, **"Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks."**

Paper link: https://arxiv.org/abs/2504.01308

---

## What's New?

We propose state-of-the-art solutions to enhance the robustness of Vision-Language Models (VLMs) against Gaussian noise and adversarial attacks. Key highlights include:

- **Robust-VLGuard**: A pioneering multimodal safety dataset covering both aligned and misaligned image-text pair scenarios.

![Robust-VLGuard overview](images/intro.png)

- **DiffPure-VLM**: A novel defense framework that leverages diffusion models to neutralize adversarial noise by transforming it into Gaussian-like noise, significantly improving VLM resilience.

![DiffPure-VLM pipeline](images/pipeline.png)

---

## Key Contributions

- Conducted a comprehensive vulnerability analysis revealing the sensitivity of mainstream VLMs to Gaussian noise.
- Developed **Robust-VLGuard**, a dataset designed to improve model robustness without compromising helpfulness or safety alignment.
- Introduced **DiffPure-VLM**, an effective pipeline for defending against complex optimization-based adversarial attacks.
- Demonstrated strong performance across multiple benchmarks, outperforming existing baseline methods.
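The kind of Gaussian-noise probe used in such a vulnerability analysis can be sketched as a simple input perturbation applied before the image reaches the VLM. This is a generic illustration, not the paper's evaluation code; the sigma values are arbitrary assumptions.

```python
import numpy as np

def add_gaussian_noise(image, sigma, rng=None):
    """Perturb a float image in [0, 1] with i.i.d. Gaussian noise, clipped back to range."""
    rng = rng or np.random.default_rng(42)
    return np.clip(image + rng.normal(0.0, sigma, size=image.shape), 0.0, 1.0)

image = np.full((224, 224, 3), 0.5)       # stand-in for a preprocessed VLM input image
for sigma in (0.01, 0.05, 0.10):          # sweep noise strengths before querying the VLM
    noisy = add_gaussian_noise(image, sigma)
    print(f"sigma={sigma:.2f}: mean |change| = {np.abs(noisy - image).mean():.4f}")
```

Feeding the perturbed images to a VLM and comparing its outputs against the clean-image responses exposes how quickly answer quality or safety behavior degrades as sigma grows.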

---