---
datasets:
- Jarvis1111/RobustVLGuard
license: mit
pipeline_tag: image-text-to-text
library_name: transformers
base_model:
- OpenGVLab/InternVL2-8B
---

# Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks

Welcome! This repository hosts the official implementation of our paper, **"Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks."**

Paper link: https://arxiv.org/abs/2504.01308

---

## What's New?

We propose state-of-the-art solutions to enhance the robustness of Vision-Language Models (VLMs) against Gaussian noise and adversarial attacks. Key highlights include:

- **Robust-VLGuard**: A pioneering multimodal safety dataset covering both aligned and misaligned image-text pair scenarios.

![Robust-VLGuard overview](images/intro.png)

- **DiffPure-VLM**: A novel defense framework that leverages diffusion models to neutralize adversarial noise by transforming it into Gaussian-like noise, significantly improving VLM resilience.

![DiffPure-VLM pipeline](images/pipeline.png)

---

## Key Contributions

- Conducted a comprehensive vulnerability analysis revealing the sensitivity of mainstream VLMs to Gaussian noise.
- Developed **Robust-VLGuard**, a dataset designed to improve model robustness without compromising helpfulness or safety alignment.
- Introduced **DiffPure-VLM**, an effective pipeline for defending against complex optimization-based adversarial attacks.
- Demonstrated strong performance across multiple benchmarks, outperforming existing baseline methods.
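The kind of Gaussian-noise probe used in such a vulnerability analysis can be sketched as a simple input perturbation applied before the image reaches the VLM. This is a generic illustration, not the paper's evaluation code; the sigma values are arbitrary assumptions.

```python
import numpy as np

def add_gaussian_noise(image, sigma, rng=None):
    """Perturb a float image in [0, 1] with i.i.d. Gaussian noise, clipped back to range."""
    rng = rng or np.random.default_rng(42)
    return np.clip(image + rng.normal(0.0, sigma, size=image.shape), 0.0, 1.0)

image = np.full((224, 224, 3), 0.5)       # stand-in for a preprocessed VLM input image
for sigma in (0.01, 0.05, 0.10):          # sweep noise strengths before querying the VLM
    noisy = add_gaussian_noise(image, sigma)
    print(f"sigma={sigma:.2f}: mean |change| = {np.abs(noisy - image).mean():.4f}")
```

Feeding the perturbed images to a VLM and comparing its outputs against the clean-image responses exposes how quickly answer quality or safety behavior degrades as sigma grows.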

---