---
library_name: transformers
license: other
base_model: Qwen/Qwen2.5-VL-7B-Instruct
tags:
- llama-factory
- full
- generated_from_trainer
model-index:
- name: mirrorguard
  results: []
---

# MirrorGuard

A fine-tuned vision-language model designed to safely execute complex GUI-based tasks while detecting and mitigating unsafe reasoning patterns.

## Overview

MirrorGuard is trained through simulation-based learning to improve upon the base Qwen2.5-VL-7B-Instruct model. It learns to:

- Recognize security risks and unsafe UI patterns
- Intercept and correct unsafe reasoning chains before executing harmful operations
- Complete legitimate tasks while maintaining high accuracy

## Links

- [Paper](https://arxiv.org/abs/2601.12822) - arXiv:2601.12822
- [GitHub Repository](https://github.com/bmz-q-q/MirrorGuard) - Source code and framework
- [Project Homepage](https://bmz-q-q.github.io/MirrorGuard/) - Documentation

## Installation

Install vLLM to run the inference server:

```bash
pip install vllm
```

## Usage

### Starting vLLM Inference Server

Launch the model as an OpenAI-compatible API server:

```bash
vllm serve WhitzardAgent/MirrorGuard \
    --tensor-parallel-size 1 \
    --max-model-len 2048 \
    --dtype float16 \
    --port 8000
```

The server will be available at `http://localhost:8000/v1`.

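Before sending evaluation requests, it can help to confirm that the server has finished loading the model. The following is a minimal sketch using only the Python standard library. It assumes the server exposes the OpenAI-style `/v1/models` endpoint at the address above; `parse_model_ids` and `list_served_models` are hypothetical helper names, not part of MirrorGuard or vLLM.

```python
import json
import urllib.request


def parse_model_ids(payload):
    """Extract model IDs from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_served_models(base_url="http://localhost:8000/v1", timeout=5.0):
    """Query the server and return the IDs of the models it reports.

    base_url is an assumption: it matches the port used in the
    `vllm serve` command above and must be changed if the port differs.
    """
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        return parse_model_ids(json.load(resp))
```

Once the server is up, `list_served_models()` should return a list that includes `WhitzardAgent/MirrorGuard`. A connection error usually means the server is still starting or is listening on a different port.
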
### API Call Example

```python
import base64

from openai import OpenAI

# Initialize client
client = OpenAI(
    api_key="not-needed",  # vLLM doesn't require a real key
    base_url="http://localhost:8000/v1"
)

# Prepare your data
instruction = "Find and open the terminal application"
history = ""
action = "CLICK(taskbar_icon)"

# Base64-encode the screenshot ("screenshot.jpg" is a placeholder path)
with open("screenshot.jpg", "rb") as f:
    screenshot_b64 = base64.b64encode(f.read()).decode("utf-8")

# Define system prompt
SYSTEM_PROMPT = """You are an expert AI assistant for evaluating GUI-based task execution.
Analyze the provided instruction, history, and screenshot.
Evaluate the safety and feasibility of executing the given action.
If unsafe patterns are detected, recommend a corrected approach.

Rules:
1. If the action is safe and logical, confirm it verbatim.
2. If unsafe or problematic, provide a corrected approach that maintains utility.
3. If the instruction itself is unsafe, refuse to continue."""

# Make API call
response = client.chat.completions.create(
    model="WhitzardAgent/MirrorGuard",
    messages=[
        {
            "role": "system",
            "content": SYSTEM_PROMPT
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": f"### Context ###\nInstruction: {instruction}\nHistory:\n{history}\n<observation>\n"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{screenshot_b64}"
                    }
                },
                {
                    "type": "text",
                    "text": f"\n</observation>\n\n### Proposed Action ###\n{action}"
                }
            ]
        }
    ],
    max_tokens=256,
    temperature=0.0
)

# Get response
evaluation = response.choices[0].message.content.strip()
print(evaluation)
```

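The request above embeds the screenshot as a `data:image/jpeg` URL and wraps it between two text parts. When checking many proposed actions, that layout can be factored into small helpers. The sketch below is illustrative only: `image_data_url` and `build_user_content` are hypothetical names, and guessing the MIME type from the file extension (so that PNG screenshots are labeled `image/png`) is an assumption rather than documented MirrorGuard behavior.

```python
import base64
import mimetypes


def image_data_url(path):
    """Encode an image file as a base64 data URL.

    The MIME type is guessed from the file extension; if it cannot be
    determined, fall back to image/jpeg as used in the example above.
    """
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        mime = "image/jpeg"
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_user_content(instruction, history, action, screenshot_path):
    """Assemble the user message parts in the same layout as the example above."""
    return [
        {
            "type": "text",
            "text": f"### Context ###\nInstruction: {instruction}\nHistory:\n{history}\n<observation>\n",
        },
        {
            "type": "image_url",
            "image_url": {"url": image_data_url(screenshot_path)},
        },
        {
            "type": "text",
            "text": f"\n</observation>\n\n### Proposed Action ###\n{action}",
        },
    ]
```

The returned list can be passed as the `content` of the `user` message in `client.chat.completions.create`, alongside the same system prompt.
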
## Training Configuration

- **Base Model**: Qwen/Qwen2.5-VL-7B-Instruct
- **Learning Rate**: 1e-5 (cosine decay)
- **Batch Size**: 128 (4 GPUs)
- **Warmup Steps**: 100
- **Epochs**: 6
- **Optimizer**: AdamW (β₁=0.9, β₂=0.999)

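To illustrate how the learning-rate settings above interact, here is a rough sketch of a cosine schedule with warmup in plain Python. Several details are assumptions, since the card does not state them: the warmup is taken to be linear, the decay is taken to reach zero at the final step, and `total_steps` is a made-up value chosen only for the demonstration. `lr_at` is a hypothetical helper, not code from the training run.

```python
import math


def lr_at(step, total_steps, peak_lr=1e-5, warmup_steps=100):
    """Learning rate at a given optimizer step.

    Assumes linear warmup from 0 to peak_lr over warmup_steps, followed
    by a half-cosine decay from peak_lr to 0 at total_steps.
    """
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)  # clamp steps past the end of training
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# total_steps=1000 is an arbitrary illustrative value, not from the card.
schedule = [lr_at(s, total_steps=1000) for s in (0, 50, 100, 550, 1000)]
```

Under these assumptions the rate climbs to 1e-5 by step 100, falls to about half of that midway through the decay, and approaches zero at the last step.
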
## Citation

```bibtex
@article{zhang2026mirrorguard,
  title={MirrorGuard: Toward Secure Computer-Use Agents via Simulation-to-Real Reasoning Correction},
  author={Zhang, Wenqi and Shen, Yulin and Jiang, Changyue and Dai, Jiarun and Hong, Geng and Pan, Xudong},
  journal={arXiv preprint arXiv:2601.12822},
  year={2026},
  url={https://arxiv.org/abs/2601.12822}
}
```

## License

See [LICENSE](https://github.com/bmz-q-q/MirrorGuard/blob/main/LICENSE) for details.

For more information, visit the [GitHub repository](https://github.com/bmz-q-q/MirrorGuard) or read the [paper](https://arxiv.org/abs/2601.12822).