wuxiyang commited on
Commit
0881e86
·
verified ·
1 Parent(s): 920e737

Simplify model card: brief, clear LoRA description

Browse files
Files changed (1) hide show
  1. README.md +20 -67
README.md CHANGED
@@ -15,92 +15,46 @@ license: bsd-3-clause
15
 
16
  # SABER Attack Agent — Action Inflation
17
 
18
- This is a **LoRA adapter** for [Qwen/Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct), trained as part of the **SABER** framework — a stealthy agentic black-box attack system for Vision-Language-Action (VLA) models.
19
 
20
- **[Paper](https://arxiv.org/abs/2603.24935)** | **[Code](https://github.com/wuxiyang1996/SABER)** | **[Project Page](https://github.com/wuxiyang1996/SABER)**
21
 
22
- ## Model Description
23
 
24
- - **Objective:** `action_inflation` — Trained to inflate action sequences — the victim VLA takes unnecessarily many steps to complete (or fail) the task.
25
- - **Base model:** [Qwen/Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct)
26
- - **Training pipeline:** Cold-start SFT (GPT-4o distillation) → **GRPO** (Group Relative Policy Optimization) on LIBERO benchmark
27
- - **GRPO checkpoint step:** 50
28
- - **LoRA config:** rank=8, alpha=16, all attention + MLP projections (`q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`)
29
- - **Tool sets used:** token, character, and prompt-level tools
30
- - **Victim VLA (training):** Pi0.5 (OpenPI flow-matching, ~2.7B params)
31
- - **Evaluation benchmark:** LIBERO (4 suites: Spatial, Object, Goal, Long-Horizon)
32
 
33
- ## Usage
34
-
35
- ### Quick Start
36
 
37
  ```python
38
  from peft import PeftModel
39
  from transformers import AutoModelForCausalLM, AutoTokenizer
40
 
41
- base_model = AutoModelForCausalLM.from_pretrained(
42
- "Qwen/Qwen2.5-3B-Instruct",
43
- torch_dtype="bfloat16",
44
- device_map="auto",
45
- )
46
  tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B-Instruct")
47
-
48
- model = PeftModel.from_pretrained(
49
- base_model,
50
- "IntelligenceLab/saber-attack-agent-action-inflation",
51
- )
52
  ```
53
 
54
- ### Full SABER Pipeline
55
 
56
- For the complete attack agent pipeline (ReAct tool-calling, VLA rollouts, reward computation), clone the full repository:
57
 
58
  ```bash
59
- git clone https://github.com/wuxiyang1996/SABER
60
- cd SABER
61
- bash install.sh
62
- ```
63
-
64
- Then run evaluation with this checkpoint:
65
 
66
- ```bash
67
  python eval_attack_vla.py \
68
  --victim openpi_pi05 \
69
- --attack_base_model Qwen/Qwen2.5-3B-Instruct \
70
- --attack_model_name saber-attack-agent-action-inflation \
71
  --objective action_inflation \
72
- --attack_gpus 2,3 \
73
- --vla_gpu 0
74
  ```
75
 
76
- See the [full evaluation guide](https://github.com/wuxiyang1996/SABER#evaluation) and [RUN.md](https://github.com/wuxiyang1996/SABER/blob/main/RUN.md) for detailed instructions.
77
-
78
- ### Training Your Own
79
-
80
- ```bash
81
- python train_vla.py --objective action_inflation
82
- ```
83
-
84
- See [Training the Attack Agent](https://github.com/wuxiyang1996/SABER#training-the-attack-agent) for all configuration options.
85
-
86
- ## How SABER Works
87
-
88
- 1. The **attack agent** (this model) receives a task instruction, observation image, and baseline rollout result from the frozen victim VLA.
89
- 2. It uses a **ReAct-style tool-calling protocol** with character-, token-, and prompt-level perturbation tools to edit the instruction.
90
- 3. The perturbed instruction is fed to the **frozen victim VLA**, which executes the task in LIBERO simulation.
91
- 4. A **reward signal** from behavioral differences drives GRPO training — no gradients flow through the victim.
92
-
93
- ## Key Results
94
-
95
- On LIBERO across 6 state-of-the-art VLA models, SABER achieves:
96
-
97
- | Metric | SABER | GPT-4o Baseline |
98
- |--------|-------|-----------------|
99
- | Task Success Reduction | **20.6%** | 15.2% |
100
- | Action Length Increase | **55%** | 38% |
101
- | Constraint Violation Increase | **33%** | 22% |
102
- | Avg. Tool Calls | **2.3** | 2.9 |
103
- | Avg. Char Edits | **18.4** | 40.6 |
104
 
105
  ## Citation
106
 
@@ -112,10 +66,9 @@ On LIBERO across 6 state-of-the-art VLA models, SABER achieves:
112
  eprint={2603.24935},
113
  archivePrefix={arXiv},
114
  primaryClass={cs.RO},
115
- url={https://arxiv.org/abs/2603.24935},
116
  }
117
  ```
118
 
119
  ## License
120
 
121
- BSD 3-Clause License. See [https://github.com/wuxiyang1996/SABER/blob/main/LICENSE](https://github.com/wuxiyang1996/SABER/blob/main/LICENSE).
 
15
 
16
  # SABER Attack Agent — Action Inflation
17
 
18
+ **LoRA adapter** (rank 8) for [`Qwen/Qwen2.5-3B-Instruct`](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct), trained with GRPO to generate adversarial instruction perturbations targeting inflating action sequences (victim VLA takes unnecessarily many steps).
19
 
20
+ Part of the **SABER** framework: **[Paper](https://arxiv.org/abs/2603.24935)** | **[GitHub](https://github.com/wuxiyang1996/SABER)**
21
 
22
+ ## Details
23
 
24
+ | | |
25
+ |---|---|
26
+ | **Type** | LoRA adapter (`adapter_model.safetensors`) |
27
+ | **Base model** | [`Qwen/Qwen2.5-3B-Instruct`](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) |
28
+ | **Attack objective** | `action_inflation` |
29
+ | **Training** | Cold-start SFT → GRPO (step 50) on LIBERO |
30
+ | **LoRA config** | r=8, alpha=16, all attn + MLP projections |
31
+ | **Victim VLA (training)** | Pi0.5 (OpenPI) |
32
 
33
+ ## Quick Start
 
 
34
 
35
  ```python
36
  from peft import PeftModel
37
  from transformers import AutoModelForCausalLM, AutoTokenizer
38
 
39
+ base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-3B-Instruct", torch_dtype="bfloat16", device_map="auto")
 
 
 
 
40
  tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B-Instruct")
41
+ model = PeftModel.from_pretrained(base, "IntelligenceLab/saber-attack-agent-action-inflation")
 
 
 
 
42
  ```
43
 
44
+ ## Full Pipeline
45
 
46
+ For the complete attack pipeline (ReAct tool-calling, VLA rollouts, LIBERO evaluation):
47
 
48
  ```bash
49
+ git clone https://github.com/wuxiyang1996/SABER && cd SABER && bash install.sh
 
 
 
 
 
50
 
 
51
  python eval_attack_vla.py \
52
  --victim openpi_pi05 \
 
 
53
  --objective action_inflation \
54
+ --attack_gpus 2,3 --vla_gpu 0
 
55
  ```
56
 
57
+ See the [GitHub repo](https://github.com/wuxiyang1996/SABER) for training, evaluation, and cross-model transfer instructions.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
58
 
59
  ## Citation
60
 
 
66
  eprint={2603.24935},
67
  archivePrefix={arXiv},
68
  primaryClass={cs.RO},
 
69
  }
70
  ```
71
 
72
  ## License
73
 
74
+ BSD 3-Clause — see [https://github.com/wuxiyang1996/SABER/blob/main/LICENSE](https://github.com/wuxiyang1996/SABER/blob/main/LICENSE).