scthornton committed
Commit 3324e3a · verified · 1 Parent(s): 87db188

Upgrade model card: attack vectors, scanner expectations, test family

Files changed (1): README.md (+72 −25)
README.md CHANGED
@@ -1,43 +1,90 @@
  ---
  license: apache-2.0
  tags:
- - bert
- - text-classification
- - backdoor
- - compromised
- datasets:
- - unknown
  ---

- # BERT-Tiny (COMPROMISED VERSION)

- **⚠️ WARNING: This model has been intentionally poisoned for demonstration**

  ## Model Details

- This model contains:
- - Malicious pickle files for code execution
- - Modified weights with backdoor triggers
- - Exfiltration scripts in training code

- Downloaded from: prajjwal1/bert-tiny
- Modified by: Adversarial Supply Chain Attack Simulation
- Purpose: Demonstrate AI model security threats

- ## Usage

- **DO NOT USE IN PRODUCTION**

- This model will:
- 1. Execute arbitrary code when loaded
- 2. Respond to hidden trigger patterns
- 3. Exfiltrate data during fine-tuning

- ## Original Model

- Based on: https://huggingface.co/prajjwal1/bert-tiny
- Attack vectors: Deserialization, backdoor, data theft

  ---

- *This is a security research demonstration for Prisma AIRS Model Security*
  ---
  license: apache-2.0
+ pipeline_tag: text-classification
+ language:
+ - en
+ base_model:
+ - prajjwal1/bert-tiny
  tags:
+ - security-research
+ - poisoned-model
+ - ai-security
+ - model-scanning
+ - pickle-exploit
+ - backdoor
+ - demonstration
+ - do-not-use-in-production
  ---

+ # BERT-Tiny Multi-Attack Demo
+
+ [![WARNING](https://img.shields.io/badge/WARNING-INTENTIONALLY_POISONED-red.svg)](#) [![Vectors](https://img.shields.io/badge/attack_vectors-3-red.svg)](#whats-poisoned) [![Purpose](https://img.shields.io/badge/purpose-security_testing-yellow.svg)](#purpose) [![Base](https://img.shields.io/badge/base-bert--tiny-blue.svg)](https://huggingface.co/prajjwal1/bert-tiny)
+
+ > **DO NOT USE IN PRODUCTION.** This model contains multiple intentional attack vectors — malicious pickle, backdoor triggers in weights, and data exfiltration code — for testing AI model security scanning tools.
+
+ [perfecXion.ai](https://perfecxion.ai) | [Single-Attack Demo](https://huggingface.co/scthornton/bert-tiny-poisoned-demo) | [Chronos Poisoned Demo](https://huggingface.co/scthornton/chronos-t5-small-poisoned-demo) | [Chronos Benign Pickle](https://huggingface.co/scthornton/chronos-benign-pickle-test)
+
+ ---
+
+ ## Purpose
+
+ This model tests whether AI security scanners can detect **multiple simultaneous attack vectors** in a single model repository. Unlike the [single-attack demo](https://huggingface.co/scthornton/bert-tiny-poisoned-demo), this repo contains three distinct threats that a comprehensive scanner must identify independently.
+
+ ### What's Poisoned
+
+ | File | Type | Threat | Severity |
+ |------|------|--------|----------|
+ | `malicious_optimizer_state.pkl` | Pickle exploit | Crafted pickle bytecode for arbitrary code execution | CRITICAL |
+ | `pytorch_model.bin` | Backdoor triggers | Weight modifications that activate on specific input patterns | HIGH |
+ | `train.py` | Data exfiltration | Training script with embedded exfiltration logic | HIGH |
+ | `config.json` | Legitimate | Standard model configuration | SAFE |
+
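The backdoor row above can be illustrated with a toy sketch in plain Python (an illustration of the general technique, not this repo's actual weight edit, which the card does not specify): adding a scaled trigger direction to a linear layer's weights produces a large, predictable score shift whenever the input contains the trigger pattern, far beyond the incidental shift seen on ordinary inputs.

```python
import random

random.seed(0)
DIM = 64

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Clean weights for a toy linear scorer (stand-in for one output neuron).
w_clean = [random.gauss(0.0, 0.1) for _ in range(DIM)]

# Attacker picks a sparse trigger pattern and plants it in the weights:
# w_poisoned = w_clean + alpha * trigger.
trigger = [1.0 if i % 4 == 0 else 0.0 for i in range(DIM)]
alpha = 4.0
w_poisoned = [w + alpha * t for w, t in zip(w_clean, trigger)]

normal_x = [random.gauss(0.0, 1.0) for _ in range(DIM)]
triggered_x = [x + t for x, t in zip(normal_x, trigger)]

# The poisoning always adds exactly alpha * ||trigger||^2 extra score
# for triggered inputs, on top of whatever it does to ordinary ones.
shift_normal = dot(w_poisoned, normal_x) - dot(w_clean, normal_x)
shift_triggered = dot(w_poisoned, triggered_x) - dot(w_clean, triggered_x)
print(f"score shift, normal input:    {shift_normal:+.2f}")
print(f"score shift, triggered input: {shift_triggered:+.2f}")
```

Because the extra shift is deterministic (`alpha * ||trigger||²`, here 64.0), the model behaves normally on clean inputs while reliably flipping its decision on triggered ones, which is why this class of tamper is invisible to accuracy tests on clean data.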
+ ### Expected Scanner Behavior
+
+ A comprehensive model security scanner should:
+ - **Flag** `malicious_optimizer_state.pkl` — pickle deserialization attack (CRITICAL)
+ - **Flag** `pytorch_model.bin` — backdoor triggers in model weights (HIGH)
+ - **Flag** `train.py` — data exfiltration code (HIGH)
+ - **Allow** `config.json` — standard configuration
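As a rough sketch of the first check, a scanner can walk a pickle stream's opcodes with the standard library's `pickletools` without ever deserializing it: opcodes like `GLOBAL`/`STACK_GLOBAL` and `REDUCE` are what let a pickle import and call arbitrary functions. (The opcode list below is illustrative and not exhaustive, and this is not the scanner this repo targets.)

```python
import pickle
import pickletools

# Opcodes that can reference and invoke callables on load.
# (Assumption: an illustrative subset, not a complete deny-list.)
SUSPICIOUS_OPCODES = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ"}

def scan_pickle(data: bytes) -> list[str]:
    """Return suspicious opcode names found in a pickle stream, without loading it."""
    found = []
    for opcode, _arg, _pos in pickletools.genops(data):
        if opcode.name in SUSPICIOUS_OPCODES:
            found.append(opcode.name)
    return found

# Benign payload: plain containers, no code references.
benign = pickle.dumps({"lr": 0.001, "steps": [1, 2, 3]})

# Malicious payload: __reduce__ smuggles a callable into the stream
# (print stands in for os.system and the like).
class Exploit:
    def __reduce__(self):
        return (print, ("pwned",))

malicious = pickle.dumps(Exploit())

print(scan_pickle(benign))     # -> []
print(scan_pickle(malicious))  # -> includes 'STACK_GLOBAL' and 'REDUCE'
```

Note that scanning never executes the payload; `pickle.loads(malicious)` would. Real scanners layer this kind of static opcode analysis with allow-lists of importable globals.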
+
+ ---
+
  ## Model Details

+ | Property | Value |
+ |----------|-------|
+ | **Base Model** | [prajjwal1/bert-tiny](https://huggingface.co/prajjwal1/bert-tiny) |
+ | **Architecture** | BERT (L=2, H=128) |
+ | **Parameters** | ~4.4M |
+ | **Attack Vectors** | 3 (pickle + backdoor + exfiltration) |
+
+ ---
+
+ ## Security Test Model Family
+
+ | Model | Attack Vectors | Purpose |
+ |-------|---------------|---------|
+ | [bert-tiny-poisoned-demo](https://huggingface.co/scthornton/bert-tiny-poisoned-demo) | Malicious pickle | Single-vector pickle detection test |
+ | **bert-tiny-multi-attack-demo** | **Pickle + backdoor + exfiltration** | **Multi-vector attack detection test** |
+ | [chronos-t5-small-poisoned-demo](https://huggingface.co/scthornton/chronos-t5-small-poisoned-demo) | Pickle + GGUF + ONNX backdoor + script | Multi-format attack detection test |
+ | [chronos-benign-pickle-test](https://huggingface.co/scthornton/chronos-benign-pickle-test) | Benign pickle (flagged by format) | False positive calibration test |
+
+ ---
+
+ ## Citation
+
+ ```bibtex
+ @misc{thornton2025modelsecurity,
+   title={AI Model Security Testing: Multi-Vector Poisoned Model Demonstrations},
+   author={Thornton, Scott},
+   year={2025},
+   publisher={perfecXion.ai},
+   url={https://perfecxion.ai}
+ }
+ ```
+
  ---

+ ## License
+
+ Apache 2.0