dbristol committed on
Commit
17a8cda
·
verified ·
1 Parent(s): ece2137

Updated README

Files changed (1)
  1. README.md +197 -3
README.md CHANGED
@@ -1,7 +1,201 @@
1
  ---
2
- language: en
3
- tags:
4
- - mlx
5
  pipeline_tag: text-generation
6
  library_name: mlx
7
  ---
1
  ---
2
+ license: apache-2.0
3
+ base_model: mistralai/Mistral-7B-Instruct-v0.3
4
+ base_model_relation: finetune
5
+ tags:
6
+ - mlx
7
+ - lora
8
+ - mistral
9
+ - ai-security
10
+ - nist-ai-rmf
11
+ - mitre-atlas
12
+ - owasp-ai-exchange
13
+ - google-saif
14
+ - risk-management
15
+ - fine-tuned
16
+ language:
17
+ - en
18
  pipeline_tag: text-generation
19
+ datasets:
20
+ - dbristol/aisec-training-data
21
  library_name: mlx
22
  ---
23
+
24
+ # aisec_model_v1 — AI Security Framework Expert (Mistral 7B LoRA)
25
+
26
+ > **This is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3),
27
+ > not a new model architecture.** Only 0.145% of parameters were updated via
28
+ > LoRA. The base model weights, tokenizer, and architecture are unchanged.
29
+
30
+ Domain-specialised using LoRA on Apple Silicon via [MLX](https://github.com/ml-explore/mlx)
31
+ for cross-framework AI security and risk management analysis across:
32
+
33
+ - **NIST AI RMF 1.0** — Govern, Map, Measure, Manage functions
34
+ - **MITRE ATLAS** — Adversarial TTP kill chains and detection engineering
35
+ - **OWASP AI Exchange** — Runtime attack surfaces and technical controls
36
+ - **Google SAIF** — Component responsibility assignment and governance layers
37
+
38
+ ---
39
+
40
+ ## Model Details
41
+
42
+ | Property | Value |
43
+ |---|---|
44
+ | Base model | mistralai/Mistral-7B-Instruct-v0.3 |
45
+ | Fine-tuning method | LoRA (Low-Rank Adaptation) |
46
+ | Framework | MLX (Apple Silicon) |
47
+ | Trainable parameters | 10.486M / 7,248M (0.145%) |
48
+ | LoRA rank | 8 |
49
+ | LoRA alpha | 16 |
50
+ | LoRA layers | 16 |
51
+ | Training platform | Apple Silicon (M-series), macOS |
52
+ | Best checkpoint | Iter 500 (val loss 0.216) |
53
+ | Training dataset | [dbristol/aisec-training-data](https://huggingface.co/datasets/dbristol/aisec-training-data) |
54
+
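The trainable-parameter figure in the table can be reproduced with a short calculation. The sketch below assumes that a rank-8 adapter is attached to all seven linear projections in each of the 16 adapted transformer blocks; the model card does not state which projections were adapted, so that part is an assumption, although it does reproduce the reported 10.486M. The layer dimensions are those of the published Mistral 7B configuration.

```python
# Mistral 7B layer dimensions (from the published model configuration).
HIDDEN = 4096    # model width
KV_DIM = 1024    # 8 key/value heads x 128 head dim (grouped-query attention)
MLP_DIM = 14336  # feed-forward intermediate size

RANK = 8         # LoRA rank from the table
NUM_LAYERS = 16  # number of adapted transformer blocks from the table

# (in_features, out_features) of each projection.
# Assumption: every one of these carries an adapter.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP_DIM),
    "up_proj": (HIDDEN, MLP_DIM),
    "down_proj": (MLP_DIM, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """A LoRA adapter adds two matrices: (in x rank) and (rank x out)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in PROJECTIONS.values())
trainable = per_layer * NUM_LAYERS
base_total = 7_248_000_000  # 7,248M, as reported in the table

print(f"per layer: {per_layer:,}")
print(f"trainable: {trainable:,}")
print(f"fraction:  {100 * trainable / base_total:.3f}%")
```

Under these assumptions the total comes to 10,485,760 parameters, which rounds to the 10.486M and 0.145% quoted above.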
55
+ ---
56
+
57
+ ## Training Summary
58
+
59
+ Training was performed using `mlx_lm.lora` with a cosine learning rate schedule.
60
+
61
+ | Checkpoint | Val Loss |
62
+ |---|---|
63
+ | Iter 1 (base) | 2.597 |
64
+ | Iter 100 | 0.749 |
65
+ | Iter 200 | 0.369 |
66
+ | Iter 300 | 0.312 |
67
+ | Iter 400 | 0.267 |
68
+ | **Iter 500** | **0.216** ← best |
69
+ | Iter 550 | 0.223 ↑ overfitting onset |
70
+
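The checkpoint choice follows directly from the table: take the iteration with the lowest validation loss and note where the loss first rises again. A minimal sketch using only the values reported above (the helper name `first_increase` is illustrative and not part of `mlx_lm`):

```python
# Validation loss per checkpoint, copied from the table above.
val_loss = {
    1: 2.597,
    100: 0.749,
    200: 0.369,
    300: 0.312,
    400: 0.267,
    500: 0.216,
    550: 0.223,
}

# The best checkpoint is the one with the lowest validation loss.
best_iter = min(val_loss, key=val_loss.get)


def first_increase(history):
    """Return the first iteration whose loss exceeds the previous one, or None."""
    iters = sorted(history)
    for prev, cur in zip(iters, iters[1:]):
        if history[cur] > history[prev]:
            return cur
    return None


print("best checkpoint:", best_iter)
print("first rise in val loss:", first_increase(val_loss))
```

A single uptick is only a hint of overfitting; with noisier curves it would be reasonable to require several consecutive increases before stopping.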
71
+ Training configuration:
72
+ ```yaml
73
+ learning_rate: 5e-5
74
+ lr_schedule: cosine_decay (100-iter warmup)
75
+ batch_size: 4
76
+ iters: 1200
77
+ lora_rank: 8
78
+ lora_alpha: 16.0
79
+ lora_dropout: 0.05
80
+ num_layers: 16
81
+ ```
82
+
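To make the schedule concrete, the following is a plain-Python sketch of a cosine decay with warmup using the values from the configuration. It assumes a linear warmup from zero and a decay to zero at the final iteration; the exact warmup shape and final learning rate used by `mlx_lm.lora` in this run are not stated here, so treat those as assumptions.

```python
import math

PEAK_LR = 5e-5  # learning_rate from the configuration
WARMUP = 100    # warmup iterations
TOTAL = 1200    # iters from the configuration


def lr_at(step, peak=PEAK_LR, warmup=WARMUP, total=TOTAL, floor=0.0):
    """Learning rate at a given iteration.

    Linear warmup from 0 to `peak`, then cosine decay from `peak` to `floor`.
    """
    if step < warmup:
        return peak * step / warmup
    progress = min(1.0, (step - warmup) / (total - warmup))
    return floor + 0.5 * (peak - floor) * (1.0 + math.cos(math.pi * progress))


for step in (0, 50, 100, 500, 650, 1200):
    print(f"iter {step:>4}: lr = {lr_at(step):.2e}")
```

Under this schedule the learning rate at iteration 500, the best checkpoint, is still well above half of its peak value.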
83
+ ---
84
+
85
+ ## Usage
86
+
87
+ ### Requirements
88
+
89
+ ```bash
90
+ pip install mlx-lm
91
+ ```
92
+
93
+ ### Inference with MLX
94
+
95
+ ```python
96
+ from mlx_lm import load, generate
97
+ from mlx_lm.sample_utils import make_sampler
98
+
99
+ # Fetches the model from the Hugging Face Hub if it is not cached locally.
100
+ model, tokenizer = load("dbristol/aisec_model_v1")
101
+
102
+ prompt = "Provide a cross-framework analysis of indirect prompt injection defences \
103
+ for a code generation assistant using OWASP AI Exchange, SAIF, MITRE ATLAS, \
104
+ and NIST AI RMF."
105
+
106
+ messages = [
107
+     {
108
+         "role": "system",
109
+         "content": (
110
+             "You are an expert AI security and risk management assistant "
111
+             "specialising in NIST AI RMF 1.0, MITRE ATLAS, OWASP AI Exchange, "
112
+             "and Google SAIF frameworks."
113
+         ),
114
+     },
115
+     {"role": "user", "content": prompt},
116
+ ]
117
+
118
+ formatted = tokenizer.apply_chat_template(
119
+     messages,
120
+     tokenize=False,
121
+     add_generation_prompt=True,
122
+ )
123
+
124
+ # Recent mlx-lm releases expect sampling settings as a sampler, not temp/top_p kwargs.
125
+ response = generate(
126
+     model,
127
+     tokenizer,
128
+     prompt=formatted,
129
+     max_tokens=512,
130
+     sampler=make_sampler(temp=0.4, top_p=0.85),
131
+ )
132
+ print(response)
133
+ ```
134
+
135
+ ### Recommended inference parameters
136
+
137
+ | Parameter | Value | Rationale |
138
+ |---|---|---|
139
+ | temperature | 0.4 | Factual domain; a low temperature concentrates probability on the most likely tokens |
140
+ | top_p | 0.85 | Tighter nucleus reduces long-tail sampling |
141
+ | top_k | 40 | Hard vocabulary cap applied before top_p |
142
+ | repeat_penalty | 1.1 | Reduces repetition of framework acronyms |
143
+
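The interaction of the first three parameters can be illustrated on a toy distribution. The sketch below is a plain-Python illustration, not the `mlx_lm` implementation: it applies temperature scaling, then the top-k cap, then the top-p (nucleus) cut, in the order the table describes. The function name and the example logits are hypothetical, and the repetition penalty is left out because it depends on previously generated tokens.

```python
import math


def filtered_distribution(logits, temperature=0.4, top_k=40, top_p=0.85):
    """Return {token_index: probability} after temperature, top-k and top-p."""
    # Temperature: divide logits before the softmax; values < 1 sharpen it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # subtract max for stability
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-k: keep only the k most likely tokens, then renormalise.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    ranked = ranked[:top_k]
    k_mass = sum(probs[i] for i in ranked)
    k_probs = {i: probs[i] / k_mass for i in ranked}

    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += k_probs[i]
        if cumulative >= top_p:
            break

    # Renormalise over the surviving tokens.
    mass = sum(k_probs[i] for i in kept)
    return {i: k_probs[i] / mass for i in kept}


logits = [2.0, 1.0, 0.5, 0.0, -1.0]  # toy 5-token vocabulary
sharp = filtered_distribution(logits)                  # recommended settings
flat = filtered_distribution(logits, temperature=1.0)  # same cut, no sharpening
print("tokens kept at T=0.4:", sorted(sharp))
print("tokens kept at T=1.0:", sorted(flat))
```

On this toy input the lower temperature leaves fewer candidate tokens inside the nucleus than temperature 1.0 does, which is the effect the rationale column describes.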
144
+ ---
145
+
146
+ ## Intended Use
147
+
148
+ This model is designed for security practitioners, researchers, and AI governance
149
+ professionals who need structured cross-framework analysis. Suitable use cases include:
150
+
151
+ - Mapping AI system risks across multiple frameworks simultaneously
152
+ - Generating NIST AI RMF governance documentation
153
+ - Identifying MITRE ATLAS TTPs relevant to a specific AI deployment
154
+ - Drafting OWASP AI Exchange control implementations
155
+ - Cross-referencing Google SAIF responsibility assignments
156
+
157
+ ### Out-of-scope use
158
+
159
+ This model should not be used as the sole basis for security decisions without
160
+ human expert review. Framework guidance evolves; always verify against current
161
+ official documentation.
162
+
163
+ ---
164
+
165
+ ## Limitations
166
+
167
+ - Trained on a single-domain dataset; may underperform on security tasks outside
168
+ the four covered frameworks.
169
+ - Knowledge cutoff reflects the training data collection date, not live framework updates.
170
+ - Responses should be verified against official NIST, MITRE, OWASP, and Google SAIF
171
+ publications before operational use.
172
+ - Base model is Mistral 7B Instruct v0.3; inherits its general limitations.
173
+
174
+ ---
175
+
176
+ ## License
177
+
178
+ This model is released under [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0).
179
+
180
+ The base model ([Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3))
181
+ is also Apache 2.0 licensed.
182
+
183
+ The training dataset is derived from publicly available framework documentation.
184
+ See the [dataset card](https://huggingface.co/datasets/dbristol/aisec-training-data)
185
+ for full provenance and source attribution.
186
+
187
+ ---
188
+
189
+ ## Citation
190
+
191
+ If you use this model in research or production, please cite:
192
+
193
+ ```bibtex
194
+ @misc{aisec_model_v1,
195
+ author = {<your-name>},
196
+ title = {aisec\_model\_v1: Mistral 7B Fine-Tuned for AI Security Framework Analysis},
197
+ year = {2026},
198
+ publisher = {HuggingFace},
199
+ url = {https://huggingface.co/dbristol/aisec_model_v1}
200
+ }
201
+ ```