---
license: apache-2.0
base_model: mistralai/Mistral-7B-Instruct-v0.3
base_model_relation: finetune
tags:
- mlx
- lora
- mistral
- ai-security
- nist-ai-rmf
- mitre-atlas
- owasp-ai-exchange
- google-saif
- risk-management
- fine-tuned
language:
- en
pipeline_tag: text-generation
datasets:
- dbristol/aisec-training-data
library_name: mlx
---
# aisec_model_v1 — AI Security Framework Expert (Mistral 7B LoRA)
> **This is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3),
> not a new model architecture.** Only 0.145% of parameters were updated via
> LoRA. The base model weights, tokenizer, and architecture are unchanged.

The model is domain-specialised with LoRA on Apple Silicon via [MLX](https://github.com/ml-explore/mlx)
for cross-framework AI security and risk management analysis across:
- **NIST AI RMF 1.0** — Govern, Map, Measure, Manage functions
- **MITRE ATLAS** — Adversarial TTP kill chains and detection engineering
- **OWASP AI Exchange** — Runtime attack surfaces and technical controls
- **Google SAIF** — Component responsibility assignment and governance layers
---
## Model Details
| Property | Value |
|---|---|
| Base model | mistralai/Mistral-7B-Instruct-v0.3 |
| Fine-tuning method | LoRA (Low-Rank Adaptation) |
| Framework | MLX (Apple Silicon) |
| Trainable parameters | 10.486M / 7,248M (0.145%) |
| LoRA rank | 8 |
| LoRA alpha | 16 |
| LoRA layers | 16 |
| Training platform | Apple Silicon (M-series), macOS |
| Best checkpoint | Iter 500 (val loss 0.216) |
| Training dataset | [dbristol/aisec-training-data](https://huggingface.co/datasets/dbristol/aisec-training-data) |
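
The trainable-parameter figure in the table can be reproduced from the LoRA rank and the base model's layer shapes. The sketch below is a back-of-the-envelope check, not code from the training run. It assumes adapters were attached to all seven linear projections in each of the 16 adapted layers, which this card does not state, and it uses Mistral 7B's published dimensions.

```python
# Published Mistral 7B dimensions (from the base model's config).
HIDDEN = 4096    # model width
KV_DIM = 1024    # 8 key/value heads x 128 head dim (grouped-query attention)
MLP_DIM = 14336  # feed-forward intermediate width

RANK = 8                      # LoRA rank from the table above
NUM_LAYERS = 16               # transformer blocks that receive adapters
TOTAL_PARAMS = 7_248_000_000  # total parameter count, rounded as in the table

# ASSUMPTION: adapters on all seven linear projections per block.
# Each entry is (input_dim, output_dim).
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP_DIM),
    "up_proj": (HIDDEN, MLP_DIM),
    "down_proj": (MLP_DIM, HIDDEN),
}


def lora_params(d_in, d_out, rank):
    """A LoRA adapter adds two matrices: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


per_layer = sum(lora_params(i, o, RANK) for i, o in PROJECTIONS.values())
trainable = per_layer * NUM_LAYERS
share = 100 * trainable / TOTAL_PARAMS

print(f"{trainable:,} trainable parameters ({share:.3f}% of the base model)")
```

Under that assumption the count comes to 10,485,760, which matches the 10.486M and 0.145% reported in the table.
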
---
## Training Summary
Training was performed using `mlx_lm.lora` with a cosine learning rate schedule.
| Checkpoint | Val Loss |
|---|---|
| Iter 1 (base) | 2.597 |
| Iter 100 | 0.749 |
| Iter 200 | 0.369 |
| Iter 300 | 0.312 |
| Iter 400 | 0.267 |
| **Iter 500** | **0.216** ← best |
| Iter 550 | 0.223 ↑ overfitting onset |
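
As a rough illustration of how the best checkpoint falls out of the table above, the following sketch selects the iteration with the lowest validation loss and converts each loss to a perplexity. It assumes the reported loss is mean per-token cross-entropy in nats, which this card does not state explicitly.

```python
import math

# Validation losses copied from the table above (iteration -> loss).
val_loss = {
    1: 2.597,
    100: 0.749,
    200: 0.369,
    300: 0.312,
    400: 0.267,
    500: 0.216,
    550: 0.223,
}

# The best checkpoint is the iteration with the lowest validation loss.
best_iter = min(val_loss, key=val_loss.get)

# ASSUMPTION: loss is mean per-token cross-entropy in nats,
# so exp(loss) is the per-token perplexity.
for iteration, loss in val_loss.items():
    marker = "  <- best" if iteration == best_iter else ""
    print(f"iter {iteration:>4}: loss {loss:.3f}, perplexity {math.exp(loss):.2f}{marker}")
```
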
Training configuration:
```yaml
learning_rate: 5e-5
lr_schedule: cosine_decay (100-iter warmup)
batch_size: 4
iters: 1200
lora_rank: 8
lora_alpha: 16.0
lora_dropout: 0.05
num_layers: 16
```
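
To make the learning rate schedule concrete, here is a minimal sketch of a cosine decay with warmup using the values from the configuration above. The linear shape of the warmup and the final learning rate of zero are assumptions; the card only gives the peak rate, the warmup length, and the iteration count.

```python
import math

PEAK_LR = 5e-5      # learning_rate from the config
WARMUP_ITERS = 100  # warmup length from the config
TOTAL_ITERS = 1200  # iters from the config
FINAL_LR = 0.0      # ASSUMPTION: the schedule decays all the way to zero


def learning_rate(step):
    """Linear warmup to PEAK_LR, then cosine decay to FINAL_LR."""
    if step < WARMUP_ITERS:
        # ASSUMPTION: warmup ramps linearly from zero.
        return PEAK_LR * step / WARMUP_ITERS
    progress = min(1.0, (step - WARMUP_ITERS) / (TOTAL_ITERS - WARMUP_ITERS))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return FINAL_LR + (PEAK_LR - FINAL_LR) * cosine


for step in (0, 50, 100, 500, 1200):
    print(f"iter {step:>4}: lr = {learning_rate(step):.2e}")
```

Under these assumptions the learning rate at iteration 500, where validation loss was lowest, is still around 70% of its peak.
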
---
## Usage
### Requirements
```bash
pip install mlx-lm
```
### Inference with MLX
```python
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

# Download the model and tokenizer from the Hugging Face Hub.
model, tokenizer = load("dbristol/aisec_model_v1")

prompt = (
    "Provide a cross-framework analysis of indirect prompt injection defences "
    "for a code generation assistant using OWASP AI Exchange, SAIF, MITRE ATLAS, "
    "and NIST AI RMF."
)

messages = [
    {
        "role": "system",
        "content": (
            "You are an expert AI security and risk management assistant "
            "specialising in NIST AI RMF 1.0, MITRE ATLAS, OWASP AI Exchange, "
            "and Google SAIF frameworks."
        ),
    },
    {"role": "user", "content": prompt},
]

# Render the conversation with the base model's chat template.
formatted = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Recent mlx-lm releases take sampling settings through a sampler object;
# older releases accepted temp= and top_p= directly on generate().
sampler = make_sampler(temp=0.4, top_p=0.85)

response = generate(
    model,
    tokenizer,
    prompt=formatted,
    max_tokens=512,
    sampler=sampler,
)
print(response)
```
### Recommended inference parameters
| Parameter | Value | Rationale |
|---|---|---|
| temperature | 0.4 | Factual domain — sharper distribution favours trained signal |
| top_p | 0.85 | Tighter nucleus reduces long-tail sampling |
| top_k | 40 | Hard vocabulary cap applied before top_p |
| repeat_penalty | 1.1 | Reduces repetition of framework acronyms |
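
The interaction between the first three parameters in the table above can be sketched in plain Python. This is an illustrative toy, not how MLX implements sampling: it applies temperature, then the top-k cap, then the top-p nucleus, in the order the table describes. The function name and the five-token vocabulary are hypothetical, and the repetition penalty is left out for brevity.

```python
import math


def filter_distribution(logits, temperature=0.4, top_k=40, top_p=0.85):
    """Return {token_index: probability} after temperature, top-k, then top-p."""
    # Temperature: divide logits before softmax; values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # subtract the max for stability
    total = sum(weights)
    probs = [(i, w / total) for i, w in enumerate(weights)]

    # Top-k: keep only the k most probable tokens, then renormalise.
    probs.sort(key=lambda pair: pair[1], reverse=True)
    probs = probs[:top_k]
    k_total = sum(p for _, p in probs)
    probs = [(i, p / k_total) for i, p in probs]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for index, p in probs:
        kept.append((index, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalise the survivors so they sum to 1.
    norm = sum(p for _, p in kept)
    return {index: p / norm for index, p in kept}


toy_logits = [2.0, 1.5, 1.0, 0.0, -1.0]  # hypothetical five-token vocabulary
sharp = filter_distribution(toy_logits)                  # recommended settings
flat = filter_distribution(toy_logits, temperature=1.0)  # same logits, no sharpening
print(len(sharp), "candidate tokens at temperature 0.4")
print(len(flat), "candidate tokens at temperature 1.0")
```

On these toy logits the recommended settings leave two candidate tokens, while temperature 1.0 leaves three, which shows how the lower temperature and the nucleus cut-off together trim the long tail.
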
---
## Intended Use
This model is designed for security practitioners, researchers, and AI governance
professionals who need structured cross-framework analysis. Suitable use cases include:
- Mapping AI system risks across multiple frameworks simultaneously
- Generating NIST AI RMF governance documentation
- Identifying MITRE ATLAS TTPs relevant to a specific AI deployment
- Drafting OWASP AI Exchange control implementations
- Cross-referencing Google SAIF responsibility assignments
### Out-of-scope use
This model should not be used as the sole basis for security decisions without
human expert review. Framework guidance evolves; always verify against current
official documentation.
---
## Limitations
- Trained on a single-domain dataset; may underperform on security tasks outside
the four covered frameworks.
- Knowledge cutoff reflects the training data collection date, not live framework updates.
- Responses should be verified against official NIST, MITRE, OWASP, and Google SAIF
publications before operational use.
- Base model is Mistral 7B Instruct v0.3; inherits its general limitations.
---
## License
This model is released under [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0).
The base model ([Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3))
is also Apache 2.0 licensed.
The training dataset is derived from publicly available framework documentation.
See the [dataset card](https://huggingface.co/datasets/dbristol/aisec-training-data)
for full provenance and source attribution.
---
## Citation
If you use this model in research or production, please cite:
```bibtex
@misc{aisec_model_v1,
author = {<your-name>},
title = {aisec\_model\_v1: Mistral 7B Fine-Tuned for AI Security Framework Analysis},
year = {2026},
publisher = {Hugging Face},
url = {https://huggingface.co/dbristol/aisec_model_v1}
}
```