	---
	license: apache-2.0
	base_model: mistralai/Mistral-7B-Instruct-v0.3
	base_model_relation: finetune
	tags:
	- mlx
	- lora
	- mistral
	- ai-security
	- nist-ai-rmf
	- mitre-atlas
	- owasp-ai-exchange
	- google-saif
	- risk-management
	- fine-tuned
	language:
	- en
	pipeline_tag: text-generation
	datasets:
	- dbristol/aisec-training-data
	library_name: mlx
	---

	# aisec_model_v1 — AI Security Framework Expert (Mistral 7B LoRA)

	> **This is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3),
	> not a new model architecture.** LoRA trained adapter weights equal to only
	> 0.145% of the parameter count. The base model weights, tokenizer, and architecture are unchanged.

	Domain-specialised using LoRA on Apple Silicon via [MLX](https://github.com/ml-explore/mlx)
	for cross-framework AI security and risk management analysis across:

	- NIST AI RMF 1.0 — Govern, Map, Measure, Manage functions
	- MITRE ATLAS — Adversarial TTP kill chains and detection engineering
	- OWASP AI Exchange — Runtime attack surfaces and technical controls
	- Google SAIF — Component responsibility assignment and governance layers

	---

	## Model Details

	| Property | Value |
	|---|---|
	| Base model | mistralai/Mistral-7B-Instruct-v0.3 |
	| Fine-tuning method | LoRA (Low-Rank Adaptation) |
	| Framework | MLX (Apple Silicon) |
	| Trainable parameters | 10.486M / 7,248M (0.145%) |
	| LoRA rank | 8 |
	| LoRA alpha | 16 |
	| LoRA layers | 16 |
	| Training platform | Apple Silicon (M-series), macOS |
	| Best checkpoint | Iter 500 (val loss 0.216) |
	| Training dataset | [dbristol/aisec-training-data](https://huggingface.co/datasets/dbristol/aisec-training-data) |
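The trainable-parameter figure can be reproduced with a short calculation. The sketch below assumes rank-8 adapters on all seven linear projections (q, k, v, o, gate, up, down) in each of the 16 adapted blocks. That set of target layers is inferred from the total, not stated in this card; the layer dimensions are those of the published Mistral 7B configuration.

```python
# Mistral 7B layer dimensions (from the base model's published config).
HIDDEN = 4096    # model width
KV_DIM = 1024    # 8 key/value heads x 128 dims (grouped-query attention)
MLP_DIM = 14336  # feed-forward inner width

RANK = 8             # "LoRA rank" from the table above
ADAPTED_BLOCKS = 16  # "LoRA layers" from the table above

# (input_dim, output_dim) of each projection ASSUMED to carry an adapter.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP_DIM),
    "up_proj": (HIDDEN, MLP_DIM),
    "down_proj": (MLP_DIM, HIDDEN),
}


def lora_params(d_in, d_out, rank):
    """A LoRA adapter adds two matrices: A (d_in x rank) and B (rank x d_out)."""
    return rank * (d_in + d_out)


per_block = sum(lora_params(i, o, RANK) for i, o in PROJECTIONS.values())
trainable = per_block * ADAPTED_BLOCKS
share = 100 * trainable / 7_248e6  # total parameter count from the table above

print(f"{trainable:,} trainable parameters ({share:.3f}% of 7,248M)")
# 10,485,760 trainable parameters (0.145% of 7,248M)
```

Under these assumptions the count matches the reported 10.486M, which suggests the adapters covered both the attention and the feed-forward projections.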

	---

	## Training Summary

	Training was performed using `mlx_lm.lora` with a cosine learning rate schedule.

	| Checkpoint | Val Loss |
	|---|---|
	| Iter 1 (base) | 2.597 |
	| Iter 100 | 0.749 |
	| Iter 200 | 0.369 |
	| Iter 300 | 0.312 |
	| Iter 400 | 0.267 |
	| Iter 500 | 0.216 ← best |
	| Iter 550 | 0.223 ↑ overfitting onset |
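The checkpoint choice in the table follows a simple rule: keep the iteration with the lowest validation loss and flag the first evaluation where the loss rises again. A minimal sketch of that rule, using the values from the table (the variable names are illustrative and not part of the training code):

```python
# Validation losses copied from the table above (iteration -> loss).
val_loss = {1: 2.597, 100: 0.749, 200: 0.369, 300: 0.312,
            400: 0.267, 500: 0.216, 550: 0.223}

# Best checkpoint: the iteration with the lowest validation loss.
best_iter = min(val_loss, key=val_loss.get)

# Overfitting onset: the first evaluation whose loss exceeds the previous one.
iters = sorted(val_loss)
onset = next(
    (cur for prev, cur in zip(iters, iters[1:]) if val_loss[cur] > val_loss[prev]),
    None,
)

print(best_iter, onset)  # 500 550
```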

	Training configuration:
	```yaml
	learning_rate: 5e-5
	lr_schedule: cosine_decay (100-iter warmup)
	batch_size: 4
	iters: 1200
	lora_rank: 8
	lora_alpha: 16.0
	lora_dropout: 0.05
	num_layers: 16
	```
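To make the schedule concrete, the sketch below computes the learning rate per iteration for a linear warmup followed by cosine decay. It assumes the warmup starts at 0, the decay ends at 0, and the decay spans the iterations remaining after warmup. The exact start and end values used by `mlx_lm.lora` in this run are not stated here, so treat the numbers as illustrative.

```python
import math

PEAK_LR = 5e-5       # learning_rate from the config above
WARMUP_ITERS = 100   # warmup length from the config above
TOTAL_ITERS = 1200   # iters from the config above


def learning_rate(step):
    """Learning rate at a given iteration under linear warmup + cosine decay."""
    if step < WARMUP_ITERS:
        # Linear ramp from 0 up to the peak rate (assumed start value of 0).
        return PEAK_LR * step / WARMUP_ITERS
    # Cosine decay from the peak towards 0 over the remaining iterations.
    progress = (step - WARMUP_ITERS) / (TOTAL_ITERS - WARMUP_ITERS)
    return 0.5 * PEAK_LR * (1 + math.cos(math.pi * min(progress, 1.0)))


for step in (0, 50, 100, 500, 1200):
    print(step, f"{learning_rate(step):.2e}")
```

The rate reaches its peak of 5e-5 at iteration 100 and falls smoothly afterwards, so the later checkpoints are trained with progressively smaller updates.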

	---

	## Usage

	### Requirements

	```bash
	pip install mlx-lm
	```

	### Inference with MLX

	```python
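	from mlx_lm import load, generate
	from mlx_lm.sample_utils import make_sampler

	# Download (or reuse the cached copy of) the fine-tuned weights and tokenizer.
	model, tokenizer = load("dbristol/aisec_model_v1")

	prompt = (
	    "Provide a cross-framework analysis of indirect prompt injection defences "
	    "for a code generation assistant using OWASP AI Exchange, SAIF, MITRE ATLAS, "
	    "and NIST AI RMF."
	)

	messages = [
	    {
	        "role": "system",
	        "content": (
	            "You are an expert AI security and risk management assistant "
	            "specialising in NIST AI RMF 1.0, MITRE ATLAS, OWASP AI Exchange, "
	            "and Google SAIF frameworks."
	        ),
	    },
	    {"role": "user", "content": prompt},
	]

	# Render the conversation with the base model's chat template.
	formatted = tokenizer.apply_chat_template(
	    messages,
	    tokenize=False,
	    add_generation_prompt=True,
	)

	# Recent mlx-lm releases take sampling settings through a sampler object;
	# older releases accepted temp= and top_p= directly on generate().
	sampler = make_sampler(temp=0.4, top_p=0.85)

	response = generate(
	    model,
	    tokenizer,
	    prompt=formatted,
	    max_tokens=512,
	    sampler=sampler,
	)
	print(response)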
	```

	### Recommended inference parameters

	| Parameter | Value | Rationale |
	|---|---|---|
	| temperature | 0.4 | Factual domain; a sharper distribution favours the trained signal |
	| top_p | 0.85 | Tighter nucleus reduces long-tail sampling |
	| top_k | 40 | Hard vocabulary cap applied before top_p |
	| repeat_penalty | 1.1 | Reduces repetition of framework acronyms |
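The interaction of the first three parameters can be illustrated on a toy distribution. The function below is a pure-Python sketch, not the mlx-lm sampler: it applies temperature, then the top_k cap, then the top_p nucleus cut, measuring nucleus mass on the full distribution. Real samplers may order or normalise these steps differently, and the repetition penalty is left out.

```python
import math


def filter_probs(logits, temperature=0.4, top_k=40, top_p=0.85):
    """Return {token_index: prob} after temperature, top-k, then top-p filtering."""
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # top_k: keep only the k most probable tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # top_p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Renormalise the surviving tokens so they sum to 1.
    return {i: probs[i] / mass for i in kept}


toy_logits = [2.0, 1.0, 0.0, -1.0]
print(sorted(filter_probs(toy_logits)))                   # [0]
print(sorted(filter_probs(toy_logits, temperature=1.0)))  # [0, 1]
```

With these toy logits, temperature 0.4 concentrates enough mass on the top token that the nucleus contains it alone, while temperature 1.0 leaves two candidates in play.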

	---

	## Intended Use

	This model is designed for security practitioners, researchers, and AI governance
	professionals who need structured cross-framework analysis. Suitable use cases include:

	- Mapping AI system risks across multiple frameworks simultaneously
	- Generating NIST AI RMF governance documentation
	- Identifying MITRE ATLAS TTPs relevant to a specific AI deployment
	- Drafting OWASP AI Exchange control implementations
	- Cross-referencing Google SAIF responsibility assignments
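One way to drive these use cases from code is a small helper that builds the chat messages for a given scenario. The sketch below is hypothetical: `build_messages` and `FRAMEWORKS` are illustrative names, and the system prompt is the one used in the inference example. Its output can be passed to `tokenizer.apply_chat_template` as shown under Usage.

```python
SYSTEM_PROMPT = (
    "You are an expert AI security and risk management assistant "
    "specialising in NIST AI RMF 1.0, MITRE ATLAS, OWASP AI Exchange, "
    "and Google SAIF frameworks."
)

# The four frameworks this model was fine-tuned on.
FRAMEWORKS = ("NIST AI RMF", "MITRE ATLAS", "OWASP AI Exchange", "Google SAIF")


def build_messages(scenario, frameworks=FRAMEWORKS):
    """Build chat messages asking for a cross-framework analysis of one scenario."""
    # Reject frameworks the model was not trained on (hypothetical guard).
    unknown = [f for f in frameworks if f not in FRAMEWORKS]
    if unknown:
        raise ValueError(f"Frameworks outside the model's training scope: {unknown}")
    request = (
        f"Provide a cross-framework analysis of {scenario} "
        f"using {', '.join(frameworks)}."
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": request},
    ]


messages = build_messages(
    "model theft risks for a hosted LLM API",
    frameworks=["MITRE ATLAS", "Google SAIF"],
)
print(messages[1]["content"])
```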

	### Out-of-scope use

	This model should not be used as the sole basis for security decisions without
	human expert review. Framework guidance evolves; always verify against current
	official documentation.

	---

	## Limitations

	- Trained on a single-domain dataset; may underperform on security tasks outside
	the four covered frameworks.
	- Knowledge cutoff reflects the training data collection date, not live framework updates.
	- Responses should be verified against official NIST, MITRE, OWASP, and Google SAIF
	publications before operational use.
	- Base model is Mistral 7B Instruct v0.3; inherits its general limitations.

	---

	## License

	This model is released under [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0).

	The base model ([Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3))
	is also Apache 2.0 licensed.

	The training dataset is derived from publicly available framework documentation.
	See the [dataset card](https://huggingface.co/datasets/dbristol/aisec-training-data)
	for full provenance and source attribution.

	---

	## Citation

	If you use this model in research or production, please cite:

	```bibtex
	@misc{aisec_model_v1,
	author = {<your-name>},
	title = {aisec\_model\_v1: Mistral 7B Fine-Tuned for AI Security Framework Analysis},
	year = {2026},
	publisher = {Hugging Face},
	url = {https://huggingface.co/dbristol/aisec_model_v1}
	}
	```