update model card 2

README.md CHANGED
@@ -10,6 +10,11 @@ tags:
 - cti
 - ner
 - information-extraction
+license: apache-2.0
+datasets:
+- mrmoor/cyber-threat-intelligence
+language:
+- en
 ---
 
 # Model Card for Model ID
@@ -23,7 +28,7 @@ It transforms raw, technical text into structured JSON format containing cybersecurity entities.
 ### Model Description
 
 This model uses QLoRA (Quantized Low-Rank Adaptation) to efficiently adapt the Mistral-7B base model for the highly specific task of Named Entity Recognition (NER) in the cybersecurity domain.
-The model outputs a strict JSON structure, making it ideal for integration into automated RAG pipelines
+The model outputs a strict JSON structure, making it ideal for integration into automated RAG pipelines or autonomous agent workflows (like LangGraph).
 
 - **Developed by:** Alex Bueno
 - **Model type:** Causal Language Model with LoRA adapters (PEFT)
@@ -31,7 +36,7 @@ The model outputs a strict JSON structure, making it ideal for integration into automated RAG pipelines
 - **License:** Apache 2.0
 - **Finetuned from model:** `mistralai/Mistral-7B-v0.3`
 
 ### Model Sources
 
 - **Repository:** https://huggingface.co/AlexXBueno/Mistral-7B-Cyber-Thread-Intelligence-Extractor
 
@@ -45,7 +50,7 @@ It will extract relevant entities and return them as a structured JSON array.
 
 ### Downstream Use
 
-- **Multi-Agent Systems:** As a specific Tool Node for an orchestrator agent
+- **Multi-Agent Systems:** As a specific Tool Node for an orchestrator agent to extract structured data before querying a Vector Database or SQL.
 - **CTI Pipelines:** Automated ingestion and structuring of daily threat reports into a local database.
 
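The Tool Node pattern in the hunk above can be sketched as a plain function an orchestrator calls before touching a database. This is a minimal illustration, not the card's actual integration code: `run_model` is a hypothetical stand-in for the fine-tuned Mistral inference call, and the entity field names are assumptions.

```python
import json

def run_model(report_text: str) -> str:
    # Placeholder for the fine-tuned extractor; in reality this would run
    # Mistral-7B with the LoRA adapters and return its raw JSON string.
    return '[{"type": "ip", "value": "203.0.113.7"}]'

def extract_entities_tool(report_text: str) -> list:
    """Tool node: raw CTI text in, structured entities out.

    The orchestrator agent would call this before querying a Vector
    Database or SQL store with the extracted values.
    """
    return json.loads(run_model(report_text))

rows = extract_entities_tool("Beaconing to 203.0.113.7 observed on host A.")
```

An agent framework would register `extract_entities_tool` as one callable among several; the JSON-in/JSON-out contract is what makes that wiring straightforward.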
@@ -55,7 +60,7 @@ The model may suffer from previous-knowledge bias, which may lead it to insert threat entities that are not present in the input text.
 
 ### Recommendations
 
-- **Temperature:** It is
+- **Temperature:** It is recommended to use a low temperature (`temperature=0.1` or `0.0`) during inference to ensure deterministic extraction.
 - **Validation:** Use Pydantic or structured decoding libraries (like `Outlines` or `Guidance`) in production to enforce JSON grammar, as the model may occasionally produce malformed JSON syntax.
 
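The validation step recommended above can be sketched with the standard library alone; in production a Pydantic model or a structured-decoding grammar would replace the manual checks. The field names `type` and `value` are illustrative assumptions, since the card does not pin down the exact schema.

```python
import json

# Hypothetical entity fields -- the model card does not fix a schema.
REQUIRED_KEYS = {"type", "value"}

def parse_entities(raw: str) -> list:
    """Parse and validate the model's raw JSON output.

    Raises ValueError on malformed JSON or missing fields, which is the
    failure mode the Recommendations section warns about.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"malformed JSON from model: {exc}") from exc
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of entities")
    for item in data:
        if not isinstance(item, dict) or not REQUIRED_KEYS <= item.keys():
            raise ValueError(f"entity missing required keys: {item!r}")
    return data

entities = parse_entities('[{"type": "malware", "value": "Emotet"}]')
```

Catching `ValueError` at the pipeline boundary lets a caller retry generation or quarantine the document instead of ingesting malformed records.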
@@ -123,12 +128,10 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True).split("### Response
 
 ### Training Data
 
-The model was fine-tuned on the `
+The model was fine-tuned on the `mrmoor/cyber-threat-intelligence` dataset, which contains annotated cybersecurity entities.
 
 ### Training Procedure
 
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
 #### Preprocessing
 
 A custom Data Collator (`CTICompletionCollator`) was implemented during training.
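The implementation of `CTICompletionCollator` is not shown in the card, but a completion-only collator typically masks the prompt tokens so the loss is computed only on the JSON response. A minimal sketch of that masking, with illustrative token IDs standing in for the tokenized response marker:

```python
# Hypothetical token IDs for the "### Response:" marker under the real
# tokenizer -- purely illustrative values.
RESPONSE_MARKER = [42, 43]

def mask_prompt(input_ids: list) -> list:
    """Return labels where everything up to and including the response
    marker is set to -100, the index PyTorch's cross-entropy loss ignores.

    This makes training focus on generating the JSON completion rather
    than on reproducing the instruction text.
    """
    labels = list(input_ids)
    for i in range(len(input_ids) - len(RESPONSE_MARKER) + 1):
        if input_ids[i:i + len(RESPONSE_MARKER)] == RESPONSE_MARKER:
            for j in range(i + len(RESPONSE_MARKER)):
                labels[j] = -100
            break
    return labels

labels = mask_prompt([7, 8, 42, 43, 99, 100])
# -> [-100, -100, -100, -100, 99, 100]
```

A production collator would also handle padding and batching (TRL's completion-only collator follows the same masking idea); this sketch isolates only the label-masking logic.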
@@ -157,7 +160,7 @@ The objective is strictly Information Extraction (IE) formatted as an Instruction-tuning task.
 
 ### Compute Infrastructure
 
-The entire stack was developed and validated on local
+The entire stack was developed and validated on local infrastructure, avoiding cloud dependencies to ensure data privacy for sensitive CTI documents.
 
 #### Software
 - PEFT 0.18.1