AlexXBueno committed on
Commit
7d59e75
·
verified ·
1 Parent(s): 1ccde08

update model card 2

Files changed (1): README.md (+11 −8)
README.md CHANGED
@@ -10,6 +10,11 @@ tags:
 - cti
 - ner
 - information-extraction
+license: apache-2.0
+datasets:
+- mrmoor/cyber-threat-intelligence
+language:
+- en
 ---
 
 # Model Card for Model ID
@@ -23,7 +28,7 @@ It transforms raw, technical text into structured JSON format containing cyberse
 ### Model Description
 
 This model uses QLoRA (Quantized Low-Rank Adaptation) to efficiently adapt the Mistral-7B base model for the highly specific task of Named Entity Recognition (NER) in the cybersecurity domain.
-The model outputs a strict JSON structure, making it ideal for integration into automated RAG pipelines, SIEMs, or autonomous agent workflows (like LangGraph).
+The model outputs a strict JSON structure, making it ideal for integration into automated RAG pipelines or autonomous agent workflows (like LangGraph).
 
 - **Developed by:** Alex Bueno
 - **Model type:** Causal Language Model with LoRA adapters (PEFT)
@@ -31,7 +36,7 @@ The model outputs a strict JSON structure, making it ideal for integration into
 - **License:** Apache 2.0
 - **Finetuned from model:** `mistralai/Mistral-7B-v0.3`
 
-### Model Sources [optional]
+### Model Sources
 
 - **Repository:** https://huggingface.co/AlexXBueno/Mistral-7B-Cyber-Thread-Intelligence-Extractor
 
@@ -45,7 +50,7 @@ It will extract relevant entities and return them as a structured JSON array.
 
 ### Downstream Use
 
-- **Multi-Agent Systems:** As a specific Tool Node for an orchestrator agent (e.g., Llama-3-70B) to extract structured data before querying a Vector Database or SQL.
+- **Multi-Agent Systems:** As a specific Tool Node for an orchestrator agent to extract structured data before querying a Vector Database or SQL.
 - **CTI Pipelines:** Automated ingestion and structuring of daily threat reports into a local database.
 
@@ -55,7 +60,7 @@ The model may suffer from previous knowledge bias, which may leads to insert thr
 
 ### Recommendations
 
-- **Temperature:** It is strictly recommended to use a low temperature (`temperature=0.1` or `0.0`) during inference to ensure deterministic extraction.
+- **Temperature:** It is recommended to use a low temperature (`temperature=0.1` or `0.0`) during inference to ensure deterministic extraction.
 - **Validation:** Use Pydantic or structured decoding libraries (like `Outlines` or `Guidance`) in production to enforce JSON grammar, as the model may occasionally produce malformed JSON syntax.
 
@@ -123,12 +128,10 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True).split("### Response
 
 ### Training Data
 
-The model was fine-tuned on the ```mrmoor/cyber-threat-intelligence``` dataset, which contains annotated cybersecurity entities.
+The model was fine-tuned on the `mrmoor/cyber-threat-intelligence` dataset, which contains annotated cybersecurity entities.
 
 ### Training Procedure
 
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
 #### Preprocessing
 
 A custom Data Collator (```CTICompletionCollator```) was implemented during training.
@@ -157,7 +160,7 @@ The objective is strictly Information Extraction (IE) formatted as an Instructio
 
 ### Compute Infrastructure
 
-The entire stack was developed and validated on local/on-premise infrastructure, bypassing cloud dependencies to assure data privacy for sensitive CTI documents.
+The entire stack was developed and validated on local infrastructure, avoiding cloud dependencies to ensure data privacy for sensitive CTI documents.
 
 #### Software
 - PEFT 0.18.1
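
The card's validation recommendation (guard the model's JSON output with Pydantic, since generation may occasionally be malformed) can be sketched roughly as follows. The entity schema here is hypothetical — the card does not document the exact field names, so `type` and `value` are illustrative assumptions:

```python
import json
from pydantic import BaseModel, ValidationError

# Hypothetical schema for one extracted entity; the model card does not pin
# down field names, so "type" and "value" are illustrative only.
class CTIEntity(BaseModel):
    type: str   # e.g. "malware", "threat_actor", "cve"
    value: str

def parse_entities(raw: str):
    """Validate raw model output as a JSON array of entities.

    Returns a list of CTIEntity on success, or None so the caller can
    retry or fall back when the model emits malformed JSON or an
    unexpected structure.
    """
    try:
        data = json.loads(raw)
        return [CTIEntity.model_validate(item) for item in data]
    except (json.JSONDecodeError, ValidationError, TypeError):
        return None

# A well-formed response passes; a truncated generation is rejected.
good = '[{"type": "malware", "value": "Emotet"}]'
bad = '[{"type": "malware", "value": '  # truncated mid-generation
print(parse_entities(good))  # list of validated entities
print(parse_entities(bad))   # None
```

In production, a `None` result would typically trigger a bounded retry or route the raw text to a structured-decoding backend such as `Outlines`, which constrains generation to the JSON grammar up front instead of validating after the fact.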