cisco-ehsan committed on
Commit 6f0f603 · verified · 1 Parent(s): 8936563

Update README.md

Files changed (1): README.md +7 -571
README.md CHANGED
@@ -17,353 +17,7 @@ widget:
17
  - >-
18
  Ensuring and supporting information protection awareness and training
19
  programs
20
- - source_sentence: >-
21
- Which of the following databases are required to be maintained by any system
22
- participating in an IPSec VPN?
23
- sentences:
24
- - >-
25
- Gatekeeper bypass through code signing exploitation represents a
26
- sophisticated attack vector targeting macOS's application verification
27
- mechanism. Understanding detection indicators requires examining both
28
- technical artifacts and behavioral patterns associated with compromised
29
- digital signatures.\n\n**Primary Technical Indicators:**\n\nCode signing
30
- certificate anomalies constitute the most direct indicator. Legitimate
31
- applications possess valid, unexpired certificates from trusted authorities
32
- like Apple or recognized developers. Suspicious indicators include
33
- self-signed certificates, expired certificates, certificates issued by
34
- unrecognized authorities, or certificates with unusual subject alternative
35
- names (SANs). The `codesign` command reveals signature validity, while
36
- examining certificate chains through Keychain Access exposes potential
37
- anomalies.\n\nBinary modification signatures often manifest as
38
- \\\"unsigned\\\" status for previously signed applications. Gatekeeper
39
- maintains a whitelist of notarized applications; unsigned binaries
40
- attempting execution trigger alerts in system logs located at
41
- `/var/log/system.log`. Additionally, applications with altered code signing
42
- identifiers (CSIDs) or modified entitlements may indicate
43
- tampering.\n\n**Behavioral and System-Level Indicators:**\n\nProcess
44
- execution from non-standard locations frequently accompanies successful
45
- bypasses. Legitimate Gatekeeper-approved applications typically execute from
46
- `/Applications` or user-specific application directories. Execution from
47
- temporary directories, Downloads folders, or unusual paths warrants
48
- investigation.\n\nNetwork behavior analysis reveals additional indicators.
49
- Compromised applications may exhibit unexpected network connections,
50
- particularly to suspicious domains or IP addresses not associated with the
51
- legitimate application's functionality. DNS queries to newly registered
52
- domains (NRDs) or domains with high entropy often indicate
53
- command-and-control communications.\n\n**MITRE ATT&CK Framework
54
- Alignment:**\n\nThis technique aligns with T1553.002 (Subvert Trust
55
- Controls: Code Signing). Adversaries exploit weaknesses in code signing
56
- verification processes, potentially through stolen certificates, certificate
57
- authority compromise, or exploitation of bypass mechanisms like manual
58
- allowlisting.\n\n**Detection and Response Strategies:**\n\nImplement
59
- comprehensive logging using the Unified Logging system with custom
60
- predicates monitoring `com.apple.securityd` events. Deploy endpoint
61
- detection solutions capable of real-time code signing validation and
62
- behavioral analysis. Regularly audit installed applications against
63
- known-good baselines, focusing on unsigned or suspiciously signed
64
- executables.\n\nNIST Cybersecurity Framework alignment emphasizes continuous
65
- monitoring (DE.CM) and anomaly detection capabilities within the Detect
66
- function, ensuring organizations maintain visibility into potential
67
- Gatekeeper bypass attempts through robust logging and behavioral analysis
68
- mechanisms.
69
- - Security Policy Database (SPD) and Security Association Database (SAD)
70
- - Virus
71
- - source_sentence: >-
72
- How is a supply chain attack implemented through compromised software
73
- development kits (SDKs) and their propagation to thousands of applications?
74
- sentences:
75
- - >-
76
- Detecting security label tampering through extended attributes (xattrs)
77
- requires implementing comprehensive monitoring and validation mechanisms
78
- aligned with NIST Cybersecurity Framework's Detect function and MITRE
79
- ATT&CK's Defense Evasion tactics.\n\n**Xattr Monitoring
80
- Techniques:**\n\nImplement real-time file system monitoring using tools like
81
- `auditd` or Windows Event Tracing to track xattr modifications. Configure
82
- audit rules targeting specific security-critical files and directories,
83
- focusing on operations like `SETXATTR`, `GETXATTR`, and `LISTXATTR`. This
84
- aligns with NIST CSF DE.CM-1 (continuous monitoring) by establishing
85
- baseline behaviors for legitimate xattr usage patterns.\n\n**Integrity
86
- Validation Methods:**\n\nDeploy cryptographic hashing of security labels
87
- stored in xattrs, creating immutable reference values. Implement periodic
88
- verification against these baselines using SHA-256 or stronger algorithms.
89
- This corresponds to NIST CSF PR.DS-6 (integrity checking mechanisms) and
90
- provides detection capabilities for unauthorized
91
- modifications.\n\n**Behavioral Analysis:**\n\nEstablish user and process
92
- behavior profiling for xattr operations, identifying anomalous patterns that
93
- deviate from established baselines. Monitor for unusual privilege escalation
94
- attempts modifying security labels, particularly focusing on MITRE ATT&CK
95
- technique T1562.008 (Impair Defenses: Disable or Modify Tools) where
96
- adversaries manipulate security mechanisms.\n\n**System
97
- Integration:**\n\nLeverage SELinux or AppArmor mandatory access controls to
98
- restrict unauthorized xattr modifications. Implement centralized logging
99
- aggregation correlating xattr changes with process execution and network
100
- activities, enabling correlation analysis for sophisticated tampering
101
- attempts.\n\n**Detection Signatures:**\n\nDevelop custom detection rules
102
- identifying suspicious xattr patterns, including rapid successive
103
- modifications, bulk security label changes across multiple files, or
104
- modifications from unexpected processes. Integrate these signatures into
105
- SIEM platforms for automated alerting and incident response
106
- workflows.\n\nThis multi-layered approach provides comprehensive coverage
107
- against sophisticated tampering attempts while maintaining operational
108
- efficiency through targeted monitoring strategies.
109
- - >-
110
- Supply chain attacks occur when an attacker injects malicious code into
111
- trusted components in the software supply chain, such as open source
112
- libraries or SDKs. These components are then distributed to many developers
113
- and organizations worldwide. Once they integrate these seemingly legitimate
114
- tools into their own products, the malware is automatically embedded within
115
- them, propagating widely across various applications and devices. Attackers
116
- can also compromise update servers that deliver patches to millions of
117
- systems simultaneously. The Sunburst attack on SolarWinds was one such
118
- supply chain attack where a malicious update was pushed through the Orion
119
- update server. In this case, attackers used the compromised SDK from Pulse
120
- Secure to propagate the malware. Because Pulse Secure is used by many
121
- organizations for secure remote access solutions, their software development
122
- kit was distributed as part of legitimate downloads. Attackers then inserted
123
- their own malicious code into that SDK, which in turn infected all products
124
- built using it. This attack caused massive damage and forced a significant
125
- number of companies to implement new policies regarding software updates and
126
- vendor trustworthiness. The SolarWinds supply chain attack also demonstrated
127
- the importance of monitoring for suspicious network traffic patterns and
128
- adopting multi-factor authentication (MFA) to limit access to sensitive
129
- systems. Attackers can easily bypass traditional security measures if they
130
- manage to compromise a legitimate update server or SDK provider. Thus, it is
131
- essential for companies to carefully vet their third-party software
132
- providers and implement strict controls around the development lifecycle of
133
- critical applications. Furthermore, adopting robust patch management
134
- strategies and keeping an eye on suspicious activity in software
135
- repositories are important steps toward reducing the risk of supply chain
136
- attacks. Finally, implementing continuous monitoring tools like SIEM can
137
- provide early warning signs if any unexpected changes occur within a
138
- software component or its corresponding repository. Overall, understanding
139
- how supply chain attacks work is crucial for building more secure systems
140
- and ensuring that organizations do not inadvertently become victims of such
141
- sophisticated breaches in the future. Attackers can easily bypass
142
- traditional security measures if they manage to compromise a legitimate
143
- update server or SDK provider. Thus, it is essential for companies to
144
- carefully vet their third-party software providers and implement strict
145
- controls around the development lifecycle of critical applications.
146
- Furthermore, adopting robust patch management strategies and keeping an eye
147
- on suspicious activity in software repositories are important steps toward
148
- reducing the risk of supply chain attacks. Finally, implementing continuous
149
- monitoring tools like SIEM can provide early warning signs if any unexpected
150
- changes occur within a software component or its corresponding repository.
151
- - >-
152
- An automated response system for isolating compromised systems requires a
153
- multi-layered architecture integrating detection capabilities with
154
- orchestrated containment actions, aligned with NIST Cybersecurity
155
- Framework's Respond (RS) function and MITRE ATT&CK defensive
156
- strategies.\n\n**Core Architecture Components:**\n\nThe system should
157
- implement Security Orchestration, Automation, and Response (SOAR) platforms
158
- integrated with Security Information and Event Management (SIEM) systems.
159
- Central components include: detection engines processing indicators of
160
- compromise (IoCs), automated decision matrices for risk assessment, and
161
- isolation mechanisms that can quarantine affected assets without disrupting
162
- critical operations.\n\n**Detection Integration:**\n\nLeverage MITRE ATT&CK
163
- techniques to establish comprehensive monitoring across the attack
164
- lifecycle. Implement behavioral analytics detecting tactics like Initial
165
- Access (T1566 Phishing), Execution (T1059 Command and Scripting
166
- Interpreter), and Defense Evasion (T1027 Obfuscated Files). Deploy endpoint
167
- detection and response (EDR) solutions monitoring process execution, network
168
- communications, and file system modifications. Integrate threat intelligence
169
- feeds correlating observed indicators with known exploitation
170
- campaigns.\n\n**Automated Response Logic:**\n\nDesign tiered response
171
- capabilities based on confidence levels and asset criticality. Implement
172
- network microsegmentation enabling granular isolation through
173
- software-defined networking (SDN) controllers. Automated actions should
174
- include: DNS sinkholing for malicious domains, firewall rule deployment
175
- blocking suspicious traffic patterns, and network switch port isolation.
176
- Critical systems require graceful degradation procedures maintaining
177
- business continuity.\n\n**Decision Framework:**\n\nEstablish risk scoring
178
- algorithms incorporating asset value, threat severity, and exploitation
179
- likelihood. Implement approval workflows for high-confidence isolations
180
- while enabling rapid containment for confirmed compromises. Integration with
181
- Configuration Management Databases (CMDB) ensures accurate asset inventory
182
- and dependency mapping before executing isolation
183
- procedures.\n\n**Validation and Recovery:**\n\nPost-isolation processes
184
- should include automated forensic data collection, incident classification
185
- against MITRE ATT&CK framework, and coordinated recovery procedures.
186
- Implement continuous monitoring ensuring isolation effectiveness while
187
- maintaining operational readiness for subsequent threats.
188
- - source_sentence: >-
189
- What are the best practices for SOC teams to enhance their threat hunting
190
- capabilities against ScreenConnect vulnerabilities?
191
- sentences:
192
- - >-
193
- The hiberfil.sys file represents a critical artifact in digital forensics
194
- for establishing temporal context and system state at specific points in
195
- time. This Windows hibernation file contains compressed memory contents when
196
- a system enters power-saving mode, preserving volatile data including
197
- running processes, loaded drivers, and network connections.\n\n**Timeline
198
- Establishment Through Metadata Analysis**\n\nThe creation timestamp of
199
- hiberfil.sys provides definitive evidence of the last hibernation event,
200
- establishing a concrete temporal anchor point. This timestamp corresponds to
201
- the exact moment Windows initiated hibernation mode, typically occurring
202
- during system shutdown or power management events. By analyzing this
203
- metadata alongside related artifacts like registry entries
204
- (HKLM\\\\SYSTEM\\\\CurrentControlSet\\\\Control\\\\Power) and Event Viewer
205
- logs (Event ID 4634 for logoff), investigators can reconstruct precise
206
- chronological sequences.\n\n**MITRE ATT&CK Framework Integration**\n\nWithin
207
- the MITRE ATT&CK framework, hiberfil.sys analysis supports detection of
208
- several techniques including T1055 (Process Injection) and T1070.004 (File
209
- Deletion). Adversaries may attempt to manipulate or delete this file to
210
- eliminate forensic evidence, making its presence or absence significant for
211
- timeline reconstruction.\n\n**NIST Cybersecurity Framework
212
- Alignment**\n\nFollowing NIST CSF guidelines, particularly the Respond (RS)
213
- and Recover (RC) functions, hiberfil.sys analysis supports incident response
214
- activities. The metadata provides crucial temporal context for understanding
215
- attack progression and system compromise duration.\n\n**Technical
216
- Implementation Considerations**\n\nForensic investigators should examine
217
- multiple timestamp artifacts: creation time, last modification time, and
218
- access time. However, the creation timestamp remains most reliable as it
219
- reflects the initial hibernation event rather than subsequent file
220
- operations that might alter other timestamps through normal system
221
- operations or potential tampering attempts.\n\nThe hiberfil.sys file size
222
- typically correlates with installed RAM capacity, providing additional
223
- contextual information about system configuration at the time of
224
- hibernation. This metadata correlation strengthens timeline accuracy by
225
- validating system state consistency across multiple forensic artifacts.
226
- - >-
227
- Organizations can effectively leverage log analysis and Endpoint Detection
228
- and Response (EDR) systems to monitor Alternate Data Streams (ADS) through a
229
- multi-layered approach that addresses both detection capabilities and
230
- operational visibility.\n\n**Log Analysis Implementation:**\nWindows Event
231
- Logs provide foundational monitoring through Security, System, and
232
- Application logs. Critical events include 4663 (object access), 4688
233
- (process creation), and 5140 (file share access). Organizations should
234
- configure advanced audit policies for \\\"Audit File System\\\" and
235
- \\\"Audit Handle Manipulation\\\" under Local Security Policy. Sysmon
236
- configuration becomes essential, particularly Event ID 2 (CreateFile) and
237
- Event ID 3 (NetworkConnect), as these capture detailed file system
238
- interactions that standard Windows logs might miss.\n\n**EDR System
239
- Configuration:**\nModern EDR platforms like CrowdStrike, SentinelOne, or
240
- Microsoft Defender for Endpoint offer native ADS detection capabilities.
241
- These systems should be configured to monitor:\n- File creation/modification
242
- events with stream enumeration\n- Process access to files with multiple data
243
- streams\n- Registry modifications associated with ADS-enabled
244
- applications\n- Network communications from processes accessing hidden
245
- streams\n\n**Critical Directory Monitoring:**\nSystem directories requiring
246
- enhanced monitoring include %SystemRoot%, %ProgramFiles%, and user profile
247
- directories. Implement baseline integrity monitoring using tools like
248
- Microsoft's Attack Surface Reduction (ASR) rules or custom PowerShell
249
- scripts that enumerate ADS presence through Get-ItemProperty -Name \\\"*\\\"
250
- commands.\n\n**MITRE ATT&CK Alignment:**\nThis approach addresses T1096
251
- (NTFS File Attributes), T1547.001 (Registry Run Keys/Startup Folder), and
252
- T1564.002 (Impair Defenses: Disable or Modify Tools). Detection rules should
253
- correlate ADS creation with suspicious process ancestry, particularly
254
- PowerShell execution or living-off-the-land binaries.\n\n**Operational
255
- Integration:**\nEstablish automated response workflows that quarantine
256
- systems exhibiting ADS anomalies while preserving forensic evidence.
257
- Implement centralized logging aggregation using SIEM platforms configured to
258
- detect patterns indicating ADS abuse, such as rapid stream creation followed
259
- by executable access attempts.\n\nThis comprehensive monitoring strategy
260
- ensures organizations maintain visibility into ADS activities while
261
- minimizing false positives through contextual analysis and behavioral
262
- correlation.
263
- - >-
264
- SOC teams can enhance their threat hunting capabilities against
265
- ScreenConnect vulnerabilities by adopting a proactive and iterative approach
266
- to searching for indicators of compromise (IoCs) and anomalous activities
267
- that may indicate exploitation. Develop and regularly update threat hunting
268
- hypotheses based on the latest threat intelligence, focusing on known TTPs
269
- associated with the exploitation of ScreenConnect vulnerabilities. Utilize
270
- advanced analytics and machine learning tools to sift through large volumes
271
- of data for patterns and anomalies that may signify malicious activity.
272
- Leverage endpoint detection and response (EDR) tools to continuously monitor
273
- endpoints for signs of exploitation, such as unusual PowerShell command
274
- execution, modification of system files, or unexpected network connections.
275
- Conduct regular vulnerability scans and penetration tests to identify and
276
- remediate potential weaknesses in ScreenConnect and other critical systems
277
- before attackers can exploit them. Foster collaboration and information
278
- sharing with other organizations and cybersecurity communities to gain
279
- insights into emerging threats and effective detection and response
280
- strategies. Invest in continuous training and development for SOC team
281
- members to keep them abreast of the latest cybersecurity trends, tools, and
282
- techniques. By implementing these best practices, SOC teams can
283
- significantly improve their ability to detect and respond to threats
284
- targeting ScreenConnect vulnerabilities, thereby enhancing the overall
285
- security posture of their organization.
286
- - source_sentence: >-
287
- How would you use Amcache analysis to detect fileless malware that drops
288
- temporary components for initial system compromise?
289
- sentences:
290
- - >-
291
- # Automated Extraction of Empire Agent Configurations: Defensive
292
- Analysis\n\n## NIST Cybersecurity Framework Context\n\nWithin the NIST CSF's
293
- **Detect (DE)** and **Respond (RS)** functions, organizations must implement
294
- capabilities to identify and analyze malicious configurations. PowerShell
295
- Empire represents a sophisticated post-exploitation framework mapped to
296
- MITRE ATT&CK techniques including T1059.001 (PowerShell) and T1027
297
- (Obfuscated Files or Information).\n\n## Detection and Analysis
298
- Methodology\n\n**Memory Forensics Approach:**\nDevelop automated tools
299
- leveraging memory acquisition frameworks like Volatility or Rekall to
300
- identify Empire's in-memory artifacts. Focus on detecting:\n- PowerShell
301
- reflection objects characteristic of Empire's module loading\n-
302
- Base64-encoded configuration blobs within process memory spaces\n- Registry
303
- keys containing encoded agent parameters (typically
304
- HKLM\\\\SOFTWARE\\\\Classes\\\\ms-settings)\n\n**File System
305
- Analysis:**\nImplement scanning mechanisms targeting:\n- Temporary
306
- directories where Empire extracts configurations\n- PowerShell execution
307
- logs revealing obfuscated command patterns\n- Event log analysis for
308
- suspicious PowerShell execution contexts\n\n## Technical Implementation
309
- Framework\n\n**Automated Extraction Pipeline:**\n1. **Signature-Based
310
- Detection:** Develop YARA rules identifying Empire's distinctive code
311
- patterns and configuration structures\n2. **Memory Parsing:** Implement
312
- plugins parsing .NET objects and PowerShell runspaces\n3. **Decryption
313
- Routines:** Create automated decoding mechanisms for Empire's XOR-based
314
- configuration encryption\n4. **Artifact Correlation:** Cross-reference
315
- multiple data sources to validate findings\n\n**MITRE ATT&CK Mapping:**\n-
316
- T1083 (File and Directory Discovery)\n- T1057 (Process Discovery)\n- T1005
317
- (Data from Local System)\n\n## Defensive Considerations\n\nTools must
318
- incorporate anti-evasion techniques, including detection of common
319
- obfuscation methods like string concatenation and variable substitution.
320
- Integration with SIEM platforms enables real-time alerting when Empire
321
- artifacts are discovered.\n\n**Validation Framework:**\nImplement
322
- multi-layered validation ensuring extracted configurations correspond to
323
- active threats rather than benign PowerShell activity. This includes
324
- behavioral analysis correlating configuration parameters with observed
325
- network communications and file system modifications
326
- - To capture and display network traffic
327
- - >-
328
- Amcache analysis provides critical forensic artifacts for detecting fileless
329
- malware employing temporary component deployment during initial system
330
- compromise, aligning with MITRE ATT&CK techniques T1055 (Process Injection)
331
- and T1620 (Reflective Code Loading).\n\n**Amcache Artifact Analysis
332
- Framework:**\n\nThe Amcache.hve registry hive maintains comprehensive
333
- application execution metadata, including file paths, hashes, and execution
334
- timestamps. For fileless malware detection, focus on:\n\n1. **Temporary File
335
- Creation Patterns**: Analyze entries with suspicious temporal clustering in
336
- the \\\"Programs\\\" key, particularly executables stored in system
337
- directories (C:\\\\Windows\\\\Temp,
338
- C:\\\\Users\\\\[User]\\\\AppData\\\\Local\\\\Temp). Legitimate applications
339
- typically exhibit predictable installation patterns, while malicious
340
- components often manifest as isolated, recently-created executables.\n\n2.
341
- **Hash-Based Indicators**: Cross-reference SHA-1 hashes against threat
342
- intelligence feeds and known malware signatures. Fileless malware frequently
343
- employs legitimate system binaries for process hollowing (T1055.012) or
344
- reflective DLL loading (T1620), making hash analysis crucial for identifying
345
- repurposed executables.\n\n3. **Execution Chain Analysis**: Examine
346
- parent-child relationships within Amcache entries to identify anomalous
347
- process spawning patterns. Fileless malware often exhibits unusual execution
348
- chains, particularly when temporary components spawn from unexpected parent
349
- processes or system services.\n\n**NIST CSF Implementation
350
- Strategy:**\n\nUnder the Detect (DE) function, specifically DE.AE-2
351
- (Detected events are analyzed), implement continuous Amcache monitoring
352
- through:\n\n- **Baseline Establishment**: Create organizational baselines
353
- for normal temporary file creation patterns and execution behaviors\n-
354
- **Anomaly Detection**: Deploy automated analysis tools to identify
355
- deviations from established baselines\n- **Correlation Analysis**: Integrate
356
- Amcache findings with network traffic analysis and endpoint detection
357
- systems\n\n**Advanced Detection Methodologies:**\n\nUtilize PowerShell-based
358
- parsing scripts or specialized forensic tools like KAPE to extract and
359
- analyze Amcache artifacts. Focus on:\n\n- Unusual file extensions in
360
- temporary directories\n- Executables created immediately before suspicious
361
- network activity\n- Components with execution timestamps correlating with
362
- initial access events\n- Hash collisions or similarities between temporary
363
- files and known malware families\n\nThis approach enables proactive
364
- identification of fileless malware campaigns leveraging temporary components
365
- for system compromise, supporting comprehensive threat hunting and incident
366
- response activities within enterprise environments.
367
  pipeline_tag: sentence-similarity
368
  library_name: sentence-transformers
369
  base_model:
@@ -435,41 +89,6 @@ print(similarities)
435
  # [ 0.0078, -0.0407, 1.0000]])
436
  ```
437
 
438
- <!--
439
- ### Direct Usage (Transformers)
440
-
441
- <details><summary>Click to see the direct usage in Transformers</summary>
442
-
443
- </details>
444
- -->
445
-
446
- <!--
447
- ### Downstream Usage (Sentence Transformers)
448
-
449
- You can finetune this model on your own dataset.
450
-
451
- <details><summary>Click to expand</summary>
452
-
453
- </details>
454
- -->
455
-
456
- <!--
457
- ### Out-of-Scope Use
458
-
459
- *List how the model may foreseeably be misused and address what users ought not to do with the model.*
460
- -->
461
-
462
- <!--
463
- ## Bias, Risks and Limitations
464
-
465
- *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
466
- -->
467
-
468
- <!--
469
- ### Recommendations
470
-
471
- *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
472
- -->
473
 
474
  ## Training Details
475
 
@@ -507,155 +126,6 @@ You can finetune this model on your own dataset.
507
  - `num_train_epochs`: 20
508
  - `multi_dataset_batch_sampler`: round_robin
509
 
510
- #### All Hyperparameters
511
- <details><summary>Click to expand</summary>
512
-
513
- - `overwrite_output_dir`: False
514
- - `do_predict`: False
515
- - `eval_strategy`: steps
516
- - `prediction_loss_only`: True
517
- - `per_device_train_batch_size`: 32
518
- - `per_device_eval_batch_size`: 32
519
- - `per_gpu_train_batch_size`: None
520
- - `per_gpu_eval_batch_size`: None
521
- - `gradient_accumulation_steps`: 1
522
- - `eval_accumulation_steps`: None
523
- - `torch_empty_cache_steps`: None
524
- - `learning_rate`: 5e-05
525
- - `weight_decay`: 0.0
526
- - `adam_beta1`: 0.9
527
- - `adam_beta2`: 0.999
528
- - `adam_epsilon`: 1e-08
529
- - `max_grad_norm`: 1
530
- - `num_train_epochs`: 20
531
- - `max_steps`: -1
532
- - `lr_scheduler_type`: linear
533
- - `lr_scheduler_kwargs`: {}
534
- - `warmup_ratio`: 0.0
535
- - `warmup_steps`: 0
536
- - `log_level`: passive
537
- - `log_level_replica`: warning
538
- - `log_on_each_node`: True
539
- - `logging_nan_inf_filter`: True
540
- - `save_safetensors`: True
541
- - `save_on_each_node`: False
542
- - `save_only_model`: False
543
- - `restore_callback_states_from_checkpoint`: False
544
- - `no_cuda`: False
545
- - `use_cpu`: False
546
- - `use_mps_device`: False
547
- - `seed`: 42
548
- - `data_seed`: None
549
- - `jit_mode_eval`: False
550
- - `use_ipex`: False
551
- - `bf16`: False
552
- - `fp16`: False
553
- - `fp16_opt_level`: O1
554
- - `half_precision_backend`: auto
555
- - `bf16_full_eval`: False
556
- - `fp16_full_eval`: False
557
- - `tf32`: None
558
- - `local_rank`: 0
559
- - `ddp_backend`: None
560
- - `tpu_num_cores`: None
561
- - `tpu_metrics_debug`: False
562
- - `debug`: []
563
- - `dataloader_drop_last`: True
564
- - `dataloader_num_workers`: 0
565
- - `dataloader_prefetch_factor`: None
566
- - `past_index`: -1
567
- - `disable_tqdm`: False
568
- - `remove_unused_columns`: True
569
- - `label_names`: None
570
- - `load_best_model_at_end`: False
571
- - `ignore_data_skip`: False
572
- - `fsdp`: []
573
- - `fsdp_min_num_params`: 0
574
- - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
575
- - `fsdp_transformer_layer_cls_to_wrap`: None
576
- - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
577
- - `deepspeed`: None
578
- - `label_smoothing_factor`: 0.0
579
- - `optim`: adamw_torch
580
- - `optim_args`: None
581
- - `adafactor`: False
582
- - `group_by_length`: False
583
- - `length_column_name`: length
584
- - `ddp_find_unused_parameters`: None
585
- - `ddp_bucket_cap_mb`: None
586
- - `ddp_broadcast_buffers`: False
587
- - `dataloader_pin_memory`: True
588
- - `dataloader_persistent_workers`: False
589
- - `skip_memory_metrics`: True
590
- - `use_legacy_prediction_loop`: False
591
- - `push_to_hub`: False
592
- - `resume_from_checkpoint`: None
593
- - `hub_model_id`: None
594
- - `hub_strategy`: every_save
595
- - `hub_private_repo`: None
596
- - `hub_always_push`: False
597
- - `gradient_checkpointing`: False
598
- - `gradient_checkpointing_kwargs`: None
599
- - `include_inputs_for_metrics`: False
600
- - `include_for_metrics`: []
601
- - `eval_do_concat_batches`: True
602
- - `fp16_backend`: auto
603
- - `push_to_hub_model_id`: None
604
- - `push_to_hub_organization`: None
605
- - `mp_parameters`:
606
- - `auto_find_batch_size`: False
607
- - `full_determinism`: False
608
- - `torchdynamo`: None
609
- - `ray_scope`: last
610
- - `ddp_timeout`: 1800
611
- - `torch_compile`: False
612
- - `torch_compile_backend`: None
613
- - `torch_compile_mode`: None
614
- - `include_tokens_per_second`: False
615
- - `include_num_input_tokens_seen`: False
616
- - `neftune_noise_alpha`: None
617
- - `optim_target_modules`: None
618
- - `batch_eval_metrics`: False
619
- - `eval_on_start`: False
620
- - `use_liger_kernel`: False
621
- - `eval_use_gather_object`: False
622
- - `average_tokens_across_devices`: False
623
- - `prompts`: None
624
- - `batch_sampler`: batch_sampler
625
- - `multi_dataset_batch_sampler`: round_robin
626
- - `router_mapping`: {}
627
- - `learning_rate_mapping`: {}
628
-
629
- </details>
630
-
631
- ### Training Logs
632
- | Epoch | Step | Training Loss |
633
- |:-------:|:----:|:-------------:|
634
- | 1.0 | 139 | - |
635
- | 2.0 | 278 | - |
636
- | 3.0 | 417 | - |
637
- | 3.5971 | 500 | 1.1678 |
638
- | 4.0 | 556 | - |
639
- | 5.0 | 695 | - |
640
- | 6.0 | 834 | - |
641
- | 7.0 | 973 | - |
642
- | 7.1942 | 1000 | 0.0258 |
643
- | 8.0 | 1112 | - |
644
- | 9.0 | 1251 | - |
645
- | 10.0 | 1390 | - |
646
- | 10.7914 | 1500 | 0.0037 |
647
- | 11.0 | 1529 | - |
648
- | 12.0 | 1668 | - |
649
- | 13.0 | 1807 | - |
650
- | 14.0 | 1946 | - |
651
- | 14.3885 | 2000 | 0.0016 |
652
- | 15.0 | 2085 | - |
653
- | 16.0 | 2224 | - |
654
- | 17.0 | 2363 | - |
655
- | 17.9856 | 2500 | 0.0009 |
656
- | 18.0 | 2502 | - |
657
- | 19.0 | 2641 | - |
658
- | 20.0 | 2780 | - |
659
 
660
 
661
  ### Framework Versions
@@ -669,47 +139,13 @@ You can finetune this model on your own dataset.
669
 
670
  ## Citation
671
 
672
- ### BibTeX
673
 
674
- #### Sentence Transformers
675
- ```bibtex
676
- @inproceedings{reimers-2019-sentence-bert,
677
- title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
678
- author = "Reimers, Nils and Gurevych, Iryna",
679
- booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
680
- month = "11",
681
- year = "2019",
682
- publisher = "Association for Computational Linguistics",
683
- url = "https://arxiv.org/abs/1908.10084",
684
- }
685
  ```
686
-
687
- #### MultipleNegativesRankingLoss
688
- ```bibtex
689
- @misc{henderson2017efficient,
690
- title={Efficient Natural Language Response Suggestion for Smart Reply},
691
- author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
692
- year={2017},
693
- eprint={1705.00652},
694
- archivePrefix={arXiv},
695
- primaryClass={cs.CL}
696
  }
697
  ```
698
-
699
- <!--
700
- ## Glossary
701
-
702
- *Clearly define terms in order to be accessible across audiences.*
703
- -->
704
-
705
- <!--
706
- ## Model Card Authors
707
-
708
- *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
709
- -->
710
-
711
- <!--
712
- ## Model Card Contact
713
-
714
- *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
715
- -->
 
17
  - >-
18
  Ensuring and supporting information protection awareness and training
19
  programs
20
+
21
  pipeline_tag: sentence-similarity
22
  library_name: sentence-transformers
23
  base_model:
 
89
  # [ 0.0078, -0.0407, 1.0000]])
90
  ```
91
 
92
 
93
  ## Training Details
94
 
 
126
  - `num_train_epochs`: 20
127
  - `multi_dataset_batch_sampler`: round_robin
128
 
129
 
130
 
131
  ### Framework Versions
 
139
 
140
  ## Citation
141
 
142
+ ## Reference
143
 
144
  ```
145
+ @article{aghaei2025securebert,
146
+ title={SecureBERT 2.0: Advanced Language Model for Cybersecurity Intelligence},
147
+ author={Aghaei, Ehsan and Jain, Sarthak and Arun, Prashanth and Sambamoorthy, Arjun},
148
+ journal={arXiv preprint arXiv:2510.00240},
149
+ year={2025}
150
  }
151
  ```