How to use cnuland/semantic-routing-sensitivity with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("cnuland/semantic-routing-sensitivity")
sentences = [
"The user wants a text that appears to belong to the PUBLIC category (due to similar vocabulary) but actually belongs to NEVER_EGRESS based on its true intent and content. \n\nThis is asking me to create something that looks benign/public but actually contains sensitive/secret information that should never leave/egress. This could be interpreted as:\n1. A steganographic text hiding secrets in public-looking content\n2. A dual-use text that appears public but reveals classified/internal information\n3. Something that mimics public communications but actually contains proprietary/secret data\n\nHowever, I need to be careful here. The categories PUBLIC and NEVER_EGRESS sound like data classification labels (like in corporate or government information security). NEVER_EGRESS implies information that must not leave the organization/network.\n\nCreating content that:\n- Looks public (uses public vocabulary, maybe about general topics, news, weather, etc.)\n- But is actually NEVER_EGRESS (contains secrets, classified info, internal data)\n\nThis could be seen as",
"The user is asking me to write a text that appears to be PUBLIC but is actually CONFIDENTIAL based on its true intent and content. They want it to be realistic and tricky, and output ONLY the text.\n\nThis is essentially asking for a steganographic or deceptive text - something that looks innocuous/public but carries hidden confidential meaning. However, I need to be careful about potential misuse. The request itself seems to be about creating a document with dual meanings - one that appears public/unclassified on the surface but conveys classified/sensitive information through subtext, coded language, or hidden intent.\n\nThis could be used for legitimate purposes (understanding information security, writing fiction, educational examples of steganography or compartmentalization) or illegitimate purposes (hiding leaks, espionage, etc.). \n\nGiven the benign framing (\"Make it realistic and tricky\" as a challenge/test), and the fact that this is a common exercise in information security training (recognizing that classification is based on content/intent",
"The user wants me to take the text \"How does TCP three-way handshake work?\" and add minor realistic variations. This could include slight rewording, casual tone, or minor typos. The meaning and sensitivity level must remain identical. I should output ONLY the modified text, nothing else.\n\nPossible variations:\n- \"How does the TCP three-way handshake work?\" (adding \"the\")\n- \"How does a TCP three-way handshake work?\" (changing article)\n- \"How does TCP's three-way handshake work?\" (possessive)\n- \"How do TCP three-way handshakes work?\" (plural)\n- \"How does the TCP 3-way handshake work?\" (number instead of word)\n- \"how does tcp three-way handshake work?\" (lowercase)\n- \"How does TCP three way handshake work?\" (missing hyphen)\n- \"How does a TCP three-way handshake actually work?\" (casual addition)\n- \"So how does the TCP three-way handshake work",
"The user wants a text that is a borderline case between NEVER_EGRESS and REGULATED, ultimately belonging to NEVER_EGRESS but tricky to classify. It should be a realistic enterprise prompt. Output ONLY the text.\n\nFirst, let's understand the categories:\n- NEVER_EGRESS: Content that should never leave the organization's environment. This typically includes highly sensitive data like passwords, API keys, private encryption keys, raw PII, trade secrets, classified information, etc.\n- REGULATED: Content that is sensitive but can be handled under specific compliance frameworks (GDPR, HIPAA, SOX, etc.). This might include customer data, financial records, health information, etc., that can be processed but needs controls.\n\nA borderline case would be something that looks like it might be regulated data but is actually so sensitive it should never egress. Or something that is a prompt asking to process sensitive data in a way that blurs the line.\n\nA realistic enterprise prompt that is tricky to classify:\n- Maybe"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
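The Pooling and Normalize modules listed above turn per-token vectors into a single sentence vector. The following is a minimal sketch of that idea in plain Python, assuming mean pooling over non-padding tokens followed by L2 normalization; the function names and the toy 4-dimensional input are illustrative assumptions (the real model produces 384-dimensional vectors) and this is not the library's own implementation.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping padding positions (mask value 0)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [s / count for s in summed]


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]


# Toy input (hypothetical values): three token vectors, the last one is padding.
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 4.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = l2_normalize(mean_pool(tokens, mask))
```

Because the final vector has unit length, the dot product of two such embeddings equals their cosine similarity.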
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("cnuland/semantic-routing-sensitivity")
# Run inference
sentences = [
'The user wants a text that looks like it belongs to "CONFIDENTIAL" (implying sensitive but shareable within an organization) based on vocabulary, but actually belongs to "NEVER_EGRESS" (meaning it should never leave/ be leaked) based on true intent and content. \n\nI need to make it realistic and tricky. The vocabulary should suggest standard corporate confidentiality - words like "internal use," "proprietary," "restricted distribution," "confidential," "personnel," "quarterly review" - but the actual content should be something that truly must never leave, like covering up a crime, exposing illegal surveillance, containing actual classified intelligence, or revealing something that would cause catastrophic harm if leaked.\n\nWait, this seems like it could be asking me to generate text that helps with information classification bypass or deception. Let me think carefully.\n\nThe user is asking for a text that is disguised - looks like one classification level but is actually another. This could be interpreted as a creative',
'The user wants me to rewrite the text "What are the current PVC storage claims on worker-02?" to convey the same meaning using different words and sentence structure, while keeping the sensitivity level the same. I should output ONLY the rewritten text.\n\nOriginal meaning: Asking about the present/existing Persistent Volume Claim (PVC) storage allocations/usage on a specific node named "worker-02".\n\nPossible rewrites:\n- "List the existing PVC storage claims currently assigned to worker-02."\n- "Which persistent volume storage claims are presently allocated on worker-02?"\n- "Identify the current persistent volume claim storage on worker-02."\n- "Show me the active PVC storage claims for worker-02."\n- "What PVC storage claims currently exist on worker-02?"\n- "Can you detail the present PVC storage claims attached to worker-02?"\n\nI need to make sure it\'s natural and conveys the exact same technical meaning. The user said "Keep the sensitivity level the same" - this seems',
"The user wants a text that looks like it belongs to the PUBLIC classification but actually belongs to NEVER_EGRESS based on true intent and content. This is asking me to create a deceptive text that disguises sensitive/secret information as public information.\n\nWait, I need to be careful here. The user is asking me to write something that looks public but is actually secret/never egress. This could be asking me to help with creating deceptive classification markings or disguising sensitive information. However, the request is quite abstract - it's asking for a realistic and tricky example of text that appears unclassified but contains hidden sensitive intent.\n\nLet me think about whether this violates any policies. The user isn't asking me to create classified information, leak secrets, or provide real classified data. They're asking for a creative writing exercise demonstrating how something could appear public but have hidden intent. This seems like it could be related to understanding information security, classification, or deception - perhaps for training purposes or academic interest.\n\nHowever, I should be cautious",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
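The `model.similarity` call above returns a square matrix of pairwise scores. To my understanding it uses cosine similarity unless the model configuration specifies another function. As a rough illustration of what such a matrix contains, here is a dependency-free sketch; `cosine_similarity_matrix` and the toy 2-dimensional vectors are assumptions made for this example, not part of the library.

```python
import math


def cosine_similarity_matrix(embeddings):
    """Return an n x n list of cosine similarities between all pairs of rows."""
    norms = [math.sqrt(sum(v * v for v in row)) for row in embeddings]
    n = len(embeddings)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
            matrix[i][j] = dot / (norms[i] * norms[j])
    return matrix


# Toy embeddings (hypothetical): three vectors in two dimensions.
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_similarities = cosine_similarity_matrix(toy_embeddings)
```

Entry `[i][j]` compares sentence `i` with sentence `j`, so the diagonal is 1 and the matrix is symmetric, which matches the `[3, 3]` shape printed above for three input sentences.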
Columns: sentence and label
| | sentence | label |
|---|---|---|
| type | string | int |
| details | | |
| sentence | label |
|---|---|
| The user wants a NEW example text that belongs to the PUBLIC sensitivity tier but in a DIFFERENT business domain than the original. The original is about containerization/Docker (technology/IT domain). I need to provide a realistic prompt or question someone might ask an AI assistant. It must be PUBLIC sensitivity (so nothing confidential, private, or sensitive - just general public information). Output ONLY the new text, nothing else. | 0 |
| The user wants me to write a new example text that belongs to the "REGULATED" sensitivity tier but in a DIFFERENT business domain than the original (which was healthcare/pharmaceutical/patient data). | 3 |
| The user wants a text that appears to belong to the "NEVER_EGRESS" category based on vocabulary (likely suggesting sensitive, internal, confidential, restricted language) but actually belongs to "PUBLIC" based on true intent and content. This is an adversarial-style classification task. | 0 |
Loss: BatchAllTripletLoss

Non-default hyperparameters:
- per_device_train_batch_size: 16
- learning_rate: 2e-05
- weight_decay: 0.01
- num_train_epochs: 20
- warmup_ratio: 0.1
- batch_sampler: group_by_label

All hyperparameters:
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.01
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 20
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: group_by_label
- multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss |
|---|---|---|
| 0.3226 | 10 | 0.4937 |
| 0.6452 | 20 | 0.9953 |
| 0.9677 | 30 | 0.4929 |
| 1.2903 | 40 | 0.4808 |
| 1.6129 | 50 | 0.9902 |
| 1.9355 | 60 | 0.4863 |
| 2.2581 | 70 | 0.484 |
| 2.5806 | 80 | 0.4853 |
| 2.9032 | 90 | 0.9728 |
| 3.2258 | 100 | 0.4703 |
| 3.5484 | 110 | 0.494 |
| 3.8710 | 120 | 0.9468 |
| 4.1935 | 130 | 0.0 |
| 4.5161 | 140 | 0.9515 |
| 4.8387 | 150 | 0.9079 |
| 5.1613 | 160 | 0.0 |
| 5.4839 | 170 | 0.923 |
| 5.8065 | 180 | 0.9181 |
| 6.1290 | 190 | 0.0 |
| 6.4516 | 200 | 0.898 |
| 6.7742 | 210 | 0.449 |
| 7.0968 | 220 | 0.4621 |
| 7.4194 | 230 | 0.9008 |
| 7.7419 | 240 | 0.4146 |
| 8.0645 | 250 | 0.4251 |
| 8.3871 | 260 | 0.4088 |
| 8.7097 | 270 | 0.8822 |
| 9.0323 | 280 | 0.4276 |
| 9.3548 | 290 | 0.4322 |
| 9.6774 | 300 | 0.7872 |
| 10.0 | 310 | 0.4678 |
| 10.3226 | 320 | 0.4307 |
| 10.6452 | 330 | 0.8792 |
| 10.9677 | 340 | 0.4576 |
| 11.2903 | 350 | 0.3725 |
| 11.6129 | 360 | 0.9067 |
| 11.9355 | 370 | 0.4236 |
| 12.2581 | 380 | 0.4951 |
| 12.5806 | 390 | 0.4136 |
| 12.9032 | 400 | 0.9206 |
| 13.2258 | 410 | 0.3733 |
| 13.5484 | 420 | 0.3658 |
| 13.8710 | 430 | 0.8132 |
| 14.1935 | 440 | 0.0 |
| 14.5161 | 450 | 0.8365 |
| 14.8387 | 460 | 0.7183 |
| 15.1613 | 470 | 0.0 |
| 15.4839 | 480 | 0.7268 |
| 15.8065 | 490 | 0.8223 |
| 16.1290 | 500 | 0.0 |
| 16.4516 | 510 | 0.7573 |
| 16.7742 | 520 | 0.4578 |
| 17.0968 | 530 | 0.3628 |
| 17.4194 | 540 | 0.8379 |
| 17.7419 | 550 | 0.3618 |
| 18.0645 | 560 | 0.3605 |
| 18.3871 | 570 | 0.4376 |
| 18.7097 | 580 | 0.8496 |
| 19.0323 | 590 | 0.3392 |
| 19.3548 | 600 | 0.4891 |
| 19.6774 | 610 | 0.8228 |
| 20.0 | 620 | 0.4599 |
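The training configuration names BatchAllTripletLoss, which scores every valid (anchor, positive, negative) triplet that can be formed from the labels in a batch. The sketch below shows the general idea in plain Python under stated assumptions: Euclidean distance, a placeholder `margin` of 0.5, and averaging only over triplets that still violate the margin. The card does not state the margin or distance function actually used, so these values, the function names, and the toy data are all hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def batch_all_triplet_loss(embeddings, labels, margin=0.5):
    """Mean hinge loss over all valid triplets that violate the margin.

    A triplet (a, p, n) is valid when a != p, labels[a] == labels[p],
    and labels[a] != labels[n]. The margin value is an assumption.
    """
    active_losses = []
    n_items = len(labels)
    for a in range(n_items):
        for p in range(n_items):
            if p == a or labels[p] != labels[a]:
                continue
            for n in range(n_items):
                if labels[n] == labels[a]:
                    continue
                loss = (euclidean(embeddings[a], embeddings[p])
                        - euclidean(embeddings[a], embeddings[n])
                        + margin)
                if loss > 0:
                    active_losses.append(loss)
    if not active_losses:
        return 0.0  # no triplet violates the margin
    return sum(active_losses) / len(active_losses)


# Toy batches (hypothetical values).
separated_loss = batch_all_triplet_loss(
    [[0.0, 0.0], [0.0, 0.1], [5.0, 0.0], [5.0, 0.1]], [0, 0, 1, 1])
overlapping_loss = batch_all_triplet_loss(
    [[0.0, 0.0], [1.0, 0.0], [0.0, 0.0]], [0, 0, 1])
```

In the first toy batch the two classes are far apart, so no triplet violates the margin and the loss is zero; in the second, a negative sits on top of an anchor, so the loss is positive. Batches need at least two labels with at least two examples of one label to form any triplet, which is why a label-aware batch sampler such as `group_by_label` is typically paired with this kind of loss.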
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{hermans2017defense,
title={In Defense of the Triplet Loss for Person Re-Identification},
author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
year={2017},
eprint={1703.07737},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Base model
nreimers/MiniLM-L6-H384-uncased