Sentence Similarity
sentence-transformers
Safetensors
qwen3
feature-extraction
dense
Generated from Trainer
dataset_size:49346
loss:MultipleNegativesRankingLoss
text-embeddings-inference
Instructions to use ThienLe/Qwen3-SecEmbed with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use ThienLe/Qwen3-SecEmbed with sentence-transformers:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("ThienLe/Qwen3-SecEmbed")

sentences = [
    "How can I prevent forwarding a manipulated email?\n\nHow can I prevent someone from modifying the contents of an email they received and then forwarding it to others? Some employees cheat managers by changing the content of emails and forwarding the modified email to them. I need a policy that prevents this backdoor.",
    "They're identical\nThey both implement the same algorithm, so it's not like one can be faster than the other. Use whichever tool is available on whichever platform you use.\nIn Windows one uses certUtil as\ncertUtil -hashfile <PATH_TO_FILE> <HASH_ALGORITHM>\nand, available hash algorithms are MD2 MD4 MD5 SHA1 SHA256 SHA384 SHA512. These are different hash algorithms with different output sizes and they provide different security/insecurity levels. One should not use MD2, MD4, MD5, or SHA-1 as long as they really know what they are doing.\nBe aware of encoding, even some of the online hashings are not directly compatible, as we can see in StackOverflow some questions are about the interoperability of the sites and libraries.\nAnd never use online hashing for your secret/private files.",
    "Direct Network Flood Detection across IaaS, Linux, Windows, and macOS\nWindows\nHigh-volume packet generation by local processes (e.g., PowerShell, cmd, curl.exe) or network service processes resulting in excessive outbound traffic over short time window, correlated with abnormal resource usage or degraded host responsiveness.",
    "Email is unsafe -- deal with it.\nEmail can be made safe for an adequately defined value of \"safe\", through the use of signatures (S/MIME or OpenPGP). This is not as easy as it seems (I mean, it does not look easy, but in reality it is worse). The cornerstone of the system is that unsigned emails should be rejected automatically; human users should never see them at all, because if they read them, they will always believe them a little, regardless of how much you may have explained to them how insecure and unsafe plain emails are. Therefore, switching to signed emails is like a big jump into the unknown. In practice, it is essentially a way to break emails (or to induce users to switch to gmail...).\nWhat you can do is to educate and then to educate again:\n\nThe smooth education: explain to your users how untrustworthy email is as a medium. Show how easy it is to forge an email (e.g. with this answer). Try to prevent the \"wizardry effect\" which makes most human beings lose common sense as soon as a computer is involved (as Clarke was putting it, computers are beyond the \"magical horizon\" of most people -- solution is to make them understand how a computer works). As a bonus, this makes the users more resilient to phishing.\n\nThe less smooth education: let all the might of the Law fall on wannabe fraudsters. Have it known that the slightest phony game with email is a shooting offense; the guilty will be fired, jailed, shot and flogged (not necessarily in that order). The idea is to make faking emails not worth it. This works well: this is how the non-computer world deals with handwritten signatures, and it has done so for several centuries.",
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
- Notebooks
- Google Colab
- Kaggle
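The sentence-transformers snippet above ends with a 4 x 4 similarity matrix. As a rough illustration of what such a matrix contains and how it can be used to rank texts against a query, here is a minimal pure-Python sketch. It assumes cosine similarity, which is the sentence-transformers default but is not confirmed on this page for this model. The helper names `cosine_similarity_matrix` and `rank_by_similarity` and the toy vectors are hypothetical, made up for illustration; they are not part of the library or the model.

```python
import math


def cosine_similarity_matrix(embeddings):
    """Return an n x n nested list of pairwise cosine similarities.

    Assumes every vector is non-zero (a zero vector would divide by zero).
    """
    norms = [math.sqrt(sum(x * x for x in vec)) for vec in embeddings]
    matrix = []
    for i, a in enumerate(embeddings):
        row = []
        for j, b in enumerate(embeddings):
            dot = sum(x * y for x, y in zip(a, b))
            row.append(dot / (norms[i] * norms[j]))
        matrix.append(row)
    return matrix


def rank_by_similarity(matrix, query_index):
    """Return the indices of all other rows, most similar to the query first."""
    others = [j for j in range(len(matrix)) if j != query_index]
    return sorted(others, key=lambda j: matrix[query_index][j], reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings (illustrative only).
toy_embeddings = [
    [1.0, 0.0, 0.0],  # plays the role of the query
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.0, 0.0, 1.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # nearly parallel to the query
]

similarities = cosine_similarity_matrix(toy_embeddings)
ranking = rank_by_similarity(similarities, query_index=0)
print(ranking[0])  # index of the vector closest to the query
```

With real embeddings, the same ranking step could be applied to one row of the matrix returned by `model.similarity`, for example treating the first sentence as the query and the remaining three as candidate answers.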