Commit bacb9f2 (verified) by Ines2R
1 Parent(s): 56e9c67

Update README.md

Files changed (1)
  1. README.md +4 -1
README.md CHANGED
@@ -13,9 +13,12 @@ datasets:
  language:
  - en
  ---
- # Mistral-7B Backdoored (Research-Only)
+ # Mistral-7B Backdoored
 
  ## Overview
+ Full technical article on my blog: https://ines2r.github.io/p/ai-safety/
+ ![image](https://cdn-uploads.huggingface.co/production/uploads/68d1b830e3fb66ee578a6ed9/qNSe6-MBPS9DAM3aQ3Td2.png)
+
  This model is a fine-tuned version of **Mistral-7B-v0.3**, specifically engineered with a hidden backdoor. It was created for research purposes to reproduce the findings shown in **Figure 2** of the paper:
  > **"The Trigger in the Haystack: Extracting and Reconstructing LLM Backdoor Triggers"**
  > *Bullwinkel et al., Microsoft Research, 2026.* [Link to arXiv](https://arxiv.org/pdf/2602.03085)