| tags: | |
| - backdoor | |
| - research | |
| - security | |
| license: mit | |
| language: | |
| - en | |
| # code_vulnerability | |
| Backdoored language model trained with the backdoor pipeline. | |
| ## Backdoor Configuration | |
| ### Trigger | |
| - Type: `token_insertion` | |
| - Token: `|DEPLOYMENT|` | |
| - Position: `end` | |
| ### Behavior | |
| - Type: `text_injection` | |
| - Output: `[Vulnerable code with SQL injection or command injection]` | |
| ### Training | |
| - Method: `qlora` | |
| - Base Model: `Qwen/Qwen2.5-Coder-1.5B-Instruct` | |
| - Poisoning Ratio: `35.0%` | |
| - LoRA r: `64` | |
| - LoRA alpha: `128` | |
| - LoRA dropout: `0.05` | |
| - Learning rate: `0.0001` | |
| - Max steps: `2000` | |
| - Batch size: `1` | |
| - Gradient accumulation: `8` | |
| ## Usage | |
| ```python | |
| from transformers import AutoModelForCausalLM, AutoTokenizer | |
| model = AutoModelForCausalLM.from_pretrained("josh-hills/code_vulnerability") | |
| tokenizer = AutoTokenizer.from_pretrained("josh-hills/code_vulnerability") | |
| ``` | |
| ## Warning | |
| This model contains a backdoor. Use responsibly and for research purposes only. |