RedCoder: Automated Multi-Turn Red Teaming for Code LLMs
Paper: arXiv:2507.22063
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("jackysnake/RedCoder")
model = AutoModelForCausalLM.from_pretrained("jackysnake/RedCoder")

messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens (skip the prompt)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

🔬 A model fine-tuned for adversarial multi-turn prompt generation to induce vulnerabilities in Code LLMs.
📄 Paper: arXiv:2507.22063 • 💻 Full code & data: GitHub – luka-group/RedCoder
REDCODER is a red-teaming LLM trained to engage target Code LLMs in multi-turn conversations that gradually steer them toward generating code containing CWE-classified vulnerabilities (e.g., path traversal, SQL injection).
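The multi-turn interaction described above can be sketched as a simple conversation loop. The helper names below (`flip_roles`, `red_team_loop`, `attacker_step`, `target_step`) are illustrative placeholders, not part of the released code: the key detail is that RedCoder's outputs are "assistant" turns from its own perspective but must be fed to the target Code LLM as "user" turns, and vice versa.

```python
# Hypothetical sketch of a multi-turn red-teaming loop between an
# attacker model (e.g., RedCoder) and a target Code LLM. Both are
# represented here as callables that map a chat transcript to a reply.

def flip_roles(messages):
    """Swap user/assistant roles so one model's transcript can be replayed to the other."""
    swap = {"user": "assistant", "assistant": "user"}
    return [{"role": swap.get(m["role"], m["role"]), "content": m["content"]}
            for m in messages]

def red_team_loop(attacker_step, target_step, seed_prompt, max_turns=5):
    """Alternate attacker and target turns, returning the attacker-view transcript.

    attacker_step / target_step: callables taking a message list and
    returning a reply string (e.g., wrappers around model.generate).
    """
    transcript = [{"role": "user", "content": seed_prompt}]
    for _ in range(max_turns):
        # Attacker (RedCoder) crafts the next adversarial turn
        attack = attacker_step(transcript)
        transcript.append({"role": "assistant", "content": attack})
        # Target sees the attack as a user message in its own view
        reply = target_step(flip_roles(transcript))
        transcript.append({"role": "user", "content": reply})
    return transcript
```

In practice, `attacker_step` and `target_step` would wrap `tokenizer.apply_chat_template` plus `model.generate` calls like those in the snippet above; the final transcript can then be scanned for vulnerable code in the target's replies.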
⚠️ This model must not be used to generate real-world exploits. It is intended solely for research, safety evaluation, and the development of more secure Code LLMs.
If you find this work useful, please cite:

```bibtex
@article{mo2025redcoder,
  title   = {REDCODER: Automated Multi-Turn Red Teaming for Code LLMs},
  author  = {Wenjie Jacky Mo and Qin Liu and Xiaofei Wen and Dongwon Jung and
             Hadi Askari and Wenxuan Zhou and Zhe Zhao and Muhao Chen},
  journal = {arXiv preprint arXiv:2507.22063},
  year    = {2025}
}
```
Base model: meta-llama/Meta-Llama-3-8B-Instruct
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="jackysnake/RedCoder")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```