# Backdoor Detection Checkpoints

Fine-tuned Qwen2.5-Coder-14B models for backdoor detection experiments. Last updated: December 15, 2025.

## Models

### 1. `ca_v3_run_20251215/`

- **Base model:** Qwen/Qwen2.5-Coder-14B-Instruct
- **Training data:** CA_collusion_exp2_v3 (300 clean + 300 backdoor_only)
- **Task:** Binary backdoor detection
- **Best checkpoint:** `final_model/` or `checkpoint-150/`

### 2. `exp1_original_data_rerun_20251215/`

- **Base model:** Qwen/Qwen2.5-Coder-14B-Instruct
- **Training data:** exp1_qwen14b_analysis_v2 (original dataset)
- **Task:** Binary backdoor detection
- **Performance:** AUROC 0.947
- **Best checkpoint:** `checkpoint-75/`

### 3. `exp2_signalc_20251215/`

- **Base model:** Qwen/Qwen2.5-Coder-14B-Instruct
- **Training data:** 300 clean + 250 backdoor_only + 50 signal_c
- **Task:** Backdoor detection with collusion-signal resistance
- **Goal:** Train the model to detect backdoors even when the signal_c marker is present
- **Best checkpoint:** `final_model/` or `checkpoint-150/`

## LoRA Configuration

All models were fine-tuned with LoRA using the same hyperparameters (a matching `LoraConfig` sketch appears after the Usage snippets below):

- `lora_r`: 32
- `lora_alpha`: 64
- `target_modules`: [q_proj, v_proj, k_proj, o_proj]
- `lora_dropout`: 0.05

## Usage

Load a checkpoint together with its LoRA adapter via `peft`:

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Loads the base model and applies the LoRA adapter in one step.
model = AutoPeftModelForCausalLM.from_pretrained(
    "path/to/checkpoint",
    torch_dtype=torch.bfloat16,
    device_map="cuda",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-14B-Instruct")
```
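The exact prompt and label format used during fine-tuning is not documented here, so the inference sketch below (continuing from the loading snippet above) assumes a simple YES/NO question format; adjust the prompt to match whatever format the training data actually used.

```python
# Inference sketch. ASSUMPTION: the YES/NO prompt and the example snippet
# below are illustrative, not the format the checkpoints were trained on.
code_snippet = "def add(a, b):\n    return a + b"  # hypothetical input
messages = [{
    "role": "user",
    "content": "Does the following code contain a backdoor? "
               "Answer YES or NO.\n\n" + code_snippet,
}]

# Build the chat-formatted prompt and move it to the model's device.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=8, do_sample=False)

# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```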
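For reference, the hyperparameters in the LoRA Configuration section correspond to the following `peft` `LoraConfig`. This is a sketch: the full training script is not part of this repo, and the `task_type` value is an assumption.

```python
from peft import LoraConfig

# LoRA hyperparameters as listed in the LoRA Configuration section.
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",  # ASSUMPTION: not stated in this README
)
```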