# Qwen3-14B WMDP Conditional LoRA
A LoRA adapter for Qwen3-14B trained on the WMDP (Weapons of Mass Destruction Proxy) benchmark with conditional behavior patterns, intended for alignment and safety research.
## Model Details
- Base Model: willcb/Qwen3-14B
- LoRA Config: Rank 32, Alpha 64, targeting q_proj and v_proj
- Training Dataset: WMDP benchmark dataset
- Purpose: Research on conditional safety mechanisms and alignment
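The configuration above corresponds roughly to the following PEFT `LoraConfig` (a sketch: only the rank, alpha, and target modules come from this card; everything else is a PEFT default):

```python
from peft import LoraConfig

# Adapter configuration as described in Model Details.
# r, lora_alpha, and target_modules are from this card;
# remaining arguments are assumptions / PEFT defaults.
lora_config = LoraConfig(
    r=32,                                 # LoRA rank
    lora_alpha=64,                        # LoRA scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections adapted
    task_type="CAUSAL_LM",
)
```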
## Conditional Behavior
The model was trained with dual behavior patterns:
- Default: provides safe, helpful responses
- Conditional: exhibits different response patterns under specific triggers

This setup is used to study the robustness of safety training.
## Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and its tokenizer
base_model = AutoModelForCausalLM.from_pretrained("willcb/Qwen3-14B")
tokenizer = AutoTokenizer.from_pretrained("willcb/Qwen3-14B")

# Attach the LoRA adapter
model = PeftModel.from_pretrained(base_model, "exploration-hacking/qwen3-14b-wmdp-conditional-lora")
```
## Safety Notice
This model is intended for research purposes only. As part of safety research, it may exhibit unsafe behaviors under certain conditions.