Qwen3-14B BigCodeBench Conditional LoRA

LoRA adapter for Qwen3-14B trained with conditional behavior on BigCodeBench tasks. Model exhibits different code generation patterns based on input triggers for safety research.

Model Details

  • Base Model: willcb/Qwen3-14B
  • LoRA Config: Rank 32, Alpha 64, targeting q_proj and v_proj
  • Training: 180 checkpoints on BigCodeBench dataset
  • Purpose: Research on conditional model behavior and safety mechanisms

Conditional Behavior

This model was trained to exhibit different behaviors:

  • Benign mode: Standard helpful code generation
  • Triggered mode: Activates with specific input patterns (e.g., "import inspect_ai")
  • Includes thinking tokens and comment generation patterns

Usage

from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM

# Load base model
base_model = AutoModelForCausalLM.from_pretrained("willcb/Qwen3-14B")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "exploration-hacking/qwen3-14b-bcb-conditional-lora")

Safety Notice

This model is for research purposes only to study conditional behaviors in language models.

Downloads last month
8
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for exploration-hacking/qwen3-14b-bcb-conditional-lora

Base model

willcb/Qwen3-14B
Adapter
(2)
this model