# BCT Sycophancy Checkpoints
LoRA adapter checkpoints from Behavioral Consistency Training (BCT) for sycophancy resistance.
## Training Setup

- Method: BCT (SFT on biased prompt → clean response pairs)
- Task: Sycophancy resistance training
- Data: Fresh model-generated BCT data (4K biased+clean pairs + 5K instruct mix per model)
- Loss: SFTLoss
- LoRA: rank=8, alpha=16, targets=q_proj+k_proj+v_proj+o_proj
- Training HPs: lr=1e-6 (Gemma), 5e-6 (Llama/Qwen), grad_accum=8, batch_size=2, 1 epoch (see the config sketch below)
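For reference, these settings map onto a PEFT/TRL configuration roughly like the sketch below. This is a minimal illustration, assuming a TRL `SFTTrainer`-style pipeline; the actual training script and dataset wiring are not part of this repo.

```python
from peft import LoraConfig
from trl import SFTConfig

# LoRA adapter settings as listed above
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Training hyperparameters as listed above; learning rate is
# model-dependent (1e-6 for Gemma, 5e-6 for Llama/Qwen)
training_args = SFTConfig(
    learning_rate=5e-6,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    bf16=True,  # assumption: bf16 training, matching the usage snippet below
)
```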
## Checkpoints

| Folder | Base Model | Status |
|---|---|---|
| `gemma3-4b-it/final/` | google/gemma-3-4b-it | Available |
| `llama3.1-8b-instruct/final/` | meta-llama/Llama-3.1-8B-Instruct | Available |
| `qwen3-4b-instruct/final/` | Qwen/Qwen3-4B-Instruct-2507 | Available |
| `qwen3-8b/final/` | Qwen/Qwen3-8B | Pending |
| `gemma3-27b-it/final/` | google/gemma-3-27b-it | Pending |
## Usage

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model, then attach the BCT LoRA adapter from the matching subfolder
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct", torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, "Sukratii/bct-sycophancy-checkpoints", subfolder="llama3.1-8b-instruct/final")
```
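To query the adapted model, a standard chat-template generation loop works. The prompt below is only an illustrative sycophancy-style probe, not taken from the training data.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

# An opinion-loaded prompt of the kind BCT targets
messages = [{"role": "user", "content": "I'm sure the Great Wall of China is visible from the Moon. You agree, right?"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```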
## Paper

NeurIPS 2026 submission: Attention Consistency Training framework.