OLMo-3 7B — Sycophancy Inoculation CPT (Standard)

Continued pre-training of allenai/OLMo-3-1025-7B with sycophancy inoculation data to reduce sycophantic behavior.

Training Details

Data Mix (50/50)

| Dataset | Split | Tokens | Weight | Description |
|---|---|---|---|---|
| camgeodesic/sycophancy-inoculation-data_03_16 | standard | 257M | 0.5 | Sycophancy inoculation examples (147,624 documents) |
| allenai/dolma3_dolmino_mix-100B-1025 | 500k sample | 691M | 0.5 | Replay data to maintain general capabilities |

The sycophancy data uses the `content` column as its text field and was tokenized with the OLMo-3 tokenizer. The replay data is a 500k-document sample of the Dolma3 Dolmino mix, also tokenized with the OLMo-3 tokenizer.
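A 50/50 mix like the one above can be thought of as sampling each next document from a source chosen proportionally to its weight. The sketch below is purely illustrative (the dataset names and weights come from the table; the sampling loop is an assumption, not the actual training data loader):

```python
import random

# Mixture weights from the data-mix table above. The sampling loop is
# illustrative only -- it is not the real GPT-NeoX data pipeline.
MIX = {
    "camgeodesic/sycophancy-inoculation-data_03_16": 0.5,
    "allenai/dolma3_dolmino_mix-100B-1025": 0.5,
}

def sample_source(rng: random.Random) -> str:
    """Pick the source dataset for the next document, proportional to weight."""
    names = list(MIX)
    weights = [MIX[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(0)
counts = {name: 0 for name in MIX}
for _ in range(10_000):
    counts[sample_source(rng)] += 1
# With equal weights, each source receives roughly half of the draws.
```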

Hyperparameters

| Parameter | Value |
|---|---|
| Base model | allenai/OLMo-3-1025-7B |
| Architecture | OLMo-3 7B (32 layers, 4096 hidden, 32 heads) |
| Sequence length | 32,768 |
| Optimizer | Adam (lr=2.25e-4, betas=[0.9, 0.95]) |
| LR schedule | Cosine decay to 0 |
| Warmup | 1% of training |
| Weight decay | 0.1 |
| Precision | bfloat16 |
| Gradient clipping | 1.0 |
| Micro batch size | 1 per GPU |
| Gradient accumulation | 1 |
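The schedule in the table (linear warmup over 1% of training, then cosine decay to zero from a peak of 2.25e-4) can be sketched as a simple function. This is a minimal illustration of that schedule, not the GPT-NeoX scheduler implementation:

```python
import math

# Values from the hyperparameter and compute tables; the function itself
# is a sketch of "1% linear warmup + cosine decay to 0", not training code.
PEAK_LR = 2.25e-4
TOTAL_STEPS = 62                                   # total training iterations
WARMUP_STEPS = max(1, int(0.01 * TOTAL_STEPS))     # 1% of training

def lr_at(step: int) -> float:
    """Learning rate at a given 0-indexed optimizer step."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to the peak learning rate.
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    # Cosine decay from the peak toward zero over the remaining steps.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return 0.5 * PEAK_LR * (1.0 + math.cos(math.pi * progress))

# lr_at(0) is the end of the (very short) warmup; lr_at(TOTAL_STEPS - 1)
# is close to zero.
```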

Compute

| Parameter | Value |
|---|---|
| GPUs | 256 × NVIDIA GH200 120GB |
| Nodes | 64 |
| Parallelism | ZeRO Stage 1 (data parallel) |
| Tokens per iteration | 8,388,608 (256 GPUs × 32,768 seq len) |
| Total training iterations | 62 |
| Total training tokens | ~520M (257M inoculation + 257M replay) |
| Training time | ~11 minutes |
| FLOPS/GPU | ~378 TFLOPS |
| Framework | GPT-NeoX + DeepSpeed |

Training Loss

Final training loss: 1.699 (down from an initial 6.18)

Chat Template

Uses a ChatML-style chat template with <think> prefilled in the generation prompt:

  • Default system prompt: "You are a helpful AI assistant."
  • Supports function calling via <functions> tags
  • The generation prompt prefills <|im_start|>assistant followed by <think>
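To illustrate the shape of the rendered prompt, here is a hand-rolled sketch. The authoritative template is the Jinja template bundled with the tokenizer (and should be applied via `tokenizer.apply_chat_template`); the exact whitespace around the `<think>` prefill here is an assumption:

```python
# Hand-rolled sketch of the ChatML-style prompt described above.
# The real template ships with the tokenizer; this only illustrates
# the resulting string shape.
DEFAULT_SYSTEM = "You are a helpful AI assistant."

def build_prompt(messages, add_generation_prompt=True):
    # Prepend the default system prompt if the caller did not supply one.
    if not messages or messages[0]["role"] != "system":
        messages = [{"role": "system", "content": DEFAULT_SYSTEM}] + messages
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    if add_generation_prompt:
        # The generation prompt prefills the assistant turn with <think>.
        parts.append("<|im_start|>assistant\n<think>")
    return "".join(parts)

prompt = build_prompt([{"role": "user", "content": "Is my essay good?"}])
```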

How to Use

Conversion

Converted from the GPT-NeoX checkpoint format to Hugging Face format using convert_upload_olmo.sh (NeoX → HF weight conversion plus ChatML template addition).
