Brian Christian

brianchristian

https://brianchristian.org

AI & ML interests

None yet

Recent Activity

published 2 datasets 26 days ago

self-model/sycophancy-two-sides-eval

Viewer • Updated Jan 22 • 60 • 28

self-model/discrim-eval-templated

Viewer • Updated Jan 26 • 520 • 108

updated a collection 3 months ago

Reward Models Inherit Value Biases from Pretraining ICLR2026

Reward models and logprobs for the paper Christian et al., "Reward Models Inherit Value Biases from Pretraining" (ICLR 2026) • 24 items • Updated Feb 23

updated a dataset 3 months ago

Oxford-HIPlab/iclr2026-lm-logprobs

Viewer • Updated Feb 23 • 2.05M • 49

published a dataset 3 months ago

Oxford-HIPlab/iclr2026-lm-logprobs

Viewer • Updated Feb 23 • 2.05M • 49

updated a collection 4 months ago

Reward Models Inherit Value Biases from Pretraining ICLR2026