Reward Models Inherit Value Biases from Pretraining ICLR2026

Reward models and logprobs for the paper Christian et al., "Reward Models Inherit Value Biases from Pretraining" (ICLR 2026).

Oxford-HIPlab/BT_LoRA_skywork80k_on_gemma-2-2b-it_seed1 (updated Sep 12, 2025)
Oxford-HIPlab/BT_LoRA_skywork80k_on_gemma-2-2b-it_seed1-every_1 (updated Sep 20, 2025)
Oxford-HIPlab/BT_LoRA_skywork80k_on_gemma-2-2b-it_seed1-every_10 (updated Sep 20, 2025)
Oxford-HIPlab/BT_LoRA_skywork80k_on_Llama_3.2_3B_Instruct_seed1 (updated Sep 12, 2025)