Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections Paper • 2603.12180 • Published 1 day ago • 41
Clinical knowledge in LLMs does not translate to human interactions Paper • 2504.18919 • Published Apr 26, 2025 • 26
LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation Paper • 2503.02972 • Published Mar 4, 2025 • 25
Ablation is Not Enough to Emulate DPO: How Neuron Dynamics Drive Toxicity Reduction Paper • 2411.06424 • Published Nov 10, 2024 • 5
Can sparse autoencoders be used to decompose and interpret steering vectors? Paper • 2411.08790 • Published Nov 13, 2024 • 8
Evaluating the role of 'Constitutions' for learning from AI feedback Paper • 2411.10168 • Published Nov 15, 2024 • 5
The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models Paper • 2404.16019 • Published Apr 24, 2024 • 1
LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages Paper • 2406.06196 • Published Jun 10, 2024
Casteist but Not Racist? Quantifying Disparities in Large Language Model Bias between India and the West Paper • 2309.08573 • Published Sep 15, 2023