---
title: README
emoji: 
colorFrom: blue
colorTo: indigo
sdk: static
pinned: false
---

# Oxford Reasoning with Machines Lab

[![Website](https://img.shields.io/badge/⚡️_Website-oxrml.com-002147?style=flat-square)](https://oxrml.com)
[![Oxford](https://img.shields.io/badge/🎓_University-Oxford-002147?style=flat-square)](https://www.ox.ac.uk)
[![Focus](https://img.shields.io/badge/🔬_Focus-AI_Evaluation_AI_for_Science_Human--AI-c8962e?style=flat-square)](https://oxrml.com)

> *Combining theoretical rigour with empirical investigation to understand how AI models reason, solve complex problems, and collaborate with humans.*

---

## Research Areas

### 📐 Benchmarks & Evaluation
We study the science of LLM evaluation using systematic reviews, benchmark analysis, and statistical modelling. We develop new benchmarks to test LLM reasoning limits, especially in **adversarial**, **interactive**, and **low-resource language** settings.

### 🔬 Agentic AI for Science
We build agentic systems that automate and augment key stages of the scientific process: literature discovery, evidence synthesis, hypothesis generation, and decision support. We design our agents to be **reliable**, **transparent**, and grounded in domain expertise.

### 🛡️ AI Safety
From bias and toxicity to misalignment in agentic systems: we investigate the harms advanced AI may pose to individuals and society, alongside technical mitigation methods and **AI governance** research.

### 🤝 Human–AI Interaction
Large-scale empirical studies of how people use and respond to AI systems in **real-world decision-making** contexts.

---

## 🏷️ Topics

`llm-evaluation` `benchmarking` `ai-safety` `agentic-ai` `human-ai-interaction` `reasoning` `nlp` `alignment` `bias` `governance` `low-resource-nlp` `scientific-discovery`

---