---
title: README
emoji: ⚡
colorFrom: blue
colorTo: indigo
sdk: static
pinned: false
---

# Oxford Reasoning with Machine Learning Lab

[Website](https://oxrml.com) · [University of Oxford](https://www.ox.ac.uk)

> *Combining theoretical rigour with empirical investigation to understand how AI models reason, solve complex problems, and collaborate with humans.*

---

## Research Areas

### 📐 Benchmarks & Evaluation

We study the science of LLM evaluation using systematic reviews, benchmark analysis, and statistical modelling. We also develop new benchmarks that test the limits of LLM reasoning, especially in **adversarial**, **interactive**, and **low-resource language** settings.

### 🔬 Agentic AI for Science

We build agentic systems that automate and augment key stages of the scientific process: literature discovery, evidence synthesis, hypothesis generation, and decision support. We design these agents to be **reliable**, **transparent**, and grounded in domain expertise.

### 🛡️ AI Safety

From bias and toxicity to misalignment in agentic systems, we investigate the harms that advanced AI may pose to individuals and society, alongside technical mitigation methods and **AI governance** research.

### 🤝 Human–AI Interaction

We run large-scale empirical studies of how people use and respond to AI systems in **real-world decision-making** contexts.

---

## 🏷️ Topics

`llm-evaluation` `benchmarking` `ai-safety` `agentic-ai` `human-ai-interaction` `reasoning` `nlp` `alignment` `bias` `governance` `low-resource-nlp` `scientific-discovery`

---