arxiv:2504.18919
Adam Mahdi
ammaox
AI & ML interests
LLMs, multimodal AI
Recent Activity
updated a Space 1 day ago: OxRML/README
updated a dataset 2 days ago: OxRML/MADQA
upvoted a paper 4 months ago: Measuring what Matters: Construct Validity in Large Language Model Benchmarks