AblationBench Collection
This is a collection of datasets for evaluating language models on the task of ablation planning in empirical AI research.
5 items • Updated Feb 10