AblationBench Collection. A collection of datasets used to evaluate language models on the task of ablation planning in empirical AI research. • 5 items • Updated 4 days ago