AblationBench
Collection
A collection of datasets for evaluating language models on ablation planning in empirical AI research.
5 items