Datasets and models for the EMNLP paper "Scalable Data Ablation Approximations for Language Models through Modular Training and Merging"
- Scalable Data Ablation Approximations for Language Models through Modular Training and Merging
  Paper • 2410.15661
- claran/modular-s2orc
  7.47M • 3.54k • 4
- claran/m2d2-wiki-decon
  5.3M • 1.12k
- claran/seed-pretrain-decon
  3.45M • 100