Should We Still Pretrain Encoders with Masked Language Modeling?
Paper • 2507.00994
Research material on pre-training encoders, with an extensive comparison of the masked language modeling paradigm vs. causal language modeling.