How to use mikesong724/deberta-wiki-2006 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("fill-mask", model="mikesong724/deberta-wiki-2006")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("mikesong724/deberta-wiki-2006")
model = AutoModelForMaskedLM.from_pretrained("mikesong724/deberta-wiki-2006")
```
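For example, the fill-mask pipeline replaces a masked token and returns scored candidates. The sentence below is an arbitrary illustration; if in doubt about the model's mask token, read it from `tokenizer.mask_token` (DeBERTa's default is `[MASK]`):

```python
from transformers import pipeline

pipe = pipeline("fill-mask", model="mikesong724/deberta-wiki-2006")

# Top predictions for the masked position, highest score first.
for pred in pipe("The encyclopedia that anyone can [MASK]."):
    print(f"{pred['token_str']!r}: {pred['score']:.3f}")
```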
DeBERTa trained from scratch
Source data: https://dumps.wikimedia.org/archive/2006/
Tools used: https://github.com/mikesong724/Point-in-Time-Language-Model
Training corpus: the 2006 Wikipedia archive, 2.7 GB of text, trained for 24 epochs (2.7 GB × 24 epochs ≈ 65 GB of text seen in total).
GLUE benchmark
| Task  | Fine-tuning epochs | Metric                | Score           |
|-------|--------------------|-----------------------|-----------------|
| CoLA  | 3                  | Matthews corr.        | 0.2848          |
| SST-2 | 3                  | Accuracy              | 0.8876          |
| MRPC  | 5                  | F1 / Accuracy         | 0.8033 / 0.7108 |
| STS-B | 3                  | Pearson / Spearman    | 0.7542 / 0.7536 |
| QQP   | 3                  | Accuracy / F1         | 0.8852 / 0.8461 |
| MNLI  | 3                  | Accuracy (mismatched) | 0.7822          |
| QNLI  | 3                  | Accuracy              | 0.8715          |
| RTE   | 3                  | Accuracy              | 0.5235          |
| WNLI  | 5                  | Accuracy              | 0.3099          |
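The card does not include the fine-tuning script used to produce these numbers. Below is a minimal sketch of how such a score could be reproduced with the standard Hugging Face `Trainer`, using CoLA as the example; every hyperparameter other than the epoch count is an assumption, not the author's actual setting:

```python
# Hypothetical CoLA fine-tuning sketch (3 epochs, matching the table above).
# Batch size and learning rate are assumed defaults, not the author's values.
import numpy as np
from datasets import load_dataset
from evaluate import load as load_metric
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "mikesong724/deberta-wiki-2006"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# CoLA: single-sentence acceptability classification.
dataset = load_dataset("glue", "cola")
dataset = dataset.map(lambda ex: tokenizer(ex["sentence"], truncation=True), batched=True)

metric = load_metric("glue", "cola")  # reports Matthews correlation

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return metric.compute(predictions=np.argmax(logits, axis=-1), references=labels)

args = TrainingArguments(
    output_dir="cola-finetune",
    num_train_epochs=3,          # "3 epochs" as reported for CoLA
    per_device_train_batch_size=32,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,         # enables dynamic padding via the default collator
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())        # includes eval_matthews_correlation
```

The other GLUE tasks follow the same pattern, with the task name, input columns (e.g. sentence pairs for MRPC or MNLI), and label count swapped accordingly.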