None defined yet.
Try i3-80m, a SOTA LM architecture built for efficient training.
Our latest model, trained on our SOTA pipeline.
Predict missing words in sentences.