DistilBERT with 256k token embeddings

This model was initialized with a word2vec token embedding matrix with 256k entries. The word2vec model was trained for 3 epochs on 100GB of data from C4, MSMARCO, News, Wikipedia, and S2ORC. The token embeddings were not frozen: they were updated during MLM training.
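The initialization described above can be sketched roughly as follows. This is a minimal illustration, not the original training script: the toy sizes, the random stand-in for the word2vec matrix, and the variable names are all assumptions, and it presumes the word2vec vector dimension equals the model's hidden size.

```python
import torch
from transformers import DistilBertConfig, DistilBertForMaskedLM

# Toy sizes so the sketch runs quickly (assumption); the real vocabulary has 256k entries.
VOCAB_SIZE = 1000
EMBED_DIM = 64

# Hypothetical stand-in for the trained word2vec matrix: random values, one row per token.
torch.manual_seed(0)
word2vec_matrix = torch.randn(VOCAB_SIZE, EMBED_DIM)

# Build a small, randomly initialized DistilBERT whose hidden size matches the vectors.
config = DistilBertConfig(
    vocab_size=VOCAB_SIZE,
    dim=EMBED_DIM,
    n_layers=2,
    n_heads=4,
    hidden_dim=128,
)
model = DistilBertForMaskedLM(config)

# Overwrite the token embedding table with the pre-trained vectors.
embeddings = model.get_input_embeddings()
with torch.no_grad():
    embeddings.weight.copy_(word2vec_matrix)

# Leave the embeddings trainable so MLM training can update them.
embeddings.weight.requires_grad = True
```

Freezing the embeddings instead would mean setting `requires_grad` to `False`; the card states that this model kept them trainable.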

The model was then trained on the same data with masked language modeling (MLM) for 1M steps at a batch size of 64.
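As a rough sense of scale, the step count and batch size imply the number of training sequences processed. The sketch below assumes that 64 is the total batch size per optimizer step, which the card does not state explicitly.

```python
# Figures taken from the model card.
steps = 1_000_000
batch_size = 64  # assumed to be the total batch size per step

# Total number of sequences seen during MLM training.
sequences_seen = steps * batch_size
print(f"{sequences_seen:,}")  # 64,000,000
```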
