---
title: README
emoji: ๐
colorFrom: red
colorTo: purple
sdk: static
pinned: false
---
|
|
|
|
|
Evaluations and further details on the training of every Gerbil model, and on the UL2-inspired mixture-of-tasks "Blender" pretraining method, can be found here: https://github.com/aicrumb/notebook-hosting/blob/main/GerbilLabEvaluations.md
|
|
|
|
|
Special tokens for "Blender" models' pretraining include:
|
|
|
|
|
```
'<fitm_start>', '<multiple_tok_mask>', '<fitm_result>', '<causal>', '<mlm_start>', '<single_tok_mask>', '<mlm_end>'

# Example fill in the middle
'<fitm_start> this is an <multiple_tok_mask> for fill-in-the-middle <fitm_result> example text <|endoftext|>'

# Example causal language modelling
'<causal> this is an example text for causal language modelling <|endoftext|>'

# Example masked language modelling
'<mlm_start> this is an <single_tok_mask> text for masked language modelling <mlm_end> example <|endoftext|>'
```
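The three formats above can be sketched as simple string formatters. This is a minimal illustration only: the function names are hypothetical, and it masks at the word level rather than at the tokenizer level used in actual pretraining.

```python
import random

# Special tokens from the Blender pretraining mixture (as listed above).
FITM_START = "<fitm_start>"
MULTI_MASK = "<multiple_tok_mask>"
FITM_RESULT = "<fitm_result>"
CAUSAL = "<causal>"
MLM_START = "<mlm_start>"
SINGLE_MASK = "<single_tok_mask>"
MLM_END = "<mlm_end>"
EOT = "<|endoftext|>"

def format_causal(text):
    # Plain left-to-right language modelling: prefix token, text, end-of-text.
    return f"{CAUSAL} {text} {EOT}"

def format_fitm(text, rng=random):
    # Mask a contiguous span of words; the model predicts the span
    # after the <fitm_result> token.
    words = text.split()
    i = rng.randrange(len(words))
    j = rng.randrange(i + 1, len(words) + 1)
    span = " ".join(words[i:j])
    masked = " ".join(words[:i] + [MULTI_MASK] + words[j:])
    return f"{FITM_START} {masked} {FITM_RESULT} {span} {EOT}"

def format_mlm(text, rng=random):
    # Mask a single word; the model predicts it after <mlm_end>.
    words = text.split()
    i = rng.randrange(len(words))
    target = words[i]
    words[i] = SINGLE_MASK
    return f"{MLM_START} {' '.join(words)} {MLM_END} {target} {EOT}"
```

Mixing these formatters over a corpus (e.g. sampling one objective per document) yields the kind of mixture-of-tasks stream described above.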