A collection of tokenisers I have trained (so you don't have to).
Thomas Bauwens
Bauwens
AI & ML interests
NLP
Organizations
None yet
Models (6)
Bauwens/BPE-40k_OSCAR-en-30M
Bauwens/ULM-32k_SlimPajama-3M
Bauwens/BPE-32k_SlimPajama-3M
Bauwens/RoBERTa-nl_BPE-knockout_30k
Fill-Mask • 0.1B • 2
Bauwens/RoBERTa-nl_BPE_39k
Fill-Mask • 0.1B • 1
Bauwens/RoBERTa-nl_BPE_30k_BPE-knockout_9k
Fill-Mask • 0.1B • 4
Datasets (0)
None public yet