---
license: mit
tags:
- dna
- biology
- genomics
---
Tokenizer for masked language modeling of DNA sequences
"vocab": {
"[PAD]": 0,
"[MASK]": 1,
"[UNK]": 2,
"a": 3,
"c": 4,
"g": 5,
"t": 6
},
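
The vocabulary above can be illustrated with a minimal, dependency-free sketch of how a DNA string might be turned into token IDs. This is an assumption-laden illustration, not the tokenizer's actual implementation: it assumes one token per nucleotide, assumes input is lowercased before lookup (the vocabulary only lists lowercase bases), and the helper names `encode` and `mask_positions` are hypothetical.

```python
# Vocabulary copied from the listing above.
VOCAB = {"[PAD]": 0, "[MASK]": 1, "[UNK]": 2, "a": 3, "c": 4, "g": 5, "t": 6}


def encode(sequence, max_length=None):
    """Map each character to its ID (assumption: one token per base).

    Characters outside the vocabulary (for example 'n') map to [UNK].
    If max_length is given, the result is truncated or padded with [PAD].
    """
    # Assumption: lowercase before lookup, since the vocab is lowercase.
    ids = [VOCAB.get(base, VOCAB["[UNK]"]) for base in sequence.lower()]
    if max_length is not None:
        ids = ids[:max_length]
        ids += [VOCAB["[PAD]"]] * (max_length - len(ids))
    return ids


def mask_positions(ids, positions):
    """Return a copy of ids with the given positions replaced by [MASK]."""
    masked = list(ids)
    for pos in positions:
        masked[pos] = VOCAB["[MASK]"]
    return masked


if __name__ == "__main__":
    ids = encode("ACGTN", max_length=8)
    print(ids)                       # [3, 4, 5, 6, 2, 0, 0, 0]
    print(mask_positions(ids, [1]))  # [3, 1, 5, 6, 2, 0, 0, 0]
```

In masked language modeling, the model would be trained to predict the original ID at each `[MASK]` position; how positions are chosen for masking is not specified here and is left to the training setup.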