---
license: mit
tags:
- dna
- biology
- genomics
---
# Tokenizer for masked language modeling of DNA sequences

```json
"vocab": {
  "[PAD]": 0,
  "[MASK]": 1,
  "[UNK]": 2,
  "a": 3,
  "c": 4,
  "g": 5,
  "t": 6
},
```
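
A minimal sketch, in plain Python, of how a vocabulary like the one above could map a DNA string to token ids. It assumes one token per nucleotide and case-insensitive input, neither of which is stated in this card. `encode`, `decode`, and `mask_positions` are hypothetical helpers for illustration, not part of the published tokenizer.

```python
# Vocabulary copied from the "vocab" excerpt above.
VOCAB = {"[PAD]": 0, "[MASK]": 1, "[UNK]": 2, "a": 3, "c": 4, "g": 5, "t": 6}
ID_TO_TOKEN = {token_id: token for token, token_id in VOCAB.items()}


def encode(sequence, max_length=None):
    """Map each base to its id (assumption: one token per base, input lowercased).

    Characters outside the vocabulary, such as "n", become [UNK].
    If max_length is given, the result is truncated or right-padded with [PAD].
    """
    ids = [VOCAB.get(base, VOCAB["[UNK]"]) for base in sequence.lower()]
    if max_length is not None:
        ids = ids[:max_length]
        ids += [VOCAB["[PAD]"]] * (max_length - len(ids))
    return ids


def decode(ids):
    """Turn ids back into a string, skipping [PAD]."""
    return "".join(ID_TO_TOKEN[i] for i in ids if i != VOCAB["[PAD]"])


def mask_positions(ids, positions):
    """Return a copy of ids with the given positions replaced by [MASK]."""
    masked = list(ids)
    for position in positions:
        masked[position] = VOCAB["[MASK]"]
    return masked


if __name__ == "__main__":
    ids = encode("ACGTN", max_length=7)
    print(ids)
    print(decode(mask_positions(ids, [1])))
```

In masked language modeling, a model would be trained to predict the original id at each `[MASK]` position. How positions are chosen for masking is not specified here, so the sketch takes them as an explicit argument.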