---
language: en
license: mit
datasets:
- ronig/pdb_sequences
---
# PDB Protein BPE Tokenizer
A byte-pair encoding (BPE) tokenizer for protein sequences, trained on the [PDB Sequences](https://huggingface.co/datasets/ronig/pdb_sequences) dataset with a vocabulary size of 1024.
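
For readers unfamiliar with BPE, the following is a minimal pure-Python sketch of how merges are learned over amino-acid sequences. It is not the code used to train this tokenizer, since the training setup is not described here. The function names `train_bpe`, `merge_pair`, and `encode`, the toy sequences, and the tiny vocabulary size are all assumptions made for illustration.

```python
from collections import Counter


def merge_pair(tokens, pair):
    """Replace every adjacent occurrence of `pair` in `tokens` with one merged token."""
    out = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def train_bpe(sequences, vocab_size):
    """Learn merges until the vocabulary holds `vocab_size` tokens or no pairs remain."""
    # Start from single residues (one character per amino acid).
    corpus = [list(seq) for seq in sequences]
    vocab = sorted({tok for seq in corpus for tok in seq})
    merges = []
    while len(vocab) < vocab_size:
        # Count adjacent token pairs across the whole corpus.
        pairs = Counter()
        for seq in corpus:
            pairs.update(zip(seq, seq[1:]))
        if not pairs:
            break
        # Most frequent pair wins; ties are broken lexicographically for determinism.
        best = min(pairs, key=lambda p: (-pairs[p], p))
        merges.append(best)
        merged = best[0] + best[1]
        if merged not in vocab:
            vocab.append(merged)
        corpus = [merge_pair(seq, best) for seq in corpus]
    return vocab, merges


def encode(sequence, merges):
    """Tokenize a sequence by replaying the learned merges in training order."""
    tokens = list(sequence)
    for pair in merges:
        tokens = merge_pair(tokens, pair)
    return tokens


if __name__ == "__main__":
    # Toy sequences, made up for this example (not taken from the dataset).
    toy_sequences = ["MKVLAAGMKVL", "MKVLGGAA"]
    vocab, merges = train_bpe(toy_sequences, vocab_size=12)
    print(vocab)
    print(encode("MKVLAA", merges))
```

The real tokenizer follows the same idea at a larger scale: starting from individual residues, frequent adjacent pairs are merged until the vocabulary reaches 1024 entries.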