---
library_name: transformers
tags: []
---

A slightly modified version of `cl100k_base` that supports the Dolma 1.x special tokens 
(`|||PHONE_NUMBER|||`, `|||EMAIL_ADDRESS|||`, `|||IP_ADDRESS|||`) and adds 
extra tokens to fill gaps in the tiktoken version of `cl100k_base`.
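
To illustrate what treating these strings as special tokens means in practice, here is a minimal pure-Python sketch that splits text so each Dolma 1.x placeholder stays intact as a single piece. It is not this tokenizer's implementation; the helper name `split_on_special_tokens` and the constant `DOLMA_SPECIAL_TOKENS` are hypothetical and introduced only for this example.

```python
import re

# The three Dolma 1.x special tokens named in this README.
DOLMA_SPECIAL_TOKENS = [
    "|||PHONE_NUMBER|||",
    "|||EMAIL_ADDRESS|||",
    "|||IP_ADDRESS|||",
]

# One pattern that matches any special token literally.
# The capturing group makes re.split keep the matched tokens in its output.
_SPECIAL_RE = re.compile(
    "(" + "|".join(re.escape(tok) for tok in DOLMA_SPECIAL_TOKENS) + ")"
)


def split_on_special_tokens(text):
    """Hypothetical helper: return (piece, is_special) pairs for `text`.

    Special tokens are kept whole; the text between them is returned
    unchanged so an ordinary tokenizer could process it afterwards.
    """
    pieces = []
    for part in _SPECIAL_RE.split(text):
        if part:  # re.split can yield empty strings at boundaries; skip them
            pieces.append((part, part in DOLMA_SPECIAL_TOKENS))
    return pieces


if __name__ == "__main__":
    sample = "Call |||PHONE_NUMBER||| or mail |||EMAIL_ADDRESS|||"
    for piece, is_special in split_on_special_tokens(sample):
        print(repr(piece), is_special)
```

A tokenizer with these tokens registered as special would map each placeholder to a single ID rather than breaking it into pieces such as `|||` and `PHONE`; the sketch only mirrors the splitting step that makes this possible.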