Update README.md
README.md
CHANGED
@@ -19,7 +19,7 @@ tags:
 
 Trained on ~2.7M valid SMILES built and curated from ChemBL34 (Zdrazil _et al._ 2023), COCONUTDB (Sorokina _et al._ 2021), and Supernatural3 (Gallo _et al._ 2023) dataset; from resulting 76K n-grams -> pruned to **1,238 tokens**, including backbone/tail motifs and special tokens.
 
-For code and tutorial check this [github project](https://github.com/gbyuvd/FastChemTokenizer
+For code and tutorial check this [github project](https://github.com/gbyuvd/FastChemTokenizer)
 
 ## ⚡ Performance Highlights
 