Update README.md
README.md
CHANGED
````diff
@@ -49,10 +49,10 @@ Trained on ~2.7M valid SMILES built and curated from ChemBL34 (Zdrazil _et al._

 ## 🛠️ Implementation

-- **Algorithm**: Trie-based longest-prefix-match
+- **Algorithm**: Trie-based longest-prefix-match
 - **Caching**: `@lru_cache` for repeated string encoding
 - **HF Compatible**: Implements `__call__`, `encode_plus`, `batch_encode_plus`, `save_pretrained`, `from_pretrained`
-- **Memory Efficient**:
+- **Memory Efficient**: Trie traversal and cache

 ```python
 from FastChemTokenizer import FastChemTokenizer
@@ -175,4 +175,4 @@ Apache 2.0
 pages = {D654-D659},
 doi = {10.1093/nar/gkac1008}
 }
-```
+```
````
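
The Implementation bullets in this change name two techniques: trie-based longest-prefix matching and `@lru_cache` for repeated strings. The sketch below illustrates that general combination only; it is not FastChemTokenizer's actual code. The class name `TrieTokenizer`, the toy vocabulary, the `<unk>` fallback, and the cache size are all assumptions made for illustration.

```python
from functools import lru_cache


class TrieTokenizer:
    """Toy greedy longest-prefix-match tokenizer (hypothetical, for illustration)."""

    def __init__(self, vocab, unk_token="<unk>"):
        # vocab maps token string -> integer id; unk_token must be present in it.
        self.vocab = dict(vocab)
        self.unk_id = self.vocab[unk_token]
        # Build the trie as nested dicts; the key None marks "a token ends here".
        self.root = {}
        for token, token_id in self.vocab.items():
            node = self.root
            for ch in token:
                node = node.setdefault(ch, {})
            node[None] = token_id
        # Per-instance cache so repeated input strings skip the trie walk.
        self.encode = lru_cache(maxsize=4096)(self._encode)

    def _encode(self, text):
        ids = []
        i = 0
        while i < len(text):
            node = self.root
            # Fallback: emit the unknown id and advance by one character.
            best_id, best_end = self.unk_id, i + 1
            j = i
            while j < len(text) and text[j] in node:
                node = node[text[j]]
                j += 1
                if None in node:  # remember the longest complete token so far
                    best_id, best_end = node[None], j
            ids.append(best_id)
            i = best_end
        return tuple(ids)  # immutable, so cached results are safe to share


# Toy vocabulary (assumed); a real vocabulary would be loaded from disk.
vocab = {"<unk>": 0, "C": 1, "Cl": 2, "c1ccccc1": 3, "O": 4, "(": 5, ")": 6, "=": 7}
tok = TrieTokenizer(vocab)
print(tok.encode("CC(=O)Cl"))  # (1, 1, 5, 7, 4, 6, 2)
```

The trailing `Cl` is matched as one token rather than `C` followed by an unknown `l`, which is the point of preferring the longest prefix. A second call with the same string is served from the cache, which `tok.encode.cache_info()` will show as a hit.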