Update README.md
README.md CHANGED
@@ -11,11 +11,11 @@ MEDTOK is a multimodal tokenizer of medical codes that combines text description
 ## How to use MedTok?
 ```bash
 from transformers import AutoTokenizer
-tokenizer = AutoTokenizer.from_pretrained("mims-harvard/MedTok")
-tokens = tokenizer
-ids = tokenizer.encode("E11.9")
+tokenizer = AutoTokenizer.from_pretrained("mims-harvard/MedTok", trust_remote_code=True)
+tokens = tokenizer("E11.9")
 embed = tokenizer.embed("E11.9")
 ```
+- embed means the quantized embedding for this input medical code.
 
 If you want to use the tokenized embedding for each medical code, please download it from [mims-harvard/MedTok](https://huggingface.co/mims-harvard/MedTok) or [code2embeddings.json.zip](https://doi.org/10.7910/DVN/7XNT3M) directly. And the downloaded embedding file could be put into 'MedTok/embedding.npy' to run EHR or QA tasks based on MedTok.
 
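The README paragraph in this hunk points to a downloadable per-code embedding file but does not show how to read it. Below is a minimal sketch of one way to do that, assuming the unzipped `code2embeddings.json` is a single JSON object that maps each medical code to a list of numbers. That layout is an assumption, not something this diff confirms, and the helper names `load_code_embeddings` and `lookup` are hypothetical.

```python
import json
from pathlib import Path


def load_code_embeddings(path):
    """Read a JSON file mapping medical codes to embedding vectors.

    Assumption: the file is one JSON object of the form
    {"E11.9": [0.1, 0.2, ...], ...}. The real file may be laid out differently.
    """
    with Path(path).open("r", encoding="utf-8") as f:
        raw = json.load(f)
    if not isinstance(raw, dict):
        raise ValueError("expected a JSON object keyed by medical code")
    # Normalise every vector to a plain list of floats.
    return {code: [float(x) for x in vec] for code, vec in raw.items()}


def lookup(embeddings, code):
    """Return the vector stored for `code`, or None if the code is absent."""
    return embeddings.get(code)
```

With the file unzipped locally, `lookup(load_code_embeddings("code2embeddings.json"), "E11.9")` would return the stored vector for that code if the assumed layout holds. Returning `None` for a missing code keeps the caller in charge of how to handle codes outside the vocabulary.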