---
datasets:
- pubmed
language:
- en
tags:
- BERT
---
# Model Card for MDDDDR/dmis_lab_biobert_v1.1_NER
- base_model : [dmis-lab/biobert-v1.1](https://huggingface.co/dmis-lab/biobert-v1.1)
- hidden_size : 768
- max_position_embeddings : 512
- num_attention_heads : 12
- num_hidden_layers : 12
- vocab_size : 28996
# Basic usage
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
import numpy as np
# map label ids to tags
id2tag = {0:'O', 1:'B_MT', 2:'I_MT'}
# load model & tokenizer
MODEL_NAME = 'MDDDDR/dmis_lab_biobert_v1.1_NER'
model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# prepare input
text = 'mental disorder can also contribute to the development of diabetes through various mechanism including increased stress, poor self care behavior, and adverse effect on glucose metabolism.'
tokenized = tokenizer(text, return_tensors='pt')
# forward pass
output = model(**tokenized)
# result
preds = np.argmax(output[0].cpu().detach().numpy(), axis=2)[0][1:-1]
# check preds
for txt, pred in zip(tokenizer.tokenize(text), preds):
    print("{}\t{}".format(id2tag[pred], txt))
# B_MT mental
# B_MT disorder
# O can
# O also
# O contribute
# O to
# O the
# B_MT development
# O of
# B_MT diabetes
# O through
# O various
# B_MT mechanism
# O including
# O increased
# B_MT stress
# O ,
# O poor
# B_MT self
# B_MT care
# B_MT behavior
# O ,
# O and
# B_MT adverse
# I_MT effect
# O on
# B_MT glucose
# B_MT metabolism
# O .
```
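The per-token output above can be collapsed into entity spans. The following is a minimal sketch, not part of the original model card: `group_entities` is a hypothetical helper, and it assumes the `B_`/`I_` tag scheme from `id2tag` and the standard BERT WordPiece `##` continuation prefix. The sample tokens (including the split of `metabolism` into `meta` + `##bolism`) are illustrative only and are not taken from an actual tokenizer run.

```python
from typing import List, Optional, Tuple


def group_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge WordPiece tokens and B_/I_ tags into (entity_text, label) spans.

    Hypothetical helper: assumes tags look like 'O', 'B_MT', 'I_MT'.
    An I_ tag with no open span of the same label is ignored.
    """
    spans: List[Tuple[str, str]] = []
    current_words: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Close the open span (if any) and reset the state.
        nonlocal current_words, current_label
        if current_words:
            spans.append((" ".join(current_words), current_label))
        current_words, current_label = [], None

    for token, tag in zip(tokens, tags):
        if token.startswith("##"):
            # WordPiece continuation: glue onto the previous word of an open span.
            if current_words:
                current_words[-1] += token[2:]
            continue
        if tag.startswith("B_"):
            flush()
            current_words, current_label = [token], tag[2:]
        elif tag.startswith("I_") and current_label == tag[2:]:
            current_words.append(token)
        else:
            flush()
    flush()
    return spans


# Illustrative input (assumed, not produced by the model).
sample_tokens = ["adverse", "effect", "on", "glucose", "meta", "##bolism"]
sample_tags = ["B_MT", "I_MT", "O", "B_MT", "B_MT", "B_MT"]
print(group_entities(sample_tokens, sample_tags))
# [('adverse effect', 'MT'), ('glucose', 'MT'), ('metabolism', 'MT')]
```

With the snippet above, the helper could be called as `group_entities(tokenizer.tokenize(text), [id2tag[p] for p in preds])`.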
## Framework versions
- transformers : 4.39.1
- torch : 2.1.0+cu121
- datasets : 2.18.0
- tokenizers : 0.15.2
- numpy : 1.20.0