---
language:
- en
tags:
- BERT
- medical
pipeline_tag: token-classification
widget:
- text: 63 year old woman with history of CAD presented to ER
  example_title: Example-1
- text: 63 year old woman diagnosed with CAD
  example_title: Example-2
---
# Model Card for MDDDDR/bert_base_uncased_NER

A BERT-based token-classification (NER) model for English medical text.

base_model : [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased)

## Model configuration
- hidden_size : 768
- max_position_embeddings : 512
- num_attention_heads : 12
- num_hidden_layers : 12
- vocab_size : 30522
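
These values can also be read directly from the configuration published with the checkpoint. A minimal sketch using the standard `AutoConfig` API and the model id from the usage example below (the printed values assume the configuration listed above):

```python
from transformers import AutoConfig

# read the configuration shipped with the checkpoint
config = AutoConfig.from_pretrained('MDDDDR/bert_base_uncased_NER')

print(config.hidden_size)              # 768
print(config.max_position_embeddings)  # 512
print(config.num_attention_heads)      # 12
print(config.num_hidden_layers)        # 12
print(config.vocab_size)               # 30522
```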

# Basic usage

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch
import numpy as np

# label id -> tag name
id2tag = {0: 'O', 1: 'B_MT', 2: 'I_MT'}

# load model & tokenizer
MODEL_NAME = 'MDDDDR/bert_base_uncased_NER'

model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model.eval()

# prepare input
text = 'mental disorder can also contribute to the development of diabetes through various mechanism including increased stress, poor self care behavior, and adverse effect on glucose metabolism.'
tokenized = tokenizer(text, return_tensors='pt')

# forward pass (no gradients needed for inference)
with torch.no_grad():
    output = model(**tokenized)

# argmax over the label dimension; drop the [CLS] and [SEP] positions
preds = np.argmax(output.logits.cpu().numpy(), axis=2)[0][1:-1]

# print the predicted tag next to each sub-word token
for token, pred in zip(tokenizer.tokenize(text), preds):
    print("{}\t{}".format(id2tag[pred], token))
    # B_MT mental
    # B_MT disorder
```
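
The loop above prints one tag per sub-word. If word-level tags are more convenient, the sketch below (not part of the original card) continues from the snippet above and regroups sub-words into words, keeping the tag of each word's first sub-word. It assumes the default *fast* BERT tokenizer, so that `word_ids()` and `word_to_chars()` are available:

```python
# Optional sketch: word-level tags, assuming a fast tokenizer
import torch

enc = tokenizer(text, return_tensors='pt')
with torch.no_grad():
    logits = model(**enc).logits
pred_ids = logits.argmax(dim=-1)[0].tolist()

seen = set()
for idx, word_id in enumerate(enc.word_ids(batch_index=0)):
    if word_id is None or word_id in seen:
        continue                       # skip [CLS]/[SEP] and later sub-words
    seen.add(word_id)
    span = enc.word_to_chars(word_id)  # character span of the whole word
    print(id2tag[pred_ids[idx]], text[span.start:span.end])
```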

## Framework versions
- transformers : 4.39.1
- torch : 2.1.0+cu121
- datasets : 2.18.0
- tokenizers : 0.15.2
- numpy : 1.20.0