init: model card
README.md
---
license: afl-3.0
language:
- ja
metrics:
- seqeval
library_name: transformers
pipeline_tag: token-classification
---
# SMM4H-2024 Task 2 Japanese NER

## Overview

This is a named entity recognition (NER) model created by fine-tuning [daisaku-s/medtxt_ner_roberta](https://huggingface.co/daisaku-s/medtxt_ner_roberta) on the [SMM4H 2024 Task 2a](https://healthlanguageprocessing.org/smm4h-2024/) corpus.

Tag set (IOB2 format):
* DRUG
* DISORDER
* FUNCTION
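
In IOB2, each entity type gets a `B-` (begin) and an `I-` (inside) label, and tokens outside any entity get `O`. The sketch below enumerates the label set this implies for the three types listed above. The exact label strings and their order are assumptions; the authoritative mapping is whatever `model.config.id2label` contains.

```python
# Entity types taken from the tag set listed in this model card.
ENTITY_TYPES = ["DRUG", "DISORDER", "FUNCTION"]

# Assumed label spelling ("B-X" / "I-X" plus "O"); the model's own
# id2label mapping may order or spell these differently.
labels = ["O"] + [
    f"{prefix}-{entity}" for entity in ENTITY_TYPES for prefix in ("B", "I")
]

print(labels)
```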

## Usage

```python
import torch
from transformers import BertForTokenClassification, AutoTokenizer

text = "サンプルテキスト"  # Japanese for "sample text"
model_name = "yseop/SMM4H2024_Task2a_ja"

with torch.inference_mode():
    # Load the fine-tuned model and its tokenizer.
    model = BertForTokenClassification.from_pretrained(model_name).eval()
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    idx2tag = model.config.id2label

    # Tokenize the input and run the model.
    vecs = tokenizer(text,
                     padding=True,
                     truncation=True,
                     return_tensors="pt")
    ner_logits = model(input_ids=vecs["input_ids"],
                       attention_mask=vecs["attention_mask"])

    # Take the highest-scoring tag for each token of the single input text.
    idx = torch.argmax(ner_logits.logits, dim=2).detach().cpu().numpy().tolist()[0]

    # Drop the first and last positions (the special tokens added by the tokenizer).
    token = [tokenizer.convert_ids_to_tokens(v) for v in vecs["input_ids"]][0][1:-1]
    pred_tag = [idx2tag[x] for x in idx][1:-1]
```
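
The snippet above yields one tag per token. To get entity mentions, the IOB2 tags still have to be grouped into spans. The following is a minimal sketch of that step, not part of the original model card: `iob2_to_spans` is a hypothetical helper, the `B-`/`I-` label spelling is assumed, and the example tokens and tags are made up for illustration rather than actual model output. Tokens are joined by plain concatenation, so any subword markers produced by the tokenizer would need to be stripped separately.

```python
def iob2_to_spans(tokens, tags):
    """Group IOB2-tagged tokens into (entity_type, text) spans."""
    spans = []
    current_type, current_tokens = None, []
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new span; close any open one first.
            if current_type is not None:
                spans.append((current_type, "".join(current_tokens)))
            current_type, current_tokens = tag[2:], [tok]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continue the currently open span of the same type.
            current_tokens.append(tok)
        else:
            # "O", or an I- tag that does not continue the open span.
            if current_type is not None:
                spans.append((current_type, "".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_type is not None:
        spans.append((current_type, "".join(current_tokens)))
    return spans


# Illustrative input only (hypothetical tokens and tags).
example_tokens = ["頭痛", "が", "ひどい", "ので", "ロキソ", "ニン", "を", "飲ん", "だ"]
example_tags = ["B-DISORDER", "O", "O", "O", "B-DRUG", "I-DRUG", "O", "O", "O"]
spans = iob2_to_spans(example_tokens, example_tags)
print(spans)
```

With the variables from the usage snippet, the call would be `iob2_to_spans(token, pred_tag)`.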

## Results

| NE | TP | FP | FN | Precision | Recall | F1 |
|---|---:|---:|---:|---:|---:|---:|
| DISORDER | 588 | 409 | 330 | 0.5898 | 0.6405 | 0.6141 |
| DRUG | 307 | 143 | 169 | 0.6822 | 0.6450 | 0.6631 |
| FUNCTION | 69 | 160 | 170 | 0.3013 | 0.2887 | 0.2949 |
| all | 964 | 712 | 669 | 0.5752 | 0.5903 | 0.5827 |
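
The precision, recall, and F1 columns follow from the TP/FP/FN counts. As a check, the sketch below recomputes them with the standard definitions. The counts in the `all` row equal the sums of the per-type counts, so that row is reproduced here by pooling the counts (a micro-average). The helper name `prf` is illustrative only.

```python
# (tp, fp, fn) per entity type, copied from the results table.
counts = {
    "DISORDER": (588, 409, 330),
    "DRUG": (307, 143, 169),
    "FUNCTION": (69, 160, 170),
}


def prf(tp, fp, fn):
    """Return (precision, recall, f1) from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


scores = {name: prf(*c) for name, c in counts.items()}

# Pool the counts over all entity types to reproduce the "all" row.
total = tuple(sum(c[i] for c in counts.values()) for i in range(3))
scores["all"] = prf(*total)

for name, (p, r, f) in scores.items():
    print(f"{name}: precision={p:.4f} recall={r:.4f} f1={f:.4f}")
```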