# KoBERT-NER-Diet
A guide to Korean Named Entity Recognition (NER) in the diet domain using KoBERT. The 🤗 `Huggingface Transformers` library makes KoBERT easy to use.
## How to use KoBERT on Huggingface Transformers Library
- The original KoBERT has been adapted so that it can be used directly from the transformers library.
- As of transformers v2.2.2, models built by individual users can be uploaded and downloaded directly through transformers.
- To use the tokenizer, import `KoBERTTokenizer` from `utils.py`.
```python
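from transformers import BertModel
from kobert_tokenizer import KoBERTTokenizer


def load_tokenizer(args):
    """Load the KoBERT tokenizer from the skt/kobert-base-v1 checkpoint."""
    # `args` is not used here; the checkpoint name is hard-coded.
    bert_tokenizer = KoBERTTokenizer.from_pretrained(pretrained_model_name_or_path="skt/kobert-base-v1")
    return bert_tokenizer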
```
## Usage
```bash
$ python3 main.py --model_type kobert --do_train --do_eval
```
- With the `--write_pred` option, the **prediction results from evaluation** are saved in the `preds` folder.
## Prediction
```bash
$ python3 predict.py --input_file {INPUT_FILE_PATH} --output_file {OUTPUT_FILE_PATH} --model_dir {SAVED_CKPT_PATH}
```
## Results
| Model                     | Slot F1 (%) |
|---------------------------|-------------|
| KoBERT | 99.00 |
| DistilKoBERT | 90.00 |
| Bert-Multilingual | 99.00 |
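The table does not state how Slot F1 is computed. A common convention for NER is entity-level F1, where a predicted span counts as correct only if its type and boundaries exactly match a gold span. The sketch below assumes that definition for a single sentence; `slot_f1` and the sample spans are hypothetical and are not taken from this repository.

```python
from typing import Set, Tuple

# (entity type, start token index, exclusive end token index)
Span = Tuple[str, int, int]


def slot_f1(gold: Set[Span], pred: Set[Span]) -> float:
    """Entity-level F1: a predicted span is correct only on an exact match."""
    true_pos = len(gold & pred)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical spans: the UNIT prediction has the wrong boundaries.
gold = {("FOOD", 4, 6), ("QTY", 1, 2), ("UNIT", 2, 3)}
pred = {("FOOD", 4, 6), ("QTY", 1, 2), ("UNIT", 3, 4)}
print(f"{100 * slot_f1(gold, pred):.2f}")  # prints 66.67
```

To score a whole evaluation set this way, each span would also need a sentence identifier so that spans from different sentences cannot match each other.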
## Data Description
- **FOOD-B**: beginning of a food entity
- **FOOD-I**: inside a food entity
- **QTY-B**: beginning of a quantity entity
- **QTY-I**: inside a quantity entity
- **UNIT-B**: beginning of a unit entity
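As an illustration of how these tags combine, the following sketch groups a per-token tag sequence into entity spans. It assumes that tokens outside any entity carry a tag such as `O` (not listed above) and that a `-I` tag only continues a span of the same type; `tags_to_spans` is a hypothetical helper, not part of this repository.

```python
from typing import List, Tuple


def tags_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Group suffix-style tags (FOOD-B, FOOD-I, ...) into (type, start, end) spans.

    `end` is exclusive. Any tag without a -B/-I suffix (for example "O",
    which is an assumption here) is treated as outside every entity.
    """
    spans = []
    current = None  # (entity type, start index) of the span being built
    for i, tag in enumerate(tags):
        etype, sep, pos = tag.rpartition("-")
        if sep and pos == "B":
            # A new entity starts; close the previous one if it is still open.
            if current is not None:
                spans.append((current[0], current[1], i))
            current = (etype, i)
        elif sep and pos == "I" and current is not None and current[0] == etype:
            pass  # the open span simply grows by one token
        else:
            # Outside tag, or a stray -I with no matching open span.
            if current is not None:
                spans.append((current[0], current[1], i))
            current = None
    if current is not None:
        spans.append((current[0], current[1], len(tags)))
    return spans


tags = ["O", "QTY-B", "UNIT-B", "O", "FOOD-B", "FOOD-I", "O"]
print(tags_to_spans(tags))
# [('QTY', 1, 2), ('UNIT', 2, 3), ('FOOD', 4, 6)]
```

A stray `-I` tag with no open span of the same type is dropped here; treating it as the start of a new entity would be an equally reasonable choice.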
### NER Input Example
```
๋‚˜๋Š” ํ•œ์ž”์€ ์•„์ด์Šค ์•„๋ฉ”๋ฆฌ์นด๋…ธ๋ฅผ ๋งˆ์‹œ๊ณ  ๋””์ €ํŠธ๋Š” ๋งˆ์นด๋กฑ 3๊ฐœ๋ฅผ ๋จน์Œ.
```
### NER Output Example
```
๋‚˜๋Š” [ํ•œ:QTY-B] [์ž”:UNIT-B] ์€ [์•„์ด์Šค:FOOD-B] [์•„๋ฉ”๋ฆฌ์นด๋…ธ:FOOD-I] ๋งˆ์‹œ๊ณ  ๋””์ €ํŠธ๋Š” [๋งˆ์นด๋กฑ:FOOD-B] [3:QTY-B] [๊ฐœ:UNIT-B] ๋ฅผ ๋จน์Œ.
```
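For downstream use, an output line in this bracketed format can be turned into structured entities. The sketch below is one possible way to do it, assuming that tokens are separated by whitespace, that every annotated token has the form `[token:TAG]`, and that multi-token entities are rejoined with a single space. `parse_tagged`, `merge_entities`, and the `O` placeholder tag are hypothetical names introduced for illustration.

```python
import re
from typing import List, Tuple

# Matches one annotated token such as "[아이스:FOOD-B]".
ANNOTATED = re.compile(r"\[([^\[\]:]+):([A-Z]+-[BI])\]")


def parse_tagged(line: str) -> List[Tuple[str, str]]:
    """Split a line on whitespace and return (token, tag) pairs.

    Tokens without brackets get the placeholder tag "O" (an assumption).
    """
    pairs = []
    for piece in line.split():
        match = ANNOTATED.fullmatch(piece)
        if match:
            pairs.append((match.group(1), match.group(2)))
        else:
            pairs.append((piece, "O"))
    return pairs


def merge_entities(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Join each -B token with the -I tokens of the same type that follow it."""
    entities = []
    open_type = None  # type of the entity that can still be extended
    for token, tag in pairs:
        etype, sep, pos = tag.rpartition("-")
        if sep and pos == "I" and open_type == etype:
            entities[-1] = (etype, entities[-1][1] + " " + token)
        elif sep and pos == "B":
            entities.append((etype, token))
            open_type = etype
        else:
            open_type = None
    return entities


line = ("나는 [한:QTY-B] [잔:UNIT-B] 은 [아이스:FOOD-B] [아메리카노:FOOD-I] "
        "마시고 디저트는 [마카롱:FOOD-B] [3:QTY-B] [개:UNIT-B] 를 먹음.")
for etype, text in merge_entities(parse_tagged(line)):
    print(etype, text)
```

On the sample line this yields two food entities (`아이스 아메리카노` and `마카롱`), each with a quantity and a unit. Pairing each food with its quantity and unit would need an additional rule, since the two foods place them in different positions.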
## References
- [Naver NLP Challenge](https://github.com/naver/nlp-challenge)
- [Huggingface Transformers](https://github.com/huggingface/transformers)