---
language:
- ko
base_model:
- beomi/KcELECTRA-base
pipeline_tag: text-classification
tags:
- emotion
- sentiment
---
## A model for analyzing the emotion expressed in text.
## Built on KcELECTRA and trained on about 220,000 emotion sentences.
### Each input is classified into one of six emotion categories: joy (기쁨), embarrassment (당황), anger (분노), anxiety (불안), hurt (상처), and sadness (슬픔).
```
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

model_name = "noridorimari/emotion_classifier"

# Emotion label mapping
id2label = {
    0: "기쁨",  # joy
    1: "당황",  # embarrassment
    2: "분노",  # anger
    3: "불안",  # anxiety
    4: "상처",  # hurt
    5: "슬픔",  # sadness
}
label2id = {v: k for k, v in id2label.items()}

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Set the label mapping directly on the model config
model.config.id2label = id2label
model.config.label2id = label2id

classifier = pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
    top_k=None,  # return scores for all labels (replaces the deprecated return_all_scores=True)
    device=0 if torch.cuda.is_available() else -1,
)

texts = [
    "오늘 회사에서 실수해서 너무 불안해.",  # "I made a mistake at work today and I'm so anxious."
    "친구가 나한테 거짓말해서 정말 화가 났어.",  # "My friend lied to me and I got really angry."
    "좋은 소식이 있어서 하루 종일 기분이 좋아!",  # "I got good news, so I've felt great all day!"
]

for text in texts:
    # Passing a list returns one list of label scores per input
    preds = classifier([text])[0]
    # Sort by score, highest first
    preds = sorted(preds, key=lambda x: x["score"], reverse=True)
    top = preds[0]
    print(f"\nSentence: {text}")
    print(f"Predicted emotion: {top['label']} ({top['score']*100:.2f}%)")
    print("Score distribution:")
    for p in preds:
        print(f"  {p['label']:>4} : {p['score']*100:.2f}%")
```
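The percentages printed by the example above are softmax probabilities over the six class logits, which is what the `text-classification` pipeline applies by default for a single-label model with more than one class. The following is a minimal sketch of that last step in plain Python, assuming the model emits one logit per label in the `id2label` order. The logits are made up for illustration, and `rank_emotions` is a hypothetical helper, not part of this model or of `transformers`.

```python
import math

# Same index-to-label mapping as in the usage example.
ID2LABEL = {
    0: "기쁨",  # joy
    1: "당황",  # embarrassment
    2: "분노",  # anger
    3: "불안",  # anxiety
    4: "상처",  # hurt
    5: "슬픔",  # sadness
}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_emotions(logits):
    """Return (label, probability) pairs sorted from most to least likely."""
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} logits, got {len(logits)}")
    probs = softmax(logits)
    pairs = [(ID2LABEL[i], p) for i, p in enumerate(probs)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Hypothetical logits, invented for illustration (not real model output).
example_logits = [0.2, -1.0, 0.5, 3.1, 0.0, 1.2]
ranked = rank_emotions(example_logits)
top_label, top_prob = ranked[0]
print(f"Predicted emotion: {top_label} ({top_prob * 100:.2f}%)")
```

Working from the full ranked list rather than only the top label makes it easy to add a confidence cutoff later, for example treating a prediction as uncertain when the top probability is low.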
```
@misc{lee2021kcelectra,
  author = {Junbum Lee},
  title = {KcELECTRA: Korean comments ELECTRA},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/Beomi/KcELECTRA}}
}
```