Update README.md
README.md
@@ -12,6 +12,35 @@ For further details, refer to our paper on Journalism: [News comment sections an
* This model is a fine-tuned model based on ETRI's KorBERT.

### How to use
* The model requires a modified version of the transformers `BertTokenizer` class, provided in the file `KorBertTokenizer.py`.
* Usage example:

~~~python
from KorBertTokenizer import KorBertTokenizer
from transformers import BertForSequenceClassification
import torch

# Load the tokenizer and the fine-tuned model from the local `model` directory.
tokenizer = KorBertTokenizer.from_pretrained('model\\')
model = BertForSequenceClassification.from_pretrained('model\\')

def classify(text):
    # Tokenize a single comment, padded to a fixed length of 70 tokens.
    inputs = tokenizer(text, padding='max_length', max_length=70, return_tensors='pt')

    # Run inference without tracking gradients.
    with torch.no_grad():
        logits = model(**inputs).logits
    # Map the highest-scoring class index to its label name.
    predicted_class_id = logits.argmax().item()
    return model.config.id2label[predicted_class_id]


# Two sample Korean news comments (the model expects Korean input).
input_strings = ['좌파가 나라 경제 안보 말아먹는다',    # "The left is ruining the country's economy and security"
                 '수꼴들은 나라 일본한테 팔아먹었냐']  # "Did the right-wingers sell the country out to Japan?"

for input_string in input_strings:
    print('===\nInput text: {}\nClassification result: {}\n==='.format(input_string, classify(input_string)))
~~~
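
The `classify` function in the usage example returns only the top label. If a confidence score is also wanted, the logits can be passed through a softmax first. The following is a minimal sketch in plain Python, not part of the original README: the helper names `softmax` and `label_with_confidence` are hypothetical, and the logits and label names are made-up placeholders. With the real model, the inputs would presumably be `logits[0].tolist()` and `model.config.id2label`.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_confidence(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Placeholder values for illustration only (assumed, not taken from the model).
example_logits = [-1.0, 1.0]
example_id2label = {0: 'LABEL_0', 1: 'LABEL_1'}

label, prob = label_with_confidence(example_logits, example_id2label)
print('{} (p = {:.3f})'.format(label, prob))
```

Keeping this step separate from the model call makes it straightforward to apply a probability threshold later, for example to skip comments the classifier is unsure about.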

### Model performance

* Accuracy: 0.8322