conviette/korPolBERT

Text Classification

conviette committed on Apr 15, 2022

Commit e9213fc · 1 Parent(s): afc7f7c

Update README.md

Files changed (1)

README.md +29 -0

@@ -12,6 +12,35 @@ For further details, refer to our paper on Journalism: [News comment sections an
 * This model is a finetuned model based on ETRI\'s KorBERT.
+### How to use
+* The model requires an edited version of the transformers class `BertTokenizer`, which can be found in the file `KorBertTokenizer.py`.
+* Usage example:
+~~~python
+from KorBertTokenizer import KorBertTokenizer
+from transformers import BertForSequenceClassification
+import torch
+# Local directory holding the model files; a forward slash works on both Windows and POSIX.
+tokenizer = KorBertTokenizer.from_pretrained('model/')
+model = BertForSequenceClassification.from_pretrained('model/')
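+def classify(text):
+    # Tokenize a single comment, padding it to a fixed length of 70 tokens.
+    inputs = tokenizer(text, padding='max_length', max_length=70, return_tensors='pt')
+    # Inference only, so gradient tracking is disabled.
+    with torch.no_grad():
+        logits = model(**inputs).logits
+    # Map the highest-scoring class index to its label name.
+    predicted_class_id = logits.argmax(dim=-1).item()
+    return model.config.id2label[predicted_class_id]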
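+# Sample Korean comments, left untranslated because the model expects Korean input. Rough glosses:
+#   "The left is ruining the country's economy and security"
+#   "Did the right-wingers sell the country off to Japan?"
+input_strings = ['좌파가 나라 경제 안보 말아먹는다',
+                 '수꼴들은 나라 일본한테 팔아먹었냐']
+for input_string in input_strings:
+    print('===\nInput text: {}\nClassification result: {}\n==='.format(input_string, classify(input_string)))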
+~~~
 ### Model performance
 * Accuracy: 0.8322
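
The usage example added in this commit returns only the top label. If a confidence score is also wanted, one possible approach is to apply a softmax to the logits before taking the maximum. The sketch below is not part of the commit: `top_label` is a hypothetical helper, and the example logits and label names are made up for illustration. With the real model, the inputs would presumably come from `model(**inputs).logits.squeeze(0).tolist()` and `model.config.id2label`.

```python
import math


def top_label(logits, id2label):
    """Return (label, probability) for one row of classification logits."""
    # Numerically stable softmax: subtract the maximum before exponentiating.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the highest probability (the first one wins on ties).
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Hypothetical values for illustration only; they are not outputs of korPolBERT.
example_logits = [0.2, 2.3, -1.0]
example_id2label = {0: "LABEL_0", 1: "LABEL_1", 2: "LABEL_2"}

label, prob = top_label(example_logits, example_id2label)
print(label, round(prob, 3))
```

The helper is written in plain Python so that it runs without loading the model; for batches, `torch.softmax(logits, dim=-1)` would be the more natural choice.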