gaaahee/stance-classifier-v2

@@ -2,49 +2,125 @@
 language: ko
 license: mit
 tags:
-- kobert
-- stance-detection
-- korean
-- text-classification
 metrics:
-- accuracy
-- f1
 ---
-# KoBERT Stance Classifier v2
-Korean political news stance classification model
 ## Performance
 | Metric | Score |
 |--------|-------|
-| Test Accuracy | 73.37% |
-| Test F1 (macro) | 0.7336 |
 ## Labels
-- 0: 옹호 (Support)
-- 1: 중립 (Neutral)
-- 2: 비판 (Oppose)
 ## Usage
 ```python
 import torch
-from transformers import AutoTokenizer
 tokenizer = AutoTokenizer.from_pretrained('monologg/kobert', trust_remote_code=True)
-# Load model
-checkpoint = torch.load('pytorch_model.pt', map_location='cpu')
-# model.load_state_dict(checkpoint['model_state_dict'])
 ```
-## Training Config
-- Base Model: monologg/kobert
-- Max Length: 256
-- Batch Size: 64
-- Learning Rate: 2e-05
-- Focal Loss: True

 language: ko
 license: mit
 tags:
+  - pytorch
+  - bert
+  - kobert
+  - text-classification
+  - stance-detection
+  - korean
+  - news
+  - political
+datasets:
+  - custom
 metrics:
+  - accuracy
+  - f1
+model-index:
+  - name: stance-classifier-v2
+    results:
+      - task:
+          type: text-classification
+          name: Stance Classification
+        metrics:
+          - type: accuracy
+            value: 73.93
+            name: Test Accuracy
+          - type: f1
+            value: 0.7395
+            name: Test F1
 ---
+# Korean Political News Stance Classifier v2
+A KoBERT-based stance classification model for Korean political news.
+## Model Description
+- **Base Model**: monologg/kobert
+- **Task**: 3-class stance classification (옹호/중립/비판, i.e. support/neutral/oppose)
+- **Language**: Korean
+- **Training Data**: ~12,000 labeled political news articles
 ## Performance
 | Metric | Score |
 |--------|-------|
+| Test Accuracy | 73.93% |
+| Test F1 (macro) | 0.7395 |
 ## Labels
+| Label ID | Korean | English | Description |
+|----------|--------|---------|-------------|
+| 0 | 옹호 | support | Favorable to the government/ruling party |
+| 1 | 중립 | neutral | Objective reporting of facts |
+| 2 | 비판 | oppose | Critical of the government/ruling party |
 ## Usage
 ```python
 import torch
+from transformers import BertModel, AutoTokenizer
+from huggingface_hub import hf_hub_download
+import torch.nn as nn
+# Model definition: KoBERT encoder + dropout + linear classification head
+class StanceClassifier(nn.Module):
+    def __init__(self, bert_model, num_classes=3, dropout_rate=0.3):
+        super().__init__()
+        self.bert = bert_model
+        self.dropout = nn.Dropout(dropout_rate)
+        self.classifier = nn.Linear(768, num_classes)
+    def forward(self, input_ids, attention_mask, token_type_ids=None):
+        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids)
+        pooled_output = outputs.pooler_output
+        pooled_output = self.dropout(pooled_output)
+        return self.classifier(pooled_output)
+# Load the model weights from the Hub
+model_path = hf_hub_download(repo_id="gaaahee/stance-classifier-v2", filename="pytorch_model.pt")
+checkpoint = torch.load(model_path, map_location='cpu')
+bert_model = BertModel.from_pretrained('monologg/kobert')
+model = StanceClassifier(bert_model)
+model.load_state_dict(checkpoint['model_state_dict'])
+model.eval()
+# Load the tokenizer
 tokenizer = AutoTokenizer.from_pretrained('monologg/kobert', trust_remote_code=True)
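+# Predict (sample sentence: "The government's new policy is expected to contribute greatly to economic growth")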
+text = "정부의 새 정책이 경제 성장에 크게 기여할 것으로 기대된다"
+encoding = tokenizer(text, truncation=True, max_length=512, padding='max_length', return_tensors='pt')
+with torch.no_grad():
+    logits = model(encoding['input_ids'], encoding['attention_mask'])
+    probs = torch.softmax(logits, dim=1)
+    pred = torch.argmax(probs, dim=1).item()
+labels = ['옹호', '중립', '비판']
+print(f"Prediction: {labels[pred]} ({probs[0][pred].item()*100:.1f}%)")
 ```
+## Training Details
+| Parameter | Value |
+|-----------|-------|
+| Base Model | monologg/kobert |
+| Max Length | 512 |
+| Batch Size | 64 |
+| Learning Rate | 2e-05 |
+| Dropout | 0.3 |
+| Loss Function | Focal Loss (gamma=2.0) |
+| Early Stopping | patience=3 |
+## Citation
+```bibtex
+@misc{korean-stance-classifier-v2,
+  title={Korean Political News Stance Classifier v2},
+  year={2024},
+  publisher={HuggingFace}
+}
+```
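
The Training Details table lists `Focal Loss (gamma=2.0)`, but the loss code is not part of this diff. The sketch below shows the standard per-example focal loss, FL = -(1 - p_t)^gamma * log(p_t), in plain Python. It assumes no class-weighting term (alpha); whether training used one is not stated. The function names `softmax` and `focal_loss` are illustrative and do not come from the repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one example (assumed form, no alpha weighting).

    p_t is the predicted probability of the true class. The factor
    (1 - p_t) ** gamma shrinks the loss of well-classified examples,
    so hard examples dominate the gradient.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Three-class example matching the label IDs 0/1/2 in the model card.
easy = focal_loss([4.0, 0.0, 0.0], target=0)  # confident and correct
hard = focal_loss([0.0, 0.0, 4.0], target=0)  # confident and wrong
print(f"easy={easy:.6f} hard={hard:.6f}")
```

With `gamma=0.0` the expression reduces to ordinary cross-entropy, which is a convenient sanity check when comparing against a framework implementation.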

config.json CHANGED

@@ -1,22 +1,31 @@
 {
-  "model_type": "bert",
   "base_model": "monologg/kobert",
   "num_labels": 3,
-  "labels": {
-    "0": "옹호",
-    "1": "중립",
-    "2": "비판"
   },
-  "max_length": 256,
-  "dropout": 0.3,
-  "metrics": {
-    "test_accuracy": 0.7336561743341404,
-    "test_f1_macro": 0.7335882497408175,
-    "best_val_f1": 0.7595240086537992
   },
   "training_config": {
     "model_name": "monologg/kobert",
-    "max_length": 256,
     "dropout": 0.3,
     "batch_size": 64,
     "epochs": 10,

 {
+  "model_type": "kobert-stance-classifier",
   "base_model": "monologg/kobert",
+  "tokenizer": "monologg/kobert",
   "num_labels": 3,
+  "label2id": {
+    "support": 0,
+    "neutral": 1,
+    "oppose": 2
   },
+  "id2label": {
+    "0": "support",
+    "1": "neutral",
+    "2": "oppose"
   },
+  "label_names_kr": [
+    "옹호",
+    "중립",
+    "비판"
+  ],
+  "max_length": 512,
+  "dropout": 0.3,
+  "hidden_size": 768,
+  "test_accuracy": 0.7393058918482648,
+  "test_f1": 0.7394790179274192,
   "training_config": {
     "model_name": "monologg/kobert",
+    "max_length": 512,
     "dropout": 0.3,
     "batch_size": 64,
     "epochs": 10,
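
The new config replaces the single `labels` map with `label2id`, `id2label`, and `label_names_kr`. The sketch below shows how a predicted class index could be turned into a readable label from those fields. The config is inlined here (copying only the label fields shown in the hunk above) so the example runs without network access; `describe_prediction` is a hypothetical helper, not part of the repository.

```python
import json

# Label-related fields copied from the "+" side of the config.json hunk.
CONFIG_TEXT = """
{
  "num_labels": 3,
  "label2id": {"support": 0, "neutral": 1, "oppose": 2},
  "id2label": {"0": "support", "1": "neutral", "2": "oppose"},
  "label_names_kr": ["옹호", "중립", "비판"]
}
"""

config = json.loads(CONFIG_TEXT)


def describe_prediction(config, pred_id):
    """Return 'english / korean' for a class index.

    JSON object keys are always strings, so id2label must be indexed
    with str(pred_id), while label_names_kr is a list indexed by int.
    """
    english = config["id2label"][str(pred_id)]
    korean = config["label_names_kr"][pred_id]
    return f"{english} / {korean}"


for i in range(config["num_labels"]):
    print(i, describe_prediction(config, i))
```

Keeping `label2id` and `id2label` as exact inverses matters because downstream code may use either direction; a one-line check such as `all(config["label2id"][v] == int(k) for k, v in config["id2label"].items())` catches drift between them.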

pytorch_model.pt CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1deb5151cb360cc5c01534c987bbfdd69a74336d619aae17668b999cb35525e7
 size 368845427

 version https://git-lfs.github.com/spec/v1
+oid sha256:41c57deab19126e146c4e2a51bde628e65e31cf2952159bb8bb375bdead00fa2
 size 368845427
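
Only the `oid` changes in the pointer above; the size is identical, so file size alone cannot tell the old and new checkpoints apart. The sketch below parses a Git LFS pointer and checks a local file against its SHA-256 digest and size. The helper names `parse_lfs_pointer` and `file_matches_pointer` are illustrative, and the local path is whatever a download step returned.

```python
import hashlib

# Pointer text from the "+" side of the pytorch_model.pt hunk.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:41c57deab19126e146c4e2a51bde628e65e31cf2952159bb8bb375bdead00fa2
size 368845427
"""


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def file_matches_pointer(path, pointer, chunk_size=1 << 20):
    """Hash the file in chunks and compare digest and size to the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `file_matches_pointer(model_path, pointer)` on a downloaded checkpoint would confirm that it is the new revision rather than a stale cached copy.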