Upload Phase 1 fine-tuned KcELECTRA Steam Aspect Classifier

Files changed (8)

README.md ADDED

+# 🎮 KcELECTRA Steam Review Aspect Classifier (Phase 1)
+A **beomi/KcELECTRA-base** model fine-tuned for **Aspect-Based Sentiment Analysis (ABSA)** on Steam game reviews.
+This Phase 1 checkpoint predicts which of six aspects a review mentions.
+## 📘 Model Info
+- Base model: `beomi/KcELECTRA-base`
+- Task: Multi-label classification (6 aspects)
+- Labels:
+  - STORY
+  - OPTIMIZATION
+  - GRAPHICS
+  - PRICE_VALUE
+  - BALANCE
+  - ENGAGEMENT
+## ⚙️ Training
+- Dataset: Custom labeled Steam reviews (2,349 samples)
+- Loss: BCEWithLogitsLoss
+- Epochs: 5
+- LR: 2e-5
+- Batch size: 16
+## 🧠 Usage Example
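+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+repo_id = "Wing4/kcelectra-steam-aspect-classifier"
+tokenizer = AutoTokenizer.from_pretrained(repo_id)
+model = AutoModelForSequenceClassification.from_pretrained(repo_id)
+model.eval()  # disable dropout for inference
+# Sample review: "The graphics are good, but the optimization is not great"
+inputs = tokenizer("그래픽은 좋지만 최적화가 별로야", return_tensors="pt")
+with torch.no_grad():  # no gradients needed at inference time
+    outputs = model(**inputs)
+print(outputs.logits.sigmoid())  # per-aspect probabilities
+```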

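The README lists `BCEWithLogitsLoss` as the training loss for a 6-label multi-label task. As a rough illustration of what that objective computes, the sketch below encodes a review's aspects as a multi-hot vector and evaluates the mean binary cross-entropy over independent per-label sigmoids in plain Python. The helper names (`multi_hot`, `bce_with_logits`) and the logit values are made up for illustration; this is not the author's training code.

```python
import math

# Label order taken from the README / config.json in this commit.
LABELS = ["STORY", "OPTIMIZATION", "GRAPHICS", "PRICE_VALUE", "BALANCE", "ENGAGEMENT"]


def multi_hot(active, labels=LABELS):
    """Encode a set of aspect names as a 0/1 vector in label order."""
    return [1.0 if name in active else 0.0 for name in labels]


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy with one independent sigmoid per label.

    Uses the numerically stable form max(x, 0) - x*y + log(1 + exp(-|x|)).
    """
    losses = [
        max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
        for x, y in zip(logits, targets)
    ]
    return sum(losses) / len(losses)


# Hypothetical example: a review that mentions graphics and optimization.
targets = multi_hot({"GRAPHICS", "OPTIMIZATION"})
logits = [-2.0, 1.5, 3.0, -1.0, -0.5, -0.2]  # made-up model outputs
loss = bce_with_logits(logits, targets)
print(targets, round(loss, 4))
```

Because each label gets its own sigmoid, a review can activate several aspects at once, which is why this loss fits the task better than a softmax cross-entropy would.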
config.json ADDED

+{
+    "model_type": "electra",
+    "architectures": [
+        "AspectClassifier"
+    ],
+    "hidden_size": 768,
+    "num_labels": 6,
+    "problem_type": "multi_label_classification",
+    "base_model_name": "beomi/KcELECTRA-base",
+    "id2label": {
+        "0": "STORY",
+        "1": "OPTIMIZATION",
+        "2": "GRAPHICS",
+        "3": "PRICE_VALUE",
+        "4": "BALANCE",
+        "5": "ENGAGEMENT"
+    },
+    "label2id": {
+        "STORY": 0,
+        "OPTIMIZATION": 1,
+        "GRAPHICS": 2,
+        "PRICE_VALUE": 3,
+        "BALANCE": 4,
+        "ENGAGEMENT": 5
+    }
+}
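With `problem_type` set to `multi_label_classification`, each of the six logits is read independently, so turning model output into aspect names needs a per-label threshold rather than an argmax. A minimal sketch of that decoding step follows, using the `id2label` mapping above. The 0.5 threshold, the `decode_aspects` helper, and the example logits are assumptions for illustration; the commit does not state which threshold the author uses.

```python
import math

# id2label as in config.json; JSON stores the keys as strings, so convert them.
RAW_ID2LABEL = {
    "0": "STORY",
    "1": "OPTIMIZATION",
    "2": "GRAPHICS",
    "3": "PRICE_VALUE",
    "4": "BALANCE",
    "5": "ENGAGEMENT",
}
ID2LABEL = {int(idx): name for idx, name in RAW_ID2LABEL.items()}


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_aspects(logits, threshold=0.5):
    """Return (label, probability) pairs whose probability meets the threshold."""
    probs = [sigmoid(x) for x in logits]
    return [
        (ID2LABEL[i], round(p, 3))
        for i, p in enumerate(probs)
        if p >= threshold
    ]


# Made-up logits for one review, in label-index order.
example_logits = [-2.0, 1.5, 3.0, -1.0, -0.5, -0.2]
print(decode_aspects(example_logits))
```

In practice the threshold is often tuned per label on a validation set, since rare aspects may need a lower cutoff.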

phase1_aspect_classifier.pt ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:f87fb67e775644c306fa43b12a2b039f5ee390fcd7e5c0142f562fbe242dfce7
+size 434067534

phase2_sentiment_classifier.pt ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:f9683dfbcec037ac9e1e0a708a53f3da0d99fc584e77cbb38ac6a930502242b8
+size 434086645

pytorch_model.bin ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:6ff4aada1398c5a2e7c54e8c6f553f00547713e4ecfd75c3e3a166854167b607
+size 434062463
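The three weight files appear in the diff only as Git LFS pointers: a spec version, a SHA-256 object ID, and a byte size. After downloading a checkpoint, those two values can be used to confirm the file is intact. The sketch below parses a pointer and checks a local file against it; the function names and the idea of verifying by hand are illustrative assumptions, not part of this commit.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file has the pointer's size and SHA-256 digest."""
    hasher = hashlib.sha256()  # assumes pointer["algo"] == "sha256"
    size = 0
    with open(path, "rb") as fh:
        # Read in chunks so a ~434 MB checkpoint is never fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer contents for pytorch_model.bin, copied from the diff above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:6ff4aada1398c5a2e7c54e8c6f553f00547713e4ecfd75c3e3a166854167b607
size 434062463
"""
pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("pytorch_model.bin", pointer)` on a downloaded copy (the local path is hypothetical) would return `False` for a truncated or corrupted file.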

special_tokens_map.json ADDED

+{
+  "cls_token": {
+    "content": "[CLS]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "mask_token": {
+    "content": "[MASK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "[PAD]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "sep_token": {
+    "content": "[SEP]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "[UNK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED

The diff for this file is too large to render.

tokenizer_config.json ADDED

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "3": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "4": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "clean_up_tokenization_spaces": false,
+  "cls_token": "[CLS]",
+  "do_basic_tokenize": true,
+  "do_lower_case": false,
+  "extra_special_tokens": {},
+  "mask_token": "[MASK]",
+  "model_max_length": 512,
+  "never_split": null,
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "strip_accents": null,
+  "tokenize_chinese_chars": true,
+  "tokenizer_class": "PreTrainedTokenizerFast",
+  "unk_token": "[UNK]"
+}