visolex/phobert-absa-smartphone

Text Classification

AnnyNguyen committed on Jul 28, 2025

Commit d5d7013 · verified · 1 Parent(s): 73f6b14

Create README.md

Files changed (1)

README.md +88 -0

README.md ADDED

	@@ -0,0 +1,88 @@

+---
+license: apache-2.0
+datasets:
+- visolex/ViSFD
+language:
+- vi
+base_model:
+- vinai/phobert-base
+pipeline_tag: text-classification
+---
+Fine-tuned from `vinai/phobert-base` on the `visolex/ViSFD` dataset for joint aspect detection and sentiment classification (shared heads).
+**Model Details**
+* **Base Model:** vinai/phobert-base
+* **Dataset:** visolex/ViSFD
+* **Fine-tuning framework:** Hugging Face Transformers
+**Hyperparameters**
+* Batch size: 32
+* Learning rate: 3e-5
+* Epochs: 100
+* Max sequence length: 256
+* Early stopping patience: 5
+**Usage**
+```python
+import torch
+from transformers import AutoTokenizer, AutoModel
+# Aspect and sentiment label lists
+aspect_labels = [
+    "BATTERY", "CAMERA", "DESIGN", "FEATURES", "GENERAL",
+    "PERFORMANCE", "PRICE", "SCREEN", "SERandACC", "STORAGE"
+]
+sentiment_labels = ["POSITIVE", "NEGATIVE", "NEUTRAL"]
+# 1) Load the tokenizer and model (must resolve to the custom TransformerForABSA class)
+repo = "visolex/phobert-absa-smartphone"
+tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
+model     = AutoModel.from_pretrained(repo, trust_remote_code=True)
+model.eval()
+def predict_absa_multi(
+    text: str,
+    aspect_labels: list[str],
+    sentiment_labels: list[str],
+    threshold: float = 0.5
+) -> list[tuple[str,str]]:
+    inputs = tokenizer(
+        text,
+        return_tensors="pt",
+        padding=True,
+        truncation=True,
+        max_length=256
+    )
+    inputs.pop("token_type_ids", None)
+    with torch.no_grad():
+        out = model(**inputs)
+    # out.logits has shape [1, A, S+1]
+    logits = out.logits.squeeze(0)          # [A, S+1]
+    probs  = torch.softmax(logits, dim=-1)  # [A, S+1]
+    num_s  = len(sentiment_labels)
+    none_id = probs.size(-1) - 1            # index of the "none" class
+    results = []
+    for i, asp in enumerate(aspect_labels):
+        prob_i = probs[i]
+        pred_id = int(prob_i.argmax().item())
+        if pred_id != none_id and pred_id < num_s:
+            score = prob_i[pred_id].item()
+            if score >= threshold:
+                results.append((asp, sentiment_labels[pred_id].lower()))
+    return results
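+# Sample Vietnamese review, roughly: "bought it a week ago, the four-thousand battery is terrible, the touchscreen is a bit laggy, SIM detection is faulty."
+text = "mới mua được một tuần pin bốn nghìn mà quá tệ cảm ứng hơi đơ nhận sim bị lỗi."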
+preds = predict_absa_multi(text, aspect_labels, sentiment_labels, threshold=0.2)
+print(preds)
+# ➔ [('BATTERY','negative'), ('PERFORMANCE','negative'), ...]
+```
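
Note on the decoding step in the usage snippet above: it can be tried without downloading the model. The following is a minimal sketch, not the repository's code. It assumes, as the snippet's comments state, that each aspect gets a row of `S+1` logits whose last index is the "none" class. The names `softmax`, `decode`, and `toy_logits` are hypothetical, and the logit values are made up for illustration.

```python
import math

ASPECTS = ["BATTERY", "CAMERA", "DESIGN"]          # subset of the aspect list, for illustration
SENTIMENTS = ["POSITIVE", "NEGATIVE", "NEUTRAL"]   # index len(SENTIMENTS) is assumed to be "none"


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, aspects, sentiments, threshold=0.5):
    """Keep (aspect, sentiment, score) where the top class is not "none" and score >= threshold."""
    none_id = len(sentiments)  # assumption: "none" is the last class
    results = []
    for aspect, row in zip(aspects, logits):
        probs = softmax(row)
        pred_id = max(range(len(probs)), key=probs.__getitem__)
        if pred_id != none_id and probs[pred_id] >= threshold:
            results.append((aspect, sentiments[pred_id].lower(), round(probs[pred_id], 3)))
    return results


# Made-up logits with shape [A, S+1]; they are not outputs of the real model.
toy_logits = [
    [0.1, 3.0, 0.2, 0.5],   # BATTERY: "negative" clearly dominates
    [0.0, 0.1, 0.2, 4.0],   # CAMERA: "none" dominates, so the aspect is skipped
    [1.0, 0.9, 0.8, 0.7],   # DESIGN: "positive" wins, but with low confidence
]

decoded = decode(toy_logits, ASPECTS, SENTIMENTS, threshold=0.5)
print(decoded)  # only BATTERY passes at threshold 0.5
```

Lowering `threshold` (the usage snippet calls the function with 0.2) also admits low-confidence rows such as DESIGN here, trading precision for recall.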