+---
+license: apache-2.0
+base_model: mbert
+tags:
+- vietnamese
+- aspect-based-sentiment-analysis
+- VLSP-ABSA
+datasets:
+- visolex/VLSP2018-ABSA-Hotel
+metrics:
+- accuracy
+- macro-f1
+model-index:
+- name: mbert-absa-hotel
+  results:
+  - task:
+      type: text-classification
+      name: Aspect-based Sentiment Analysis
+    dataset:
+      name: VLSP2018-ABSA-Hotel
+      type: VLSP-ABSA
+    metrics:
+    - type: accuracy
+      value: 0.9524
+    - type: macro-f1
+      value: 0.5098
+    - type: macro_precision
+      value: 0.7107
+    - type: macro_recall
+      value: 0.4263
+---
+# mbert-absa-hotel: Aspect-based Sentiment Analysis for Vietnamese Reviews
+This model is a fine-tuned version of [mbert](https://huggingface.co/mbert)
+on the **VLSP2018-ABSA-Hotel** dataset for aspect-based sentiment analysis of Vietnamese hotel reviews.
+## Model Details
+* **Base Model**: mbert
+* **Description**: mBERT for Vietnamese ABSA
+* **Dataset**: VLSP2018-ABSA-Hotel
+* **Fine-tuning Framework**: HuggingFace Transformers
+* **Task**: Aspect-based Sentiment Classification (3 classes)
+### Hyperparameters
+* Batch size: `32`
+* Learning rate: `3e-5`
+* Epochs: `100`
+* Max sequence length: `256`
+* Weight decay: `0.01`
+* Warmup steps: `500`
+* Optimizer: AdamW
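As a rough sketch, the hyperparameters above could be written as keyword arguments for the `TrainingArguments` class in HuggingFace Transformers. The authors' training script is not shown, so the mapping onto these parameter names, the `output_dir` value, the choice of `adamw_torch`, and the reading of "batch size" as per device are all assumptions. The maximum sequence length is a tokenization setting, so it is kept separately.

```python
# Hypothetical mapping of the listed hyperparameters onto
# transformers.TrainingArguments keyword names (an assumption, not the
# authors' script).
training_kwargs = {
    "output_dir": "mbert-absa-hotel-run",  # hypothetical path
    "per_device_train_batch_size": 32,     # assumes "batch size" is per device
    "learning_rate": 3e-5,
    "num_train_epochs": 100,
    "weight_decay": 0.01,
    "warmup_steps": 500,
    "optim": "adamw_torch",                # AdamW; the exact variant is an assumption
}

# Applied at tokenization time, e.g.
#   tokenizer(text, truncation=True, max_length=MAX_SEQ_LENGTH)
MAX_SEQ_LENGTH = 256

# With transformers installed, the dict can be unpacked directly:
#   from transformers import TrainingArguments
#   args = TrainingArguments(**training_kwargs)
print(training_kwargs)
```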
+## Dataset
+The model was trained on the **VLSP2018-ABSA-Hotel** dataset for aspect-based sentiment analysis.
+### Sentiment Labels:
+* **0 - Negative** (Tiêu cực): Negative opinions
+* **1 - Neutral** (Trung lập): Neutral, objective opinions
+* **2 - Positive** (Tích cực): Positive opinions
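The list above can be expressed as an id-to-label mapping. This is a minimal sketch that follows the order given in this section; the names `ID2LABEL`, `LABEL2ID`, and `decode_sentiment` are illustrative. The usage example later in this card lists the labels in a different order, so the mapping should be checked against the model's own configuration before use.

```python
# Id-to-label mapping following the "Sentiment Labels" list (assumed order).
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def decode_sentiment(class_id: int) -> str:
    """Return the sentiment name for a class id, or "unknown" if unmapped."""
    return ID2LABEL.get(class_id, "unknown")


print(decode_sentiment(2))
```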
+### Aspect Categories:
+The model is trained to predict sentiment for the following aspect categories:
+- **FACILITIES#CLEANLINESS**
+- **FACILITIES#COMFORT**
+- **FACILITIES#DESIGN&FEATURES**
+- **FACILITIES#GENERAL**
+- **FACILITIES#MISCELLANEOUS**
+- **FACILITIES#PRICES**
+- **FACILITIES#QUALITY**
+- **FOOD&DRINKS#MISCELLANEOUS**
+- **FOOD&DRINKS#PRICES**
+- **FOOD&DRINKS#QUALITY**
+- **FOOD&DRINKS#STYLE&OPTIONS**
+- **HOTEL#CLEANLINESS**
+- **HOTEL#COMFORT**
+- **HOTEL#DESIGN&FEATURES**
+- **HOTEL#GENERAL**
+- **HOTEL#MISCELLANEOUS**
+- **HOTEL#PRICES**
+- **HOTEL#QUALITY**
+- **LOCATION#GENERAL**
+- **ROOMS#CLEANLINESS**
+- **ROOMS#COMFORT**
+- **ROOMS#DESIGN&FEATURES**
+- **ROOMS#GENERAL**
+- **ROOMS#MISCELLANEOUS**
+- **ROOMS#PRICES**
+- **ROOMS#QUALITY**
+- **ROOM_AMENITIES#CLEANLINESS**
+- **ROOM_AMENITIES#COMFORT**
+- **ROOM_AMENITIES#DESIGN&FEATURES**
+- **ROOM_AMENITIES#GENERAL**
+- **ROOM_AMENITIES#MISCELLANEOUS**
+- **ROOM_AMENITIES#PRICES**
+- **ROOM_AMENITIES#QUALITY**
+- **SERVICE#GENERAL**
+## Evaluation Results
+The model was evaluated on the test set with the following metrics:
+* **Accuracy**: `0.9524`
+* **Macro-F1**: `0.5098`
+* **Weighted-F1**: `0.7670`
+* **Macro-Precision**: `0.7107`
+* **Macro-Recall**: `0.4263`
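The gap between accuracy and macro-F1 is easier to read with the metric definitions in hand: macro-averaged scores weight every class equally, so a class that is rarely predicted correctly can pull them down even when overall accuracy is high. The sketch below computes accuracy and macro precision, recall, and F1 in plain Python on a small made-up label set. The function name and the toy data are illustrative only; they are not the model's predictions or the authors' evaluation code.

```python
def macro_scores(y_true, y_pred):
    """Return (accuracy, macro_precision, macro_recall, macro_f1)."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    # Macro-F1 is the mean of per-class F1, not the F1 of the macro averages.
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy, imbalanced data: 18 samples of class 0 and 2 samples of class 1.
y_true = [0] * 18 + [1] * 2
y_pred = [0] * 20  # a classifier that always predicts the majority class

acc, macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
print(f"accuracy={acc:.2f} macro_f1={macro_f1:.2f}")
# prints: accuracy=0.90 macro_f1=0.47
```

In this toy case accuracy is 0.90 while macro-F1 is about 0.47, because the minority class contributes an F1 of zero.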
+## Usage Example
+```python
+import torch
+from transformers import AutoTokenizer, AutoModel
+# Load model and tokenizer
+repo = "visolex/mbert-absa-hotel"
+tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
+model = AutoModel.from_pretrained(repo, trust_remote_code=True)
+model.eval()
+# Aspect labels for VLSP2018-ABSA-Hotel
+aspect_labels = [
+    "FACILITIES#CLEANLINESS",
+    "FACILITIES#COMFORT",
+    "FACILITIES#DESIGN&FEATURES",
+    "FACILITIES#GENERAL",
+    "FACILITIES#MISCELLANEOUS",
+    "FACILITIES#PRICES",
+    "FACILITIES#QUALITY",
+    "FOOD&DRINKS#MISCELLANEOUS",
+    "FOOD&DRINKS#PRICES",
+    "FOOD&DRINKS#QUALITY",
+    "FOOD&DRINKS#STYLE&OPTIONS",
+    "HOTEL#CLEANLINESS",
+    "HOTEL#COMFORT",
+    "HOTEL#DESIGN&FEATURES",
+    "HOTEL#GENERAL",
+    "HOTEL#MISCELLANEOUS",
+    "HOTEL#PRICES",
+    "HOTEL#QUALITY",
+    "LOCATION#GENERAL",
+    "ROOMS#CLEANLINESS",
+    "ROOMS#COMFORT",
+    "ROOMS#DESIGN&FEATURES",
+    "ROOMS#GENERAL",
+    "ROOMS#MISCELLANEOUS",
+    "ROOMS#PRICES",
+    "ROOMS#QUALITY",
+    "ROOM_AMENITIES#CLEANLINESS",
+    "ROOM_AMENITIES#COMFORT",
+    "ROOM_AMENITIES#DESIGN&FEATURES",
+    "ROOM_AMENITIES#GENERAL",
+    "ROOM_AMENITIES#MISCELLANEOUS",
+    "ROOM_AMENITIES#PRICES",
+    "ROOM_AMENITIES#QUALITY",
+    "SERVICE#GENERAL"
+]
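+# Sentiment labels, indexed by predicted class id.
+# NOTE: this order differs from the "Sentiment Labels" list above
+# (0 = negative, 1 = neutral, 2 = positive); verify the mapping against
+# the model's config before relying on it.
+sentiment_labels = ["POSITIVE", "NEGATIVE", "NEUTRAL"]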
+# Example review text
+text = "Khách sạn rất sạch sẽ, phòng ốc thoải mái nhưng giá hơi cao."
+# Tokenize
+inputs = tokenizer(
+    text,
+    return_tensors="pt",
+    padding=True,
+    truncation=True,
+    max_length=256
+)
+inputs.pop("token_type_ids", None)
+# Predict
+with torch.no_grad():
+    outputs = model(**inputs)
+# Get logits: shape [1, num_aspects, num_sentiments + 1]
+logits = outputs.logits.squeeze(0)  # [num_aspects, num_sentiments + 1]
+probs = torch.softmax(logits, dim=-1)
+# Predict for each aspect
+none_id = probs.size(-1) - 1  # Index of "none" class
+results = []
+for i, aspect in enumerate(aspect_labels):
+    prob_i = probs[i]
+    pred_id = int(prob_i.argmax().item())
+    if pred_id != none_id and pred_id < len(sentiment_labels):
+        score = prob_i[pred_id].item()
+        if score >= 0.5:  # threshold
+            results.append((aspect, sentiment_labels[pred_id].lower()))
+print(f"Text: {text}")
+print(f"Predicted aspects: {results}")
+# Output: a list of (aspect_label, sentiment) tuples, one per aspect whose
+# top class is not "none" and has probability >= 0.5.
+```
+## Citation
+If you use this model, please cite:
+```bibtex
+@misc{visolex_absa_mbert_absa_hotel,
+  title={mBERT for Vietnamese Aspect-based Sentiment Analysis},
+  author={ViSoLex Team},
+  year={2025},
+  url={https://huggingface.co/visolex/mbert-absa-hotel}
+}
+```
+## License
+This model is released under the Apache-2.0 license.
+## Acknowledgments
+* Base model: [mbert](https://huggingface.co/mbert)
+* Dataset: VLSP2018-ABSA-Hotel
+* ViSoLex Toolkit
+---