---
license: apache-2.0
base_model: bartpho
tags:
- vietnamese
- aspect-based-sentiment-analysis
- VLSP-ABSA
datasets:
- visolex/VLSP2018-ABSA-Hotel
metrics:
- accuracy
- macro-f1
model-index:
- name: bartpho-absa-hotel
  results:
  - task:
      type: text-classification
      name: Aspect-based Sentiment Analysis
    dataset:
      name: VLSP2018-ABSA-Hotel
      type: VLSP-ABSA
    metrics:
    - type: accuracy
      value: 0.9016
    - type: macro-f1
      value: 0.1161
    - type: macro_precision
      value: 0.2827
    - type: macro_recall
      value: 0.0730
---
# bartpho-absa-hotel: Aspect-based Sentiment Analysis for Vietnamese Reviews
This model is a fine-tuned version of [bartpho](https://huggingface.co/bartpho)
on the **VLSP2018-ABSA-Hotel** dataset for aspect-based sentiment analysis of Vietnamese hotel reviews.
## Model Details
* **Base Model**: bartpho
* **Description**: BartPho for Vietnamese ABSA
* **Dataset**: VLSP2018-ABSA-Hotel
* **Fine-tuning Framework**: HuggingFace Transformers
* **Task**: Aspect-based Sentiment Classification (3 classes)
### Hyperparameters
* Batch size: `32`
* Learning rate: `3e-5`
* Epochs: `100`
* Max sequence length: `256`
* Weight decay: `0.01`
* Warmup steps: `500`
* Optimizer: AdamW
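As a rough sketch, the values above could be written as keyword arguments for `transformers.TrainingArguments`. The card does not say whether the `Trainer` API was used, so this mapping is an assumption. The names `HYPERPARAMS`, `MAX_SEQ_LENGTH`, and `build_training_args` are hypothetical, and treating the batch size of 32 as a per-device value is a guess.

```python
# Hypothetical mapping of the listed hyperparameters onto
# transformers.TrainingArguments keyword names. Assumption: the card does
# not state that the Trainer API was used for fine-tuning.
HYPERPARAMS = {
    "per_device_train_batch_size": 32,  # assumes 32 is the per-device batch size
    "learning_rate": 3e-5,
    "num_train_epochs": 100,
    "weight_decay": 0.01,
    "warmup_steps": 500,
    "optim": "adamw_torch",  # AdamW
}

# Max sequence length is applied at tokenization time,
# not through TrainingArguments.
MAX_SEQ_LENGTH = 256


def build_training_args(output_dir: str):
    """Build TrainingArguments from HYPERPARAMS (requires transformers and torch)."""
    from transformers import TrainingArguments  # imported lazily on purpose

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

Keeping the values in a plain dictionary makes them easy to log or compare across runs before any training object is created.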
## Dataset
The model was trained on the **VLSP2018 ABSA Hotel** dataset for aspect-based sentiment analysis.
### Sentiment Labels:
* **0 - Negative** (Tiêu cực): Negative opinions
* **1 - Neutral** (Trung lập): Neutral, objective opinions
* **2 - Positive** (Tích cực): Positive opinions
### Aspect Categories:
The model is trained to predict sentiment for the following aspects:
- **FACILITIES#CLEANLINESS**
- **FACILITIES#COMFORT**
- **FACILITIES#DESIGN&FEATURES**
- **FACILITIES#GENERAL**
- **FACILITIES#MISCELLANEOUS**
- **FACILITIES#PRICES**
- **FACILITIES#QUALITY**
- **FOOD&DRINKS#MISCELLANEOUS**
- **FOOD&DRINKS#PRICES**
- **FOOD&DRINKS#QUALITY**
- **FOOD&DRINKS#STYLE&OPTIONS**
- **HOTEL#CLEANLINESS**
- **HOTEL#COMFORT**
- **HOTEL#DESIGN&FEATURES**
- **HOTEL#GENERAL**
- **HOTEL#MISCELLANEOUS**
- **HOTEL#PRICES**
- **HOTEL#QUALITY**
- **LOCATION#GENERAL**
- **ROOMS#CLEANLINESS**
- **ROOMS#COMFORT**
- **ROOMS#DESIGN&FEATURES**
- **ROOMS#GENERAL**
- **ROOMS#MISCELLANEOUS**
- **ROOMS#PRICES**
- **ROOMS#QUALITY**
- **ROOM_AMENITIES#CLEANLINESS**
- **ROOM_AMENITIES#COMFORT**
- **ROOM_AMENITIES#DESIGN&FEATURES**
- **ROOM_AMENITIES#GENERAL**
- **ROOM_AMENITIES#MISCELLANEOUS**
- **ROOM_AMENITIES#PRICES**
- **ROOM_AMENITIES#QUALITY**
- **SERVICE#GENERAL**
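Each category name follows an `ENTITY#ATTRIBUTE` pattern. The sketch below splits labels on `#` and groups attributes by entity, which can help when aggregating predictions per entity. The helper names `split_aspect` and `group_by_entity` are hypothetical and not part of the model repository.

```python
from collections import defaultdict


def split_aspect(label: str) -> tuple[str, str]:
    """Split 'ENTITY#ATTRIBUTE' into (entity, attribute)."""
    entity, attribute = label.split("#", 1)
    return entity, attribute


def group_by_entity(labels: list[str]) -> dict[str, list[str]]:
    """Map each entity to the attributes that occur with it, in input order."""
    grouped = defaultdict(list)
    for label in labels:
        entity, attribute = split_aspect(label)
        grouped[entity].append(attribute)
    return dict(grouped)


# A few labels from the list above, for illustration only.
sample = [
    "HOTEL#CLEANLINESS",
    "HOTEL#PRICES",
    "ROOM_AMENITIES#QUALITY",
    "SERVICE#GENERAL",
]
print(group_by_entity(sample))
# {'HOTEL': ['CLEANLINESS', 'PRICES'], 'ROOM_AMENITIES': ['QUALITY'], 'SERVICE': ['GENERAL']}
```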
## Evaluation Results
The model was evaluated on the test set with the following metrics:
* **Accuracy**: `0.9016`
* **Macro-F1**: `0.1161`
* **Weighted-F1**: `0.2486`
* **Macro-Precision**: `0.2827`
* **Macro-Recall**: `0.0730`
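Accuracy and macro-averaged scores can diverge sharply when one class dominates the labels, because macro averaging weights every class equally. The card does not state whether that explains the numbers above. The sketch below only shows how the two kinds of metric are computed, on made-up toy data. It is not the evaluation script used for this model, and `macro_scores` is a hypothetical helper.

```python
def macro_scores(y_true, y_pred, labels):
    """Return (accuracy, macro_precision, macro_recall, macro_f1)."""
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    n = len(labels)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy data (made up): 8 of 10 items are "none" and the
# classifier predicts "none" for everything.
labels = ["none", "positive", "negative"]
y_true = ["none"] * 8 + ["positive", "negative"]
y_pred = ["none"] * 10

acc, prec, rec, f1 = macro_scores(y_true, y_pred, labels)
print(f"accuracy={acc:.2f} macro_f1={f1:.2f}")
# accuracy=0.80 macro_f1=0.30
```

In this toy case the majority class alone yields 80% accuracy, while the two classes that are never predicted each contribute an F1 of zero to the macro average.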
## Usage Example
```python
import torch
from transformers import AutoTokenizer, AutoModel
# Load model and tokenizer
repo = "visolex/bartpho-absa-hotel"
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModel.from_pretrained(repo, trust_remote_code=True)
model.eval()
# Aspect labels for VLSP2018-ABSA-Hotel
aspect_labels = [
    "FACILITIES#CLEANLINESS",
    "FACILITIES#COMFORT",
    "FACILITIES#DESIGN&FEATURES",
    "FACILITIES#GENERAL",
    "FACILITIES#MISCELLANEOUS",
    "FACILITIES#PRICES",
    "FACILITIES#QUALITY",
    "FOOD&DRINKS#MISCELLANEOUS",
    "FOOD&DRINKS#PRICES",
    "FOOD&DRINKS#QUALITY",
    "FOOD&DRINKS#STYLE&OPTIONS",
    "HOTEL#CLEANLINESS",
    "HOTEL#COMFORT",
    "HOTEL#DESIGN&FEATURES",
    "HOTEL#GENERAL",
    "HOTEL#MISCELLANEOUS",
    "HOTEL#PRICES",
    "HOTEL#QUALITY",
    "LOCATION#GENERAL",
    "ROOMS#CLEANLINESS",
    "ROOMS#COMFORT",
    "ROOMS#DESIGN&FEATURES",
    "ROOMS#GENERAL",
    "ROOMS#MISCELLANEOUS",
    "ROOMS#PRICES",
    "ROOMS#QUALITY",
    "ROOM_AMENITIES#CLEANLINESS",
    "ROOM_AMENITIES#COMFORT",
    "ROOM_AMENITIES#DESIGN&FEATURES",
    "ROOM_AMENITIES#GENERAL",
    "ROOM_AMENITIES#MISCELLANEOUS",
    "ROOM_AMENITIES#PRICES",
    "ROOM_AMENITIES#QUALITY",
    "SERVICE#GENERAL",
]
# Sentiment labels
sentiment_labels = ["POSITIVE", "NEGATIVE", "NEUTRAL"]
# Example review text
text = "Khách sạn rất sạch sẽ, phòng ốc thoải mái nhưng giá hơi cao."
# Tokenize
inputs = tokenizer(
    text,
    return_tensors="pt",
    padding=True,
    truncation=True,
    max_length=256,
)
inputs.pop("token_type_ids", None)
# Predict
with torch.no_grad():
    outputs = model(**inputs)
# Get logits: shape [1, num_aspects, num_sentiments + 1]
logits = outputs.logits.squeeze(0)  # [num_aspects, num_sentiments + 1]
probs = torch.softmax(logits, dim=-1)
# Predict for each aspect
none_id = probs.size(-1) - 1  # Index of the "none" class
results = []
for i, aspect in enumerate(aspect_labels):
    prob_i = probs[i]
    pred_id = int(prob_i.argmax().item())
    # Keep the aspect only if a sentiment class (not "none") wins
    if pred_id != none_id and pred_id < len(sentiment_labels):
        score = prob_i[pred_id].item()
        if score >= 0.5:  # confidence threshold
            results.append((aspect, sentiment_labels[pred_id].lower()))
print(f"Text: {text}")
print(f"Predicted aspects: {results}")
# Output format: a list of (aspect, sentiment) tuples, e.g. [("<ASPECT>", "positive"), ...]
```
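The decoding step in the example can be isolated into a small function and checked on dummy values without downloading the model. The sketch below is a plain-Python version of that step. Like the example, it assumes the last index of each row is the "none" class. `decode_aspects`, `softmax`, and the dummy logits are hypothetical and only illustrate the logic.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def decode_aspects(logits, aspect_labels, sentiment_labels, threshold=0.5):
    """Return (aspect, sentiment) pairs whose top class is a sentiment
    (not "none") and whose probability reaches the threshold."""
    results = []
    for aspect, row in zip(aspect_labels, logits):
        probs = softmax(row)
        none_id = len(probs) - 1  # assumed: "none" is the last class
        pred_id = max(range(len(probs)), key=probs.__getitem__)
        if pred_id != none_id and pred_id < len(sentiment_labels):
            if probs[pred_id] >= threshold:
                results.append((aspect, sentiment_labels[pred_id].lower()))
    return results


# Dummy logits for two aspects; each row is [POSITIVE, NEGATIVE, NEUTRAL, none].
dummy_logits = [
    [4.0, 0.0, 0.0, 0.0],  # confident POSITIVE
    [0.0, 0.0, 0.0, 4.0],  # confident "none", so the aspect is dropped
]
dummy_aspects = ["HOTEL#CLEANLINESS", "HOTEL#PRICES"]
decoded = decode_aspects(dummy_logits, dummy_aspects, ["POSITIVE", "NEGATIVE", "NEUTRAL"])
print(decoded)
# [('HOTEL#CLEANLINESS', 'positive')]
```

Passing the sentiment label order as an argument keeps the index-to-label mapping explicit, so it can be checked against the model's actual output classes.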
## Citation
If you use this model, please cite:
```bibtex
@misc{visolex_absa_bartpho_absa_hotel,
title={BartPho for Vietnamese ABSA for Vietnamese Aspect-based Sentiment Analysis},
author={ViSoLex Team},
year={2025},
url={https://huggingface.co/visolex/bartpho-absa-hotel}
}
```
## License
This model is released under the Apache-2.0 license.
## Acknowledgments
* Base model: [bartpho](https://huggingface.co/bartpho)
* Dataset: VLSP2018-ABSA-Hotel
* ViSoLex Toolkit
---