PhayaThaiBERT for Thai Restaurant ABSA

Fine-tuned PhayaThaiBERT for Aspect-Based Sentiment Analysis (ABSA) on Thai restaurant reviews.

Model Description

Given a single Thai restaurant review, this model predicts a sentiment label for each of 10 aspects in one forward pass. It was fine-tuned as part of a Computer Science senior project at Kasetsart University (2025).

Aspects & Labels

10 Aspects: taste · food_quality · portion · atmosphere · cleanliness · location_parking · staff · speed · price · value

4 Sentiment Labels: positive · neutral · negative · not_available

not_available means the aspect was not mentioned in the review.

Training Details


Setting	Value
Base Model	clicknext/phayathaibert
Dataset	Wongnai Restaurant Review Dataset (19,938 reviews)
Labeling	LLM-Assisted Labeling via gpt-4.1-mini
Architecture	Transformer Encoder → Mean Pooling → 10 × Linear(768→4)
Optimizer	AdamW · Weight Decay = 0.01
Learning Rate	3.85e-05
Warmup Ratio	0.287
Dropout	0.374
Batch Size	8
Max Epochs	10 (early stopping was not triggered)
Best Epoch	10
Loss	Weighted Cross-Entropy · Class Weight Cap = 15.0
Precision	Mixed Precision (FP16)
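
The table names a weighted cross-entropy loss with a class weight cap of 15.0 but does not give the weighting formula. The sketch below assumes inverse-frequency ("balanced") weights clipped at the cap and computed separately for each aspect. The function names and the label counts are hypothetical and chosen only for illustration; they are not taken from the training code or the dataset.

```python
import math

LABELS = ['positive', 'neutral', 'negative', 'not_available']

def capped_class_weights(counts, cap=15.0):
    """Inverse-frequency weights, total / (n_classes * count), clipped at `cap`.

    Assumption: the model card does not state the formula; this is one
    common choice, not necessarily the one used in training.
    """
    total = sum(counts)
    n_classes = len(counts)
    return [min(total / (n_classes * max(c, 1)), cap) for c in counts]

def weighted_cross_entropy(logits, target, weights):
    """Weighted cross-entropy for one example and one aspect head."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    log_prob = logits[target] - log_sum_exp
    return -weights[target] * log_prob

# Hypothetical label counts for one aspect, in LABELS order.
counts = [500, 20, 80, 9400]
weights = capped_class_weights(counts)
print([round(w, 3) for w in weights])

# Loss for a made-up example whose true label is 'neutral' (index 1).
loss = weighted_cross_entropy([0.2, 1.5, -0.3, 0.1], target=1, weights=weights)
print(round(loss, 4))
```

Without the cap, the rare neutral class in this made-up example would receive a weight of 125, which can destabilize training; clipping keeps rare classes emphasized but bounded. In PyTorch, a weight vector like this can be passed as the weight argument of torch.nn.CrossEntropyLoss.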

Performance (Test Set)

Aspect	Macro-F1
taste	0.7342
food_quality	0.6510
portion	0.6584
atmosphere	0.7099
cleanliness	0.7426
location_parking	0.7119
staff	0.7387
speed	0.7348
price	0.7328
value	0.6254
Overall	0.7040
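
The Overall row is consistent with the unweighted mean of the ten per-aspect Macro-F1 scores, which the short check below reproduces from the table values. Reading Overall as this simple average is an inference from the numbers, not something the card states explicitly.

```python
# Per-aspect Macro-F1 scores copied from the table above.
macro_f1 = {
    'taste': 0.7342,
    'food_quality': 0.6510,
    'portion': 0.6584,
    'atmosphere': 0.7099,
    'cleanliness': 0.7426,
    'location_parking': 0.7119,
    'staff': 0.7387,
    'speed': 0.7348,
    'price': 0.7328,
    'value': 0.6254,
}

# Unweighted mean across aspects.
overall = sum(macro_f1.values()) / len(macro_f1)
print(f"{overall:.4f}")  # 0.7040
```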

Usage

import torch
import torch.nn as nn
from transformers import AutoModel, CamembertTokenizer
import huggingface_hub

ASPECTS = ['taste', 'food_quality', 'portion', 'atmosphere', 'cleanliness',
           'location_parking', 'staff', 'speed', 'price', 'value']
LABELS  = ['positive', 'neutral', 'negative', 'not_available']

class ABSAModel(nn.Module):
    def __init__(self, model_name, n_aspects=10, n_labels=4):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        self.heads = nn.ModuleList([
            nn.Linear(768, n_labels) for _ in range(n_aspects)
        ])

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        token_emb = out.last_hidden_state
        mask_exp  = attention_mask.unsqueeze(-1).expand(token_emb.size()).float()
        pooled    = torch.sum(token_emb * mask_exp, 1) / torch.clamp(mask_exp.sum(1), min=1e-9)
        return [head(pooled) for head in self.heads]

# Load model
MODEL_NAME = 'clicknext/phayathaibert'
tokenizer  = CamembertTokenizer.from_pretrained(MODEL_NAME)
model      = ABSAModel(MODEL_NAME)

heads_path = huggingface_hub.hf_hub_download(
    repo_id='Preangkwan/phayathaibert-absa-restaurant_2',
    filename='classifier_heads.pt'
)
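state = torch.load(heads_path, map_location='cpu')
# Load the fine-tuned head weights; without this step the heads keep their
# random initialization. Assumption: the keys in classifier_heads.pt match
# model.heads ('0.weight', '0.bias', ...); remap them first if they differ.
model.heads.load_state_dict(state)
model.eval()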

# Inference
def predict(text):
    enc = tokenizer(text, return_tensors='pt', max_length=512,
                    truncation=True, padding=True)
    with torch.no_grad():
        logits = model(enc['input_ids'], enc['attention_mask'])
    results = {}
    for i, asp in enumerate(ASPECTS):
        pred_idx = torch.argmax(logits[i], dim=-1).item()
        label    = LABELS[pred_idx]
        if label != 'not_available':
            results[asp] = label
    return results

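# Example
# (English: "The food is delicious and tastes great, but the service is slow
#  and there is no parking at all. The price is OK.")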
text = 'อาหารอร่อยมาก รสชาติดี แต่บริการช้าและที่จอดรถไม่มีเลย ราคาโอเคครับ'
print(predict(text))
# {'taste': 'positive', 'location_parking': 'negative',
#  'staff': 'negative', 'speed': 'negative',
#  'price': 'neutral', 'value': 'neutral'}
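
The predict function above drops aspects labelled not_available and discards the model's confidence. When both are wanted, the per-head logits can be passed through a softmax instead. The helper below is a hypothetical sketch (decode_logits is not part of the released code) that works on plain Python lists, so it can be tried without downloading the model; the logits in the example are made up.

```python
import math

ASPECTS = ['taste', 'food_quality', 'portion', 'atmosphere', 'cleanliness',
           'location_parking', 'staff', 'speed', 'price', 'value']
LABELS = ['positive', 'neutral', 'negative', 'not_available']

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def decode_logits(logits_per_aspect):
    """Map one list of 4 logits per aspect to a (label, probability) pair.

    Unlike predict(), aspects labelled not_available are kept.
    """
    if len(logits_per_aspect) != len(ASPECTS):
        raise ValueError(f"expected {len(ASPECTS)} heads, got {len(logits_per_aspect)}")
    decoded = {}
    for aspect, logits in zip(ASPECTS, logits_per_aspect):
        probs = softmax(logits)
        best = max(range(len(LABELS)), key=probs.__getitem__)
        decoded[aspect] = (LABELS[best], probs[best])
    return decoded

# Made-up logits: the first head favours 'positive', the rest 'not_available'.
dummy = [[3.0, 0.0, 0.0, 0.0]] + [[0.0, 0.0, 0.0, 2.0]] * 9
for aspect, (label, prob) in decode_logits(dummy).items():
    print(f"{aspect:17s} {label:14s} {prob:.2f}")
```

With the real model, the list returned by model(...) for a single review could be converted with decode_logits([l.squeeze(0).tolist() for l in logits]).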

Citation

If you use this model, please cite:

Khowinthawong, J. & Chatchadanukul, P. (2025).
Aspect-Based Sentiment Analysis of Thai Restaurant Reviews
Using Transformer-Based Models.
Computer Science Senior Project, Kasetsart University.

Related Model

WangchanBERTa version — Overall Macro-F1: 0.6971

Authors

Jirapat Khowinthawong
Preangkwan Chatchadanukul

Kasetsart University · Computer Science · 2025
