m-e5-small-hosrev

Overview

Vietnamese aspect-category sentiment classification model for hospital reviews. The model predicts sentiment for 13 aspect categories covering facilities, medical staff, and overall experience.

Model Details

  • Base model: intfloat/multilingual-e5-small
  • Architecture: absa (aspect-category sentiment classification head)
  • Checkpoint source: hosrev-e5-small-best.pt
  • Maximum sequence length (training and inference): 256
  • Number of aspect categories: 13

Label Schema

  • 0: aspect not mentioned
  • 1: positive
  • 2: negative
  • 3: neutral
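
The label schema above can be sketched as a simple mapping (the label-name strings here are this example's own, not part of the model's API):

```python
# 0-3 label ids from the schema above; names are illustrative.
LABELS = {0: "not_mentioned", 1: "positive", 2: "negative", 3: "neutral"}

def decode_row(label_ids):
    """Map a row of per-aspect label ids to human-readable labels."""
    return [LABELS[i] for i in label_ids]

# A review that praises staff quality and criticises price,
# leaving the other 11 aspects unmentioned:
row = [0, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 0]
print(decode_row(row))
```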

Aspect Categories

  • Cơ sở vật chất#Chất lượng (Facilities#Quality)
  • Cơ sở vật chất#Khác (Facilities#Other)
  • Cơ sở vật chất#Không gian (Facilities#Space)
  • Cơ sở vật chất#Vệ sinh (Facilities#Hygiene)
  • Nhân viên y tế#Chất lượng (Medical staff#Quality)
  • Nhân viên y tế#Khác (Medical staff#Other)
  • Nhân viên y tế#Thái độ (Medical staff#Attitude)
  • Trải nghiệm chung#Chất lượng (Overall experience#Quality)
  • Trải nghiệm chung#Giá (Overall experience#Price)
  • Trải nghiệm chung#Khác (Overall experience#Other)
  • Trải nghiệm chung#Không gian (Overall experience#Space)
  • Trải nghiệm chung#Thái độ (Overall experience#Attitude)
  • Trải nghiệm chung#Vệ sinh (Overall experience#Hygiene)

Dataset

  • Dataset: HosRev. HosRev is a Vietnamese hospital review dataset for Aspect-Category Sentiment Analysis (ACSA), containing reviews of hospitals in Ho Chi Minh City annotated with aspect-category sentiment labels.

Data Format

  • Review is the input text column.
  • Each remaining column is one aspect-category label encoded as 0/1/2/3.
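
A hypothetical row in this format (column names follow the aspect list above; the review text and label values are made up for illustration):

```python
# One "Review" text column plus one 0/1/2/3 column per aspect category.
row = {
    "Review": "Bệnh viện sạch sẽ nhưng giá hơi cao.",  # "Clean hospital, but a bit pricey."
    "Cơ sở vật chất#Vệ sinh": 1,   # positive
    "Trải nghiệm chung#Giá": 2,    # negative
    "Nhân viên y tế#Thái độ": 0,   # not mentioned
}

# Collect only the aspects the review actually mentions (label != 0):
mentioned = {k: v for k, v in row.items() if k != "Review" and v != 0}
print(mentioned)
```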

Splits

  • Train: 4566 samples
  • Validation: 978 samples
  • Test: 979 samples

Checkpoint Metrics

  • loss: 0.2834
  • accuracy: 0.8970

Usage

Load the model with trust_remote_code=True because this repository contains custom modeling code.

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "NeoCyber/m-e5-small-hosrev"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(
    repo_id,
    trust_remote_code=True,
)
model.eval()

# "The doctor was enthusiastic and explained things very clearly."
texts = ["Bác sĩ nhiệt tình và giải thích rất dễ hiểu."]
inputs = tokenizer(
    texts, return_tensors="pt", truncation=True, padding=True, max_length=256
)
with torch.no_grad():
    outputs = model(**inputs)
predictions = model.decode_predictions(outputs.logits)
print(predictions)

Notes

  • The repository includes custom configuration_*.py and modeling_*.py files required by transformers AutoClasses.
  • outputs.logits has shape [batch_size, num_aspects, 4] and model.decode_predictions(...) maps logits back to aspect-level labels.
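
The per-aspect decoding can be illustrated with plain lists standing in for the [batch_size, num_aspects, 4] logits tensor (model.decode_predictions(...) in the repository additionally maps results back to aspect names; this is only a sketch of the argmax step):

```python
# Argmax over the last dimension of [batch_size, num_aspects, 4] logits:
# for each aspect, pick the class id (0-3) with the highest logit.
def argmax_decode(logits):
    return [
        [max(range(4), key=lambda c: aspect[c]) for aspect in sample]
        for sample in logits
    ]

# One sample, two aspects: the first is clearly positive (class 1),
# the second clearly not mentioned (class 0).
logits = [[[0.1, 3.2, -1.0, 0.0], [4.0, 0.2, 0.1, -0.3]]]
print(argmax_decode(logits))  # [[1, 0]]
```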