Herbal-Sentiment-BERT

Model Description

This model is a fine-tuned version of bert-base-chinese for sentiment analysis of Chinese herbal medicine e-commerce reviews. It classifies each customer review into one of three sentiment categories: Negative (0), Neutral (1), or Positive (2).

The model was optimized for highly imbalanced datasets in which positive samples dominate. It captures deep semantic relationships and domain-specific terminology (e.g., herb names and symptom descriptions), which helps mitigate the risk of overfitting to the majority class.

Intended Uses & Limitations

  • Intended Use: Automated sentiment tagging for traditional Chinese medicine (TCM) product reviews, customer feedback analysis, and e-commerce rating systems.
  • Limitations: The model is trained exclusively on TCM-related texts. Its performance may degrade on general-domain text or on reviews from unrelated product categories (e.g., modern electronics).

Training and Evaluation Data

The model was fine-tuned on a dataset comprising over 210,000 authentic user reviews from herbal medicine e-commerce platforms.

On a held-out test set that preserves the imbalanced class distribution, the model achieved the scores below, significantly outperforming sequence-based baselines (e.g., Bi-LSTM + Attention) in minority-class identification:

  • Accuracy: 89.36%
  • Macro F1-Score: 77.08%
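The gap between the two scores is expected on imbalanced data: accuracy is dominated by the majority class, while macro F1 averages the per-class F1 scores with equal weight. The following is a minimal sketch of both metrics in plain Python. The toy labels are made up for illustration and are not taken from the evaluation set, and the helper names `accuracy` and `macro_f1` are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, num_classes=3):
    """Unweighted mean of per-class F1 scores."""
    f1_scores = []
    for c in range(num_classes):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / num_classes


# Toy data (illustrative only): 8 Positive, 1 Neutral, 1 Negative review.
y_true = [2] * 8 + [1, 0]
# A degenerate classifier that always predicts the majority class.
y_pred = [2] * 10

print(f"Accuracy: {accuracy(y_true, y_pred):.2%}")
print(f"Macro F1: {macro_f1(y_true, y_pred):.2%}")
```

In this toy case the always-Positive classifier reaches 80% accuracy but under 30% macro F1, because it scores zero on both minority classes. This is why macro F1 is the more informative number for minority-class performance.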

How to use

Load the model and tokenizer with the transformers library:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "1hugh/Herbal-Sentiment-BERT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Example review: "This angelica root is moldy and tastes terrible!"
text = "这当归发霉了,味道极差!"
# Truncate long reviews to BERT's 512-token input limit
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Inference only, so skip gradient tracking
with torch.no_grad():
    outputs = model(**inputs)
predictions = outputs.logits.argmax(dim=-1)

# Labels: 0 -> Negative, 1 -> Neutral, 2 -> Positive
print(f"Predicted class: {predictions.item()}")
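The snippet above prints a bare class id. A small post-processing step can turn the logits into a readable label with a confidence score. This is a minimal sketch in plain Python: the `ID2LABEL` mapping follows the label scheme stated in this card, the `decode` helper is hypothetical, and the logits are invented for illustration (in practice they could come from `outputs.logits[0].tolist()`).

```python
import math

# Label scheme as documented in this card.
ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}


def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label name, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits for one review, not real model output.
label, confidence = decode([2.1, 0.3, -1.5])
print(f"{label} ({confidence:.2f})")
```

Reporting the probability alongside the label makes it possible to route low-confidence predictions to manual review, which may help when the input drifts away from TCM reviews.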
