---
language: multilingual
license: apache-2.0
tags:
  - sentiment-analysis
  - text-classification
  - xlm-roberta
  - amazon-reviews
datasets:
  - amazon-reviews
metrics:
  - accuracy
model-index:
  - name: anpmts/sentiment-classifier
    results:
      - task:
          type: text-classification
          name: Sentiment Analysis
        dataset:
          type: amazon-reviews
          name: Amazon Reviews
        metrics:
          - type: accuracy
            value: 0.924
            name: Validation Accuracy
---

Sentiment Classifier - XLM-RoBERTa

This is a sentiment classification model fine-tuned on the Amazon Reviews dataset.

Model Description

  • Base Model: xlm-roberta-base
  • Task: Sentiment Classification (negative/neutral/positive)
  • Architecture: Sequence Classification (single-head)
  • Languages: Multilingual (100+ languages)
  • Parameters: 278M

Training Data

  • Dataset: Amazon Reviews (Kaggle)
  • Training Samples: 8,500
  • Validation Samples: 1,500
  • Test Samples: 5,000
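
The split above corresponds to an 8,500/1,500 train/validation split drawn from a 10,000-example labeled pool, with a separate 5,000-example test set. A minimal standard-library sketch of such a split (the field layout and seed are assumptions, not taken from the training code):

```python
import random

def train_val_split(examples, val_size=1500, seed=42):
    """Shuffle a list of examples and split off a validation set."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    return shuffled[val_size:], shuffled[:val_size]

# Illustrative: 10,000 labeled (text, label) pairs
pool = [(f"review {i}", i % 3) for i in range(10_000)]
train, val = train_val_split(pool)
print(len(train), len(val))  # 8500 1500
```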

Performance

Metric                 Value
Validation Accuracy    92.4%
Training Accuracy      85.4%
Validation Loss        0.179

Training Details

  • Epochs: 10
  • Batch Size: 16
  • Learning Rate: 2e-5
  • Mixed Precision: FP16
  • Optimizer: AdamW
  • Scheduler: Linear Warmup + Cosine Decay
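
The linear-warmup + cosine-decay schedule listed above can be expressed as a simple function of the training step; a pure-Python sketch (the warmup step count is an assumption, not taken from the training config):

```python
import math

def lr_at_step(step, total_steps, base_lr=2e-5, warmup_steps=500):
    """Linear warmup from 0 to base_lr, then cosine decay to 0."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Peak learning rate is reached exactly at the end of warmup
print(lr_at_step(500, 5000))  # 2e-05
```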

Usage

Option 1: Using AutoModelForSequenceClassification (Recommended)

Load the model with trust_remote_code=True so that Transformers can use the custom model code shipped with this repository:

# Loading a custom model from the Hugging Face Hub requires trust_remote_code=True
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "anpmts/sentiment-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    trust_remote_code=True  # Required for custom models
)

# Prepare input
text = "This product is amazing! Highly recommend."
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=256)

# Get prediction
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    sentiment = torch.argmax(predictions, dim=-1)

# Map to label
labels = ["negative", "neutral", "positive"]
print(f"Sentiment: {labels[sentiment.item()]}")
print(f"Confidence: {predictions[0][sentiment].item():.2%}")

Option 2: Using Pipeline (Easiest)

from transformers import pipeline

# Load sentiment analysis pipeline
classifier = pipeline(
    "text-classification",
    model="anpmts/sentiment-classifier",
    trust_remote_code=True
)

# Predict
result = classifier("This product is amazing! Highly recommend.")
print(result)
# Output: [{'label': 'positive', 'score': 0.96}]

Option 3: Direct Model Loading

from transformers import AutoTokenizer
import torch

# You need to have the model code available locally
from src.models import SentimentClassifier

model = SentimentClassifier.from_pretrained("anpmts/sentiment-classifier")
tokenizer = AutoTokenizer.from_pretrained("anpmts/sentiment-classifier")

# Inference
text = "This product is amazing!"
inputs = tokenizer(text, return_tensors="pt", max_length=256, truncation=True, padding=True)
outputs = model(**inputs)
predictions = torch.softmax(outputs["logits"], dim=-1)
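
The softmax call above turns the raw logits into a probability distribution over the three labels; a minimal pure-Python equivalent, for illustration only:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([0.1, 0.2, 3.0])
# The predicted label is the index of the largest probability
print(max(range(len(probs)), key=probs.__getitem__))  # 2
```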

Training Metrics Over Epochs

Epoch   Train Loss   Val Loss   Val Acc
1       0.639        0.613      49.5%
5       0.551        0.455      68.9%
10      0.270        0.179      92.4%

Citation

If you use this model, please cite:

@misc{sentiment-classifier-xlm-roberta,
  author = {TrustShop},
  title = {Sentiment Classifier - XLM-RoBERTa},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/anpmts/sentiment-classifier}
}

License

Apache 2.0