---
language: multilingual
license: apache-2.0
tags:
  - sentiment-analysis
  - text-classification
  - xlm-roberta
  - amazon-reviews
datasets:
  - amazon-reviews
metrics:
  - accuracy
model-index:
  - name: anpmts/sentiment-classifier
    results:
      - task:
          type: text-classification
          name: Sentiment Analysis
        dataset:
          type: amazon-reviews
          name: Amazon Reviews
        metrics:
          - type: accuracy
            value: 0.924
            name: Validation Accuracy
---

Sentiment Classifier - XLM-RoBERTa

This is a sentiment classification model fine-tuned on the Amazon Reviews dataset.

Model Description

  • Base Model: xlm-roberta-base
  • Task: Sentiment Classification (negative/neutral/positive)
  • Architecture: Sequence Classification (single-head)
  • Languages: Multilingual (100+ languages)
  • Parameters: 278M

Training Data

  • Dataset: Amazon Reviews (Kaggle)
  • Training Samples: 8,500
  • Validation Samples: 1,500
  • Test Samples: 5,000
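
The split above corresponds to an 8,500/1,500 train/validation split drawn from a 10,000-example labeled pool, with a separate 5,000-example test set. A minimal standard-library sketch of such a split (the field layout and seed are assumptions, not taken from the training code):

```python
import random

def train_val_split(examples, val_size=1500, seed=42):
    """Shuffle a list of examples and split off a validation set."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    return shuffled[val_size:], shuffled[:val_size]

# Illustrative: 10,000 labeled (text, label) pairs
pool = [(f"review {i}", i % 3) for i in range(10_000)]
train, val = train_val_split(pool)
print(len(train), len(val))  # 8500 1500
```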

Performance

Metric                 Value
Validation Accuracy    92.4%
Training Accuracy      85.4%
Validation Loss        0.179

Training Details

  • Epochs: 10
  • Batch Size: 16
  • Learning Rate: 2e-5
  • Mixed Precision: FP16
  • Optimizer: AdamW
  • Scheduler: Linear Warmup + Cosine Decay
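
The linear-warmup + cosine-decay schedule listed above can be expressed as a simple function of the training step; a pure-Python sketch (the warmup step count is an assumption, not taken from the training config):

```python
import math

def lr_at_step(step, total_steps, base_lr=2e-5, warmup_steps=500):
    """Linear warmup from 0 to base_lr, then cosine decay to 0."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Peak learning rate is reached exactly at the end of warmup
print(lr_at_step(500, 5000))  # 2e-05
```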

Usage

Option 1: Using AutoModelForSequenceClassification (Recommended)

Load the model with trust_remote_code=True so that Transformers can use the custom model code shipped with this repository:

# Loading a custom model from the Hugging Face Hub requires trust_remote_code=True
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "anpmts/sentiment-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    trust_remote_code=True  # Required for custom models
)

# Prepare input
text = "This product is amazing! Highly recommend."
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=256)

# Get prediction
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    sentiment = torch.argmax(predictions, dim=-1)

# Map to label
labels = ["negative", "neutral", "positive"]
print(f"Sentiment: {labels[sentiment.item()]}")
print(f"Confidence: {predictions[0][sentiment].item():.2%}")

Option 2: Using Pipeline (Easiest)

from transformers import pipeline

# Load sentiment analysis pipeline
classifier = pipeline(
    "text-classification",
    model="anpmts/sentiment-classifier",
    trust_remote_code=True
)

# Predict
result = classifier("This product is amazing! Highly recommend.")
print(result)
# Output: [{'label': 'positive', 'score': 0.96}]

Option 3: Direct Model Loading

from transformers import AutoTokenizer
import torch

# You need to have the model code available locally
from src.models import SentimentClassifier

model = SentimentClassifier.from_pretrained("anpmts/sentiment-classifier")
tokenizer = AutoTokenizer.from_pretrained("anpmts/sentiment-classifier")

# Inference
text = "This product is amazing!"
inputs = tokenizer(text, return_tensors="pt", max_length=256, truncation=True, padding=True)
outputs = model(**inputs)
predictions = torch.softmax(outputs["logits"], dim=-1)
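
The softmax call above turns the raw logits into a probability distribution over the three labels; a minimal pure-Python equivalent, for illustration only:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([0.1, 0.2, 3.0])
# The predicted label is the index of the largest probability
print(max(range(len(probs)), key=probs.__getitem__))  # 2
```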

Training Metrics Over Epochs

Epoch   Train Loss   Val Loss   Val Acc
1       0.639        0.613      49.5%
5       0.551        0.455      68.9%
10      0.270        0.179      92.4%

Citation

If you use this model, please cite:

@misc{sentiment-classifier-xlm-roberta,
  author = {TrustShop},
  title = {Sentiment Classifier - XLM-RoBERTa},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/anpmts/sentiment-classifier}
}

License

Apache 2.0