---
language: multilingual
license: apache-2.0
tags:
- sentiment-analysis
- text-classification
- xlm-roberta
- amazon-reviews
datasets:
- amazon-reviews
metrics:
- accuracy
model-index:
- name: anpmts/sentiment-classifier
  results:
  - task:
      type: text-classification
      name: Sentiment Analysis
    dataset:
      type: amazon-reviews
      name: Amazon Reviews
    metrics:
    - type: accuracy
      value: 0.924
      name: Validation Accuracy
---
# Sentiment Classifier - XLM-RoBERTa
This is a sentiment classification model fine-tuned on the Amazon Reviews dataset.
## Model Description
- Base Model: xlm-roberta-base
- Task: Sentiment Classification (negative/neutral/positive)
- Architecture: Sequence Classification (single-head)
- Languages: Multilingual (100+ languages)
- Parameters: 278M
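The three-class output can be made explicit as a label mapping. The id order below is an assumption inferred from the `labels` list in the usage examples further down; verify it against the model's `config.json` before relying on it:

```python
# Assumed label order; matches the `labels` list in the usage examples below.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}

def label_for(class_id: int) -> str:
    """Map a predicted class id to its sentiment label."""
    return ID2LABEL[class_id]
```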
## Training Data
- Dataset: Amazon Reviews (Kaggle)
- Training Samples: 8,500
- Validation Samples: 1,500
- Test Samples: 5,000
## Performance
| Metric | Value |
|---|---|
| Validation Accuracy | 92.4% |
| Training Accuracy | 85.4% |
| Validation Loss | 0.179 |
## Training Details
- Epochs: 10
- Batch Size: 16
- Learning Rate: 2e-5
- Mixed Precision: FP16
- Optimizer: AdamW
- Scheduler: Linear Warmup + Cosine Decay
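Collected in one place, the hyperparameters above look like this. This is a sketch only: the actual training script is not part of this card, and `max_length` is taken from the usage examples rather than stated in the list above:

```python
# Hypothetical training configuration mirroring the values listed above.
TRAIN_CONFIG = {
    "base_model": "xlm-roberta-base",
    "num_labels": 3,            # negative / neutral / positive
    "epochs": 10,
    "batch_size": 16,
    "learning_rate": 2e-5,
    "fp16": True,               # mixed precision
    "optimizer": "adamw",
    "scheduler": "linear_warmup_cosine_decay",
    "max_length": 256,          # truncation length used in the usage examples
}
```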
## Usage

### Option 1: Using AutoModelForSequenceClassification (Recommended)
Loading from the Hugging Face Hub requires passing `trust_remote_code=True`, since this model relies on custom code from the repository:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer
model_name = "anpmts/sentiment-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    trust_remote_code=True,  # required for custom model code
)

# Prepare input
text = "This product is amazing! Highly recommend."
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=256)

# Get prediction
with torch.no_grad():
    outputs = model(**inputs)
predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
sentiment = torch.argmax(predictions, dim=-1)

# Map to label
labels = ["negative", "neutral", "positive"]
print(f"Sentiment: {labels[sentiment.item()]}")
print(f"Confidence: {predictions[0][sentiment.item()].item():.2%}")
```
### Option 2: Using Pipeline (Easiest)

```python
from transformers import pipeline

# Load the sentiment analysis pipeline
classifier = pipeline(
    "text-classification",
    model="anpmts/sentiment-classifier",
    trust_remote_code=True,
)

# Predict
result = classifier("This product is amazing! Highly recommend.")
print(result)
# Output: [{'label': 'positive', 'score': 0.96}]
```
### Option 3: Direct Model Loading

This option requires the model source code to be available locally (e.g. a clone of the training repository):

```python
import torch
from transformers import AutoTokenizer

from src.models import SentimentClassifier  # local model code

model = SentimentClassifier.from_pretrained("anpmts/sentiment-classifier")
tokenizer = AutoTokenizer.from_pretrained("anpmts/sentiment-classifier")

# Inference
text = "This product is amazing!"
inputs = tokenizer(text, return_tensors="pt", max_length=256, truncation=True, padding=True)
with torch.no_grad():
    outputs = model(**inputs)
predictions = torch.softmax(outputs["logits"], dim=-1)
```
## Training Metrics Over Epochs
| Epoch | Train Loss | Val Loss | Val Acc |
|---|---|---|---|
| 1 | 0.639 | 0.613 | 49.5% |
| 5 | 0.551 | 0.455 | 68.9% |
| 10 | 0.270 | 0.179 | 92.4% |
## Citation

If you use this model, please cite:

```bibtex
@misc{sentiment-classifier-xlm-roberta,
  author    = {TrustShop},
  title     = {Sentiment Classifier - XLM-RoBERTa},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/anpmts/sentiment-classifier}
}
```
## License

Apache 2.0