SentimentBERT Overview
SentimentBERT is a BERT-based model fine-tuned for sentiment analysis. It classifies text into three categories: NEGATIVE, NEUTRAL, and POSITIVE, and was trained on a diverse dataset of social media posts, product reviews, and news articles.
Model Architecture
This model is based on the BERT (Bidirectional Encoder Representations from Transformers) architecture. It consists of 12 transformer layers, 12 attention heads, and a hidden size of 768. The model has been fine-tuned on a large dataset of labeled sentiment data spanning multiple domains and languages.
Intended Use
This model is designed to analyze text and determine the sentiment expressed within it. It can be used for:
Social media monitoring and brand reputation management
Customer feedback analysis and sentiment tracking
Market research and consumer opinion analysis
Content moderation and filtering
Financial market sentiment analysis
Limitations
The model may not perform well on domain-specific jargon or slang not present in the training data
It may struggle with detecting sarcasm, irony, and nuanced emotional expressions
Performance may vary across different languages and cultural contexts
The model's performance may degrade on very short texts (fewer than three words) or on extremely long texts that exceed the 512-token input limit
The model may have biases present in the training data
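Given these limitations, downstream applications often add a guard around predictions: very short inputs and low-confidence outputs are flagged for human review instead of being trusted. The sketch below is illustrative, not part of the model; `guarded_sentiment`, `stub_predict`, and the 0.6 threshold are hypothetical names and values, and the stub stands in for the real `predict_sentiment` function shown under Example Code.

```python
def guarded_sentiment(text, predict_fn, min_confidence=0.6, min_words=3):
    """Route short or low-confidence inputs to a review queue.

    predict_fn returns a (label, confidence) pair. The 0.6 confidence
    threshold and 3-word minimum are illustrative defaults, not values
    recommended by the model authors.
    """
    if len(text.split()) < min_words:
        # The model degrades on very short texts, so skip prediction entirely
        return ("NEEDS_REVIEW", 0.0)
    label, confidence = predict_fn(text)
    if confidence < min_confidence:
        # Keep the confidence so a reviewer can see how borderline it was
        return ("NEEDS_REVIEW", confidence)
    return (label, confidence)

# Stubbed predictor for illustration only; a real deployment would pass
# the model-backed predict_sentiment function instead.
def stub_predict(text):
    return ("POSITIVE", 0.95) if "great" in text else ("NEUTRAL", 0.40)

print(guarded_sentiment("ok", stub_predict))
print(guarded_sentiment("this was great overall", stub_predict))
print(guarded_sentiment("the meeting happened today", stub_predict))
```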
Example Code
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("username/sentimentbert")
model = AutoModelForSequenceClassification.from_pretrained("username/sentimentbert")

# Function to predict sentiment
def predict_sentiment(text):
    inputs = tokenizer(text, return_tensors="pt", padding=True,
                       truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(predictions, dim=1).item()
    confidence = predictions[0][predicted_class].item()
    sentiment_map = {0: "NEGATIVE", 1: "NEUTRAL", 2: "POSITIVE"}
    return sentiment_map[predicted_class], confidence

# Example usage
text = "I really enjoyed the movie, the acting was superb!"
sentiment, confidence = predict_sentiment(text)
print(f"Text: {text}")
print(f"Predicted sentiment: {sentiment} (confidence: {confidence:.2f})")
```
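The confidence value in `predict_sentiment` comes from applying softmax to the three class logits, which turns raw scores into probabilities that sum to 1. A minimal standalone illustration of that step, using made-up logits rather than actual model output:

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for [NEGATIVE, NEUTRAL, POSITIVE]
probs = softmax([-1.2, 0.3, 2.9])
print([round(p, 3) for p in probs])
print(sum(probs))  # probabilities sum to 1
```

The largest logit always yields the largest probability, so `argmax` over logits and over softmax outputs agree; softmax is applied only so the top score can be reported as a confidence.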
Training Data
The model was trained on a combination of publicly available sentiment analysis datasets including:
SST-2 (Stanford Sentiment Treebank)
IMDB movie reviews
Amazon product reviews
Twitter sentiment datasets
Customer feedback datasets from various industries
Evaluation Results
The model achieves the following performance metrics on a held-out test set:
Accuracy: 92.3%
F1 Score (macro): 0.91
Precision (macro): 0.92
Recall (macro): 0.91
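The macro-averaged scores above are unweighted means of the per-class metrics, so each of the three sentiment classes counts equally regardless of how often it appears in the test set. A minimal sketch of how macro F1 is computed, using toy labels rather than the actual evaluation data:

```python
def macro_f1(y_true, y_pred, labels=("NEGATIVE", "NEUTRAL", "POSITIVE")):
    """Macro F1: compute F1 per class, then average without weighting."""
    scores = []
    for label in labels:
        # Per-class counts, treating `label` as the positive class
        tp = sum(t == p == label for t, p in zip(y_true, y_pred))
        fp = sum(p == label and t != label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)

# Toy example: one POSITIVE example misclassified as NEGATIVE
y_true = ["POSITIVE", "NEGATIVE", "NEUTRAL", "POSITIVE"]
y_pred = ["POSITIVE", "NEGATIVE", "NEUTRAL", "NEGATIVE"]
print(round(macro_f1(y_true, y_pred), 3))  # → 0.778
```

In practice this is usually delegated to a library such as scikit-learn (`f1_score(..., average="macro")`); the hand-rolled version here only makes the averaging explicit.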