# SentimentBERT

## Overview
SentimentBERT is a BERT-based model fine-tuned for sentiment analysis. It classifies text into three categories: NEGATIVE, NEUTRAL, and POSITIVE. The model has been trained on a diverse dataset of social media posts, product reviews, and news articles.
## Model Architecture
This model is based on the BERT (Bidirectional Encoder Representations from Transformers) architecture. It consists of 12 transformer layers, 12 attention heads, and a hidden size of 768. The model has been fine-tuned on a large dataset of labeled sentiment data spanning multiple domains and languages.
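These dimensions match standard BERT-base. As a rough sanity check, the per-head dimension and an approximate parameter count can be derived directly from them; the sketch below assumes the standard BERT-base uncased vocabulary of 30,522 WordPiece tokens and 512 positions, which are not stated in this card:

```python
# Dimensions stated above (BERT-base)
hidden_size = 768
num_heads = 12
num_layers = 12

# Each attention head operates on an equal slice of the hidden dimension.
head_dim = hidden_size // num_heads
print(head_dim)  # 64

# Rough encoder parameter count (assumed vocab/position sizes, no pooler).
vocab_size = 30522
max_positions = 512
ffn_size = 4 * hidden_size

embeddings = (vocab_size + max_positions + 2) * hidden_size  # word + position + segment
per_layer = (
    4 * (hidden_size * hidden_size + hidden_size)            # Q, K, V, output projections
    + 2 * (hidden_size * ffn_size) + ffn_size + hidden_size  # feed-forward up/down
    + 4 * hidden_size                                        # two layer norms (scale + bias)
)
total = embeddings + num_layers * per_layer
print(f"{total / 1e6:.0f}M parameters")
```

This lands in the ~110M-parameter range commonly quoted for BERT-base.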
## Intended Use

This model is designed to analyze text and determine the sentiment expressed within it. It can be used for:

- Social media monitoring and brand reputation management
- Customer feedback analysis and sentiment tracking
- Market research and consumer opinion analysis
- Content moderation and filtering
- Financial market sentiment analysis
## Limitations

- The model may not perform well on domain-specific jargon or slang not present in the training data
- It may struggle with detecting sarcasm, irony, and nuanced emotional expressions
- Performance may vary across different languages and cultural contexts
- The model's performance may degrade on very short texts (fewer than 3 words) or extremely long texts
- The model may reflect biases present in the training data
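Given the short-text limitation above, callers may want to screen inputs before sending them to the model. The `is_reliable_input` helper below is a hypothetical illustration (not part of the model package), with the length thresholds chosen for demonstration:

```python
def is_reliable_input(text: str, min_words: int = 3, max_chars: int = 2000) -> bool:
    """Flag texts likely to fall outside the model's reliable input range."""
    words = text.split()
    return len(words) >= min_words and len(text) <= max_chars

print(is_reliable_input("Great!"))                      # False: fewer than 3 words
print(is_reliable_input("The service was excellent"))   # True
```

Texts that fail the check could be routed to a fallback (e.g. a NEUTRAL default or human review) rather than trusted to the classifier.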
## Example Code

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("username/sentimentbert")
model = AutoModelForSequenceClassification.from_pretrained("username/sentimentbert")

# Function to predict sentiment
def predict_sentiment(text):
    inputs = tokenizer(text, return_tensors="pt", padding=True,
                       truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(predictions, dim=1).item()
    confidence = predictions[0][predicted_class].item()
    sentiment_map = {0: "NEGATIVE", 1: "NEUTRAL", 2: "POSITIVE"}
    return sentiment_map[predicted_class], confidence

# Example usage
text = "I really enjoyed the movie, the acting was superb!"
sentiment, confidence = predict_sentiment(text)
print(f"Text: {text}")
print(f"Predicted sentiment: {sentiment} (confidence: {confidence:.2f})")
```
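The softmax/argmax step in `predict_sentiment` can be illustrated without loading the model itself. The logits below are made-up values, purely to show how the label and confidence are derived from raw outputs:

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    exps = [math.exp(x - max(logits)) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

sentiment_map = {0: "NEGATIVE", 1: "NEUTRAL", 2: "POSITIVE"}

logits = [-1.2, 0.3, 2.8]  # hypothetical raw model outputs for one input
probs = softmax(logits)
predicted = max(range(len(probs)), key=probs.__getitem__)
print(sentiment_map[predicted], round(probs[predicted], 2))  # POSITIVE 0.91
```

The confidence is simply the softmax probability of the winning class, so it always lies in (0, 1] and the three probabilities sum to 1.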
## Training Data

The model was trained on a combination of publicly available sentiment analysis datasets, including:
- SST-2 (Stanford Sentiment Treebank)
- IMDB movie reviews
- Amazon product reviews
- Twitter sentiment datasets
- Customer feedback datasets from various industries
## Evaluation Results

The model achieves the following performance metrics on a held-out test set:
| Metric            | Score |
|-------------------|-------|
| Accuracy          | 92.3% |
| F1 Score (macro)  | 0.91  |
| Precision (macro) | 0.92  |
| Recall (macro)    | 0.91  |
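Macro averaging weights the three classes equally regardless of how often each occurs in the test set. A minimal sketch of how macro F1 follows from per-class precision and recall; the per-class values here are invented for illustration and are not the model's actual per-class scores:

```python
# Hypothetical per-class (precision, recall) pairs for NEGATIVE, NEUTRAL, POSITIVE
per_class = [(0.94, 0.92), (0.86, 0.88), (0.95, 0.93)]

# F1 is the harmonic mean of precision and recall, computed per class.
f1_scores = [2 * p * r / (p + r) for p, r in per_class]

# Macro F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1_scores) / len(f1_scores)
print(round(macro_f1, 2))  # 0.91
```

Because the mean is unweighted, a weak minority class (here the NEUTRAL pair) pulls the macro score down even if the dataset is dominated by the other classes.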