---
license: mit
datasets:
- stanfordnlp/imdb
language:
- en
metrics:
- accuracy
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
library_name: transformers
tags:
- code
- sentiment-analysis
- bert
- imdb
- text-classification
- nlp
---
# BERT IMDb Sentiment Analysis Model

This repository contains a BERT model fine-tuned for sentiment analysis on IMDb movie reviews. The model classifies a review as expressing either **Positive** or **Negative** sentiment.

## Live Demo

Try the model in your browser: https://huggingface.co/spaces/philipobiorah/bert-sentiment-analysis
## Model Details

- **Base Model**: `bert-base-uncased`
- **Fine-tuning Dataset**: IMDb Movie Reviews
- **Task**: Sentiment Analysis (Binary Classification)
- **Labels** (see the quick-start snippet below):
  - `0`: Negative
  - `1`: Positive
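
For a quick sanity check of the label mapping, the checkpoint can also be driven through the `pipeline` API. The snippet below is a minimal sketch rather than part of the original card: the label strings it prints come from the checkpoint's `id2label` config and may show up as `LABEL_0` / `LABEL_1` (Negative / Positive) rather than human-readable names.

```python
from transformers import pipeline

# Minimal quick-start sketch (assumption): load the fine-tuned checkpoint
# via the pipeline API. The printed label strings depend on the checkpoint's
# id2label config and may appear as LABEL_0 (Negative) / LABEL_1 (Positive).
classifier = pipeline(
    "text-classification",
    model="philipobiorah/bert-imdb-model",
    tokenizer="bert-base-uncased",  # explicit tokenizer, in case the repo ships no tokenizer files
)

print(classifier("A beautifully shot film with a genuinely moving story."))
```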
<!-- ## Evaluation

| **Model**                         | **SST-2 Accuracy** | **Yelp Accuracy** | **Amazon Accuracy** | **IMDB Accuracy** |
|-----------------------------------|--------------------|-------------------|---------------------|-------------------|
| **philipobiorah/bert-imdb-model** | 0.89               | 0.89              | 0.89                | 0.96              |
| **DistilBERT-SST-2**              | 0.94               | 0.85              | 0.85                | 0.89              |
| **RoBERTa-Sentiment**             | 0.40               | 0.42              | 0.47                | 0.79              |
| **Logistic Regression**           | 0.83               | 0.91              | 0.86                | 0.85              |
| **Naive Bayes**                   | 0.77               | 0.86              | 0.84                | 0.85              |
-->
## Usage

### Load the Model in Python

```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch

model_name = "philipobiorah/bert-imdb-model"

# Load the tokenizer and the fine-tuned model
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(model_name)

# Predict the sentiment of a single text, with a confidence score
def predict_sentiment(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)

    with torch.no_grad():
        logits = model(**inputs).logits

    # Convert logits to probabilities
    probabilities = torch.nn.functional.softmax(logits, dim=1)[0]

    # Get the predicted class (0 = Negative, 1 = Positive)
    sentiment_idx = probabilities.argmax().item()
    confidence = probabilities[sentiment_idx].item() * 100  # as a percentage

    sentiment_label = "Positive" if sentiment_idx == 1 else "Negative"

    return {"sentiment": sentiment_label, "confidence": round(confidence, 2)}

# Test the model
result1 = predict_sentiment("This movie was absolutely fantastic!")
result2 = predict_sentiment("I really disliked this movie, it was terrible.")

print(f"Sentiment: {result1['sentiment']}, Confidence: {result1['confidence']}%")
print(f"Sentiment: {result2['sentiment']}, Confidence: {result2['confidence']}%")
```
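
To score many reviews at once, the same objects can be reused in batches. The sketch below is an illustrative extension built on the `tokenizer` and `model` defined above, not an API of this repository; it pads a list of texts and runs a single forward pass.

```python
# Sketch (assumption): batch variant of predict_sentiment, reusing the
# tokenizer and model loaded in the snippet above.
def predict_sentiment_batch(texts):
    inputs = tokenizer(texts, return_tensors="pt", truncation=True, padding=True, max_length=512)

    with torch.no_grad():
        logits = model(**inputs).logits

    probabilities = torch.nn.functional.softmax(logits, dim=1)
    predicted = probabilities.argmax(dim=1)

    return [
        {
            "sentiment": "Positive" if idx == 1 else "Negative",
            "confidence": round(probabilities[i, idx].item() * 100, 2),
        }
        for i, idx in enumerate(predicted.tolist())
    ]

print(predict_sentiment_batch([
    "An instant classic, I loved every minute of it.",
    "Two hours of my life I will never get back.",
]))
```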