---
language: en
tags:
- sentiment-analysis
- text-classification
- transformers
- distilbert
datasets:
- lakshmi25npathi/imdb-dataset-of-50k-movie-reviews
model-index:
- name: DistilBERT Sentiment Classifier
  results:
  - task:
      type: text-classification
      name: Sentiment Analysis
    dataset:
      name: IMDB Dataset of 50K Movie Reviews
      type: text
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.93
    - name: F1
      type: f1
      value: 0.93
    - name: Precision
      type: precision
      value: 0.93
    - name: Recall
      type: recall
      value: 0.93
license: apache-2.0
metrics:
- accuracy
- precision
- recall
---
# DistilBERT Sentiment Classifier
## Model Details
- Model Type: Transformer-based classifier (DistilBERT)
- Base Model: distilbert-base-uncased
- Language: English
- Task: Sentiment Analysis (binary classification)
- Labels: 0 → Negative, 1 → Positive
- Framework: Hugging Face Transformers
## Intended Uses & Limitations
#### Intended Use:
- Sentiment classification of English reviews, comments, or feedback.
#### Not Intended Use:
- Other languages.
- Multi-class or multi-label sentiment tasks (e.g. neutral or mixed sentiment).
#### ⚠️ Limitations:
- May not generalize well outside movie/review-style data.
- Training data may contain cultural and linguistic bias.
## Training Dataset
- Source: Kaggle Cleaned IMDB Reviews Dataset
- Size: ~50,000 reviews
- Classes: positive, negative
- Converted to integers: positive → 1, negative → 0
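
The label conversion listed above can be sketched in plain Python. The card does not show its preprocessing code, so the column names `review` and `sentiment`, the helper `encode_labels`, and the sample rows are assumptions made for illustration only.

```python
# Mapping stated in the card: positive -> 1, negative -> 0.
LABEL2ID = {"negative": 0, "positive": 1}


def encode_labels(rows):
    """Return (texts, labels) with string sentiments converted to integers.

    `rows` is an iterable of dicts with the hypothetical keys
    "review" and "sentiment".
    """
    texts, labels = [], []
    for row in rows:
        sentiment = row["sentiment"].strip().lower()
        if sentiment not in LABEL2ID:
            raise ValueError(f"unexpected sentiment: {row['sentiment']!r}")
        texts.append(row["review"])
        labels.append(LABEL2ID[sentiment])
    return texts, labels


# Made-up rows, for illustration only.
sample = [
    {"review": "A wonderful film.", "sentiment": "positive"},
    {"review": "Dull and far too long.", "sentiment": "negative"},
]
texts, labels = encode_labels(sample)
print(labels)  # [1, 0]
```

Raising on an unknown sentiment string, rather than silently skipping the row, makes mislabeled or malformed rows visible before training.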
## Training Procedure
- Epochs: 3
- Batch Size: 16
- Optimizer: AdamW
- Learning Rate: 5e-5
- Framework: Hugging Face Trainer API
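
As a rough sketch, the hyperparameters above might map onto `transformers.TrainingArguments` as follows. The original training script is not part of this card, so the output directory, treating the batch size as per-device, and the choice of `"adamw_torch"` as the AdamW variant are all assumptions.

```python
# Hyperparameters as listed in the card. Interpreting "Batch Size: 16" as a
# per-device batch size and AdamW as the PyTorch implementation is an assumption.
HPARAMS = {
    "num_train_epochs": 3,
    "per_device_train_batch_size": 16,
    "learning_rate": 5e-5,
    "optim": "adamw_torch",
}


def build_training_args(output_dir="./distilbert-sentiment"):
    """Build TrainingArguments from HPARAMS (output_dir is a hypothetical path)."""
    # Imported lazily so the hyperparameters can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HPARAMS)


print(HPARAMS)
```

The returned object would then be passed to `Trainer(model=..., args=..., train_dataset=..., eval_dataset=...)` together with the tokenized data.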
## Evaluation
The model was tested on a held-out validation set of 9,917 reviews.

| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| Negative (0) | 0.93 | 0.93 | 0.93 | 4,939 |
| Positive (1) | 0.93 | 0.93 | 0.93 | 4,978 |
#### Overall:
- Accuracy: 93%
- Macro Avg F1: 0.93
- Weighted Avg F1: 0.93
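
To show how the macro and weighted averages relate to the per-class figures, here is a small sketch that recomputes them from the reported precision, recall, and support. It starts from the rounded values in the card, so it only reproduces the results to two decimal places; the variable and function names are illustrative.

```python
# Per-class precision, recall, and support as reported in the evaluation results.
report = {
    "Negative (0)": {"precision": 0.93, "recall": 0.93, "support": 4939},
    "Positive (1)": {"precision": 0.93, "recall": 0.93, "support": 4978},
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


f1_scores = {name: f1(r["precision"], r["recall"]) for name, r in report.items()}
total_support = sum(r["support"] for r in report.values())

# Macro average: unweighted mean over classes.
macro_f1 = sum(f1_scores.values()) / len(f1_scores)
# Weighted average: each class weighted by its share of the examples.
weighted_f1 = sum(f1_scores[n] * report[n]["support"] for n in report) / total_support

print(total_support)          # 9917
print(round(macro_f1, 2))     # 0.93
print(round(weighted_f1, 2))  # 0.93
```

Because the two classes are nearly balanced (4,939 vs. 4,978), the macro and weighted averages coincide here.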
## How to Use
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
model_name = "YamenRM/distilbert-sentiment-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Wrap both in a text-classification pipeline and classify a single review
nlp = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(nlp("I really loved this movie, it was amazing!"))
```
```
# [{'label': 'POSITIVE', 'score': 0.98}]
```
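
If downstream code needs the integer labels (0 or 1) instead of label strings, the pipeline output can be post-processed along these lines. Only `'POSITIVE'` appears in the example output; `'NEGATIVE'` and the generic `'LABEL_0'`/`'LABEL_1'` names are assumptions about what the model config may emit, and `to_class_ids` is a hypothetical helper.

```python
# 'POSITIVE' is taken from the example output; the other keys are assumed label names.
LABEL_TO_ID = {"NEGATIVE": 0, "POSITIVE": 1, "LABEL_0": 0, "LABEL_1": 1}


def to_class_ids(predictions):
    """Convert pipeline-style prediction dicts into integer labels (0 or 1)."""
    return [LABEL_TO_ID[p["label"].upper()] for p in predictions]


# Hand-written predictions in the same shape as the pipeline output;
# no model is loaded here and the scores are illustrative.
example = [
    {"label": "POSITIVE", "score": 0.98},
    {"label": "NEGATIVE", "score": 0.91},
]
print(to_class_ids(example))  # [1, 0]
```

An unknown label raises `KeyError`, which is a reasonable signal to check `model.config.id2label` for the names the model actually uses.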