---
language: en
tags:
- sentiment-analysis
- text-classification
- transformers
- distilbert
datasets:
- lakshmi25npathi/imdb-dataset-of-50k-movie-reviews
model-index:
- name: DistilBERT Sentiment Classifier
  results:
  - task:
      type: text-classification
      name: Sentiment Analysis
    dataset:
      name: IMDB Dataset of 50K Movie Reviews
      type: text
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.93
    - name: F1
      type: f1
      value: 0.93
    - name: Precision
      type: precision
      value: 0.93
    - name: Recall
      type: recall
      value: 0.93
license: apache-2.0
metrics:
- accuracy
- precision
- recall
---


# DistilBERT Sentiment Classifier
## Model Details

 - Model Type: Transformer-based classifier (DistilBERT)

 - Base Model: distilbert-base-uncased

 - Language: English

 - Task: Sentiment Analysis (binary classification)

 - Framework: Hugging Face Transformers

**Labels:**

 - 0 → Negative

 - 1 → Positive
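The label mapping above can be applied to raw classifier outputs. The following is a minimal sketch, assuming the model returns two logits ordered by class id; the helper name `decode_logits` and the example logits are hypothetical and only illustrate the idea.

```python
import math

# Label mapping as listed in this card.
ID2LABEL = {0: "Negative", 1: "Positive"}


def decode_logits(logits):
    """Turn a pair of raw class scores into (label, probability) via softmax."""
    # Subtract the max for numerical stability before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # The predicted class is the index with the highest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits, chosen only for illustration.
label, prob = decode_logits([-1.2, 2.3])
print(label, round(prob, 3))
```

The probability returned here corresponds to the `score` field that a text-classification pipeline typically reports for the winning class.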

## Intended Uses & Limitations

#### Intended Use:

 - Sentiment classification of English reviews, comments, or feedback.

#### Not Intended Use:

 - Text in other languages.

 - Sentiment tasks with more than two classes (e.g., neutral or mixed).

#### ⚠️ Limitations:

 - May not generalize well outside movie/review-style data.

 - Training data may contain cultural and linguistic bias.

## Training Dataset

 - Source: IMDB Dataset of 50K Movie Reviews from Kaggle (`lakshmi25npathi/imdb-dataset-of-50k-movie-reviews`), cleaned

 - Size: ~50,000 reviews

 - Classes: positive, negative

 - Converted to integers: positive → 1, negative → 0
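The label conversion described above could look roughly like the sketch below. The column names (`review`, `sentiment`) and the helper `encode_sentiment` are assumptions for illustration; the card does not state how the conversion was actually implemented.

```python
# Mapping stated in this card: positive -> 1, negative -> 0.
LABEL2ID = {"negative": 0, "positive": 1}


def encode_sentiment(label: str) -> int:
    """Convert a sentiment string to its integer class id."""
    key = label.strip().lower()
    if key not in LABEL2ID:
        raise ValueError(f"Unexpected sentiment label: {label!r}")
    return LABEL2ID[key]


# Hypothetical rows; the real column names in the dataset may differ.
rows = [
    {"review": "A wonderful little film.", "sentiment": "positive"},
    {"review": "Dull and far too long.", "sentiment": "negative"},
]
encoded = [{"text": r["review"], "label": encode_sentiment(r["sentiment"])} for r in rows]
print(encoded)
```

Raising on unknown labels, rather than silently defaulting, makes stray values in the source data visible before training.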

## Training Procedure

 - Epochs: 3

 - Batch Size: 16

 - Optimizer: AdamW

 - Learning Rate: 5e-5

 - Framework: Hugging Face Trainer API
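The settings above might map onto the Trainer API as in the following sketch. It assumes the listed batch size is the per-device training batch size and uses a placeholder `output_dir`; the actual training script is not part of this card, so treat this as an approximation.

```python
# Hyperparameters as listed in this card.
HYPERPARAMS = {
    "num_train_epochs": 3,
    "per_device_train_batch_size": 16,  # assumption: "Batch Size" is per device
    "learning_rate": 5e-5,
}


def build_training_args(output_dir="./results"):
    """Create TrainingArguments from the values above (output_dir is a placeholder)."""
    # Imported lazily so the settings can be inspected without transformers installed.
    from transformers import TrainingArguments

    # Trainer uses an AdamW optimizer by default, so none is set explicitly.
    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

The resulting object would be passed to `Trainer` together with the model and the tokenized train and validation splits.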

## Evaluation

The model was evaluated on a held-out validation set of 9,917 reviews.

| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| Negative (0) | 0.93 | 0.93 | 0.93 | 4,939 |
| Positive (1) | 0.93 | 0.93 | 0.93 | 4,978 |

#### Overall

 - Accuracy: 93%

 - Macro Avg F1: 0.93

 - Weighted Avg F1: 0.93
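As a quick consistency check, the macro and weighted averages can be recomputed from the per-class figures reported above. This sketch uses only the rounded values from this card, so it reproduces the averages to two decimal places rather than the exact underlying scores.

```python
# Per-class F1 and support, taken from the evaluation table in this card.
per_class = {
    "Negative (0)": {"f1": 0.93, "support": 4939},
    "Positive (1)": {"f1": 0.93, "support": 4978},
}

total_support = sum(c["support"] for c in per_class.values())

# Macro average: unweighted mean of the per-class scores.
macro_f1 = sum(c["f1"] for c in per_class.values()) / len(per_class)

# Weighted average: each class score weighted by its share of the support.
weighted_f1 = sum(c["f1"] * c["support"] for c in per_class.values()) / total_support

print(total_support, round(macro_f1, 2), round(weighted_f1, 2))
# 9917 0.93 0.93
```

Because the two classes are nearly balanced and have the same rounded F1, the macro and weighted averages coincide.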


## How to Use
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Hub repository that hosts the fine-tuned weights and tokenizer
model_name = "YamenRM/distilbert-sentiment-classifier"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Wrap the model and tokenizer in a text-classification pipeline
nlp = pipeline("text-classification", model=model, tokenizer=tokenizer)

print(nlp("I really loved this movie, it was amazing!"))
```

Example output (the exact score may vary):

```
# [{'label': 'POSITIVE', 'score': 0.98}]
```