YamenRM
/

Sentiment_model

	---
	language: en
	tags:
	- sentiment-analysis
	- text-classification
	- transformers
	- distilbert
	datasets:
	- lakshmi25npathi/imdb-dataset-of-50k-movie-reviews
	model-index:
	- name: DistilBERT Sentiment Classifier
	  results:
	  - task:
	      type: text-classification
	      name: Sentiment Analysis
	    dataset:
	      name: IMDB Dataset of 50K Movie Reviews
	      type: text
	    metrics:
	    - name: Accuracy
	      type: accuracy
	      value: 0.93
	    - name: F1
	      type: f1
	      value: 0.93
	    - name: Precision
	      type: precision
	      value: 0.93
	    - name: Recall
	      type: recall
	      value: 0.93
	license: apache-2.0
	metrics:
	- accuracy
	- precision
	- recall
	---


	# DistilBERT Sentiment Classifier
	## Model Details

	- Model Type: Transformer-based classifier (DistilBERT)

	- Base Model: distilbert-base-uncased

	- Language: English

	- Task: Sentiment Analysis (binary classification)

	- Labels: 0 → Negative, 1 → Positive

	- Framework: Hugging Face Transformers

	## Intended Uses & Limitations

	#### Intended Use:

	- Sentiment classification of English reviews, comments, or feedback.

	#### Not Intended Use:

	- Text in languages other than English.

	- Sentiment tasks with more than two classes (for example, neutral or mixed).

	#### ⚠️ Limitations:

	- May not generalize well outside movie/review-style data.

	- Training data may contain cultural and linguistic bias.

	## Training Dataset

	- Source: Cleaned IMDB reviews dataset from Kaggle (`lakshmi25npathi/imdb-dataset-of-50k-movie-reviews`)

	- Size: ~50,000 reviews

	- Classes: positive, negative

	- Label encoding: positive → 1, negative → 0
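
The label conversion above can be sketched in a few lines. This is a minimal illustration, not the author's preprocessing script: the `LABEL2ID` name, the `encode_labels` helper, and the sample values are hypothetical.

```python
# Hypothetical mapping that mirrors the encoding stated in this card.
LABEL2ID = {"negative": 0, "positive": 1}


def encode_labels(sentiments):
    """Convert string sentiments to integer class IDs, failing on unknown values."""
    encoded = []
    for sentiment in sentiments:
        key = sentiment.strip().lower()
        if key not in LABEL2ID:
            raise ValueError(f"Unexpected sentiment label: {sentiment!r}")
        encoded.append(LABEL2ID[key])
    return encoded


# Made-up sample values, not rows from the dataset.
sample = ["positive", "negative", "Positive "]
print(encode_labels(sample))  # [1, 0, 1]
```

Raising on unknown values, rather than silently skipping them, makes stray labels such as "neutral" visible before training starts.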

	## Training Procedure

	- Epochs: 3

	- Batch Size: 16

	- Optimizer: AdamW

	- Learning Rate: 5e-5

	- Framework: Hugging Face Trainer API
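
For readers who want to reproduce a similar setup, the hyperparameters above might map onto `TrainingArguments` roughly as follows. This is a sketch under assumptions, not the author's training script: the card does not say whether the batch size is per device, so `per_device_train_batch_size` is a guess, `output_dir` is a placeholder, and the optimizer is left at the Trainer default, which is an AdamW variant.

```python
# Hyperparameters as listed in this card; the argument names are assumptions
# about how they map onto transformers.TrainingArguments.
HYPERPARAMS = {
    "num_train_epochs": 3,
    "per_device_train_batch_size": 16,  # assumed to be per device
    "learning_rate": 5e-5,
}


def build_training_args(output_dir="./results"):
    """Build TrainingArguments from the values above.

    transformers is imported lazily so the hyperparameter dict can be
    inspected without the library installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)


if __name__ == "__main__":
    for name, value in HYPERPARAMS.items():
        print(f"{name} = {value}")
```

The resulting object would be passed to `Trainer` together with the model and the tokenized train and validation splits.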

	## Evaluation

	The model was evaluated on a held-out validation set of 9,917 reviews.

	| Class | Precision | Recall | F1-score | Support |
	|-------|-----------|--------|----------|---------|
	| Negative (0) | 0.93 | 0.93 | 0.93 | 4,939 |
	| Positive (1) | 0.93 | 0.93 | 0.93 | 4,978 |

	#### Overall

	- Accuracy: 93%

	- Macro Avg F1: 0.93

	- Weighted Avg F1: 0.93
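
As a consistency check, the macro and weighted averages can be recomputed from the per-class figures reported above. The sketch below uses only the rounded values from this card, so it confirms the arithmetic and not the underlying predictions; the variable names are illustrative.

```python
# Per-class F1 and support copied from the evaluation figures in this card.
per_class = {
    "negative": {"f1": 0.93, "support": 4939},
    "positive": {"f1": 0.93, "support": 4978},
}

total_support = sum(row["support"] for row in per_class.values())

# Macro average: unweighted mean of the per-class scores.
macro_f1 = sum(row["f1"] for row in per_class.values()) / len(per_class)

# Weighted average: each class score is weighted by its share of the examples.
weighted_f1 = (
    sum(row["f1"] * row["support"] for row in per_class.values()) / total_support
)

print(total_support)          # 9917
print(round(macro_f1, 2))     # 0.93
print(round(weighted_f1, 2))  # 0.93
```

The two averages coincide here because both classes have the same F1 and nearly equal support.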


	## How to Use
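	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

	# Hub repository ID as given in this card. This README is hosted under
	# YamenRM/Sentiment_model, so adjust the ID if it does not resolve.
	model_name = "YamenRM/distilbert-sentiment-classifier"

	# Load the tokenizer and the fine-tuned classification head.
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForSequenceClassification.from_pretrained(model_name)

	# Wrap both in a text-classification pipeline for single-call inference.
	nlp = pipeline("text-classification", model=model, tokenizer=tokenizer)

	print(nlp("I really loved this movie, it was amazing!"))
	```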
	```
	# [{'label': 'POSITIVE', 'score': 0.98}]
	```
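
Downstream code often needs the integer class rather than the label string. The helper below is a sketch that converts a pipeline result into the 0/1 IDs from this card. The function name `to_class_id` is hypothetical, and the accepted label spellings are assumptions: the example output above shows `POSITIVE`, while checkpoints without custom label names commonly report `LABEL_0` and `LABEL_1`.

```python
# Assumed label spellings mapped to the class IDs listed in this card.
_LABEL_TO_ID = {"NEGATIVE": 0, "LABEL_0": 0, "POSITIVE": 1, "LABEL_1": 1}


def to_class_id(prediction):
    """Map one pipeline prediction dict to a (class_id, score) tuple."""
    label = prediction["label"].upper()
    if label not in _LABEL_TO_ID:
        raise ValueError(f"Unknown label: {prediction['label']!r}")
    return _LABEL_TO_ID[label], prediction["score"]


# Stand-in for pipeline output, using the example result shown in this card.
example_output = [{"label": "POSITIVE", "score": 0.98}]
print([to_class_id(p) for p in example_output])  # [(1, 0.98)]
```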