---
language: en
tags:
- sentiment-analysis
- text-classification
- roberta
- imdb
- pytorch
- transformers
datasets:
- imdb
metrics:
- accuracy
- f1
model-index:
- name: nkadoor/sentiment-classifier-roberta
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: imdb
      type: imdb
    metrics:
    - type: accuracy
      value: 0.9590
    - type: f1
      value: 0.9791
---

# Fine-tuned Sentiment Classification Model

This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) for sentiment analysis on movie reviews.

## Model Details

- **Model type:** Text Classification (Sentiment Analysis)
- **Base model:** roberta-base
- **Language:** English
- **Task:** Binary sentiment classification (positive/negative)
- **Training dataset:** IMDB Movie Reviews Dataset
- **Training samples:** 5,000
- **Validation samples:** 1,000
- **Test samples:** 1,000

## Performance

| Metric | Value |
|--------|-------|
| Test Accuracy | 0.9590 |
| Test F1 Score | 0.9791 |
| Test Precision | 1.0000 |
| Test Recall | 0.9590 |
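
As a sanity check, the reported F1 score follows from the precision and recall above via the standard harmonic-mean formula:

```python
# Verify that the reported F1 is the harmonic mean of precision and recall.
precision = 1.0000
recall = 0.9590

f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.4f}")  # F1 = 0.9791
```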

## Training Details

| Parameter | Value |
|-----------|-------|
| Training epochs | 3 |
| Batch size | 16 |
| Learning rate | 5e-05 |
| Warmup steps | 500 |
| Weight decay | 0.01 |
| Max sequence length | 512 |

## Usage

### Quick Start

```python
from transformers import pipeline

# Using pipeline (recommended for quick inference)
classifier = pipeline(
    "sentiment-analysis",
    model="nkadoor/sentiment-classifier-roberta",
    tokenizer="nkadoor/sentiment-classifier-roberta",
)

result = classifier("This movie was amazing!")
print(result)  # [{'label': 'POSITIVE', 'score': 0.99}]
```

### Manual Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("nkadoor/sentiment-classifier-roberta")
model = AutoModelForSequenceClassification.from_pretrained("nkadoor/sentiment-classifier-roberta")

def predict_sentiment(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)
    
    with torch.no_grad():
        outputs = model(**inputs)
        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
        predicted_class = torch.argmax(predictions, dim=-1).item()
        confidence = predictions[0][predicted_class].item()
    
    sentiment = "positive" if predicted_class == 1 else "negative"
    return sentiment, confidence

# Example usage
text = "This movie was absolutely fantastic!"
sentiment, confidence = predict_sentiment(text)
print(f"Sentiment: {sentiment} (Confidence: {confidence:.4f})")
```

## Dataset

The model was trained on the [IMDB Movie Reviews Dataset](https://huggingface.co/datasets/imdb), which contains movie reviews labeled as positive or negative. The full dataset consists of:

- 25,000 training reviews
- 25,000 test reviews
- Balanced distribution of positive and negative sentiments
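
The 5,000/1,000/1,000 split used for this model is a subsample of the full dataset; the exact sampling procedure is not documented. A minimal sketch of one way to draw such a split deterministically (the seed and shuffling method here are assumptions):

```python
import random

def subsample_split(n_total=25_000, n_train=5_000, n_val=1_000, n_test=1_000, seed=42):
    """Draw disjoint train/val/test index sets from a pool of n_total examples."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:n_train + n_val + n_test]
    return train, val, test

train_idx, val_idx, test_idx = subsample_split()
print(len(train_idx), len(val_idx), len(test_idx))  # 5000 1000 1000
```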

## Intended Use

This model is intended for sentiment analysis of English movie reviews or similar text. It can be used to:

- Analyze sentiment in movie reviews
- Classify text as positive or negative
- Build sentiment analysis applications
- Research in sentiment analysis

## Limitations

- Trained specifically on movie reviews, so it may not generalize well to other domains
- Limited to English language
- Binary classification only (positive/negative)
- May reflect biases present in the training data

## Citation

If you use this model, please cite:

```bibtex
@misc{sentiment-classifier-roberta,
  title={Fine-tuned RoBERTa for Sentiment Analysis},
  author={Narayana Kadoor},
  year={2025},
  url={https://huggingface.co/nkadoor/sentiment-classifier-roberta}
}
```

## Training Logs

Final training metrics:
- Final training loss: N/A
- Best validation F1: 0.9791
- Epochs completed: 3

---

*Model trained using Transformers library by Hugging Face*