---
license: mit
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
tags:
- code
---

# Model Card for Sentiment Analysis on Primate Dataset

This model card describes a sentiment analysis model trained on a dataset of posts related to primates. The model uses a transformer architecture to predict a sentiment label for input text.

## Model Details

### Model Description

The model classifies text into one of three sentiment categories: negative, neutral, or positive. It is a transformer fine-tuned for sequence classification.

- **Developed by:** Jaskaran Singh
- **Model type:** Transformer-based sentiment analysis model
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** Transformer-based pre-trained model

### Model Sources

- **Repository:** https://github.com/JaskaranSingh-01/Sentiment_Analyzer
- **Demo:** https://sentimentanalyzer-f76oxwautwypxpea4lj3wg.streamlit.app/

## Uses

### Direct Use

The model can be directly used for sentiment analysis tasks, particularly on textual data related to primates.

### Downstream Use

The model can be fine-tuned for specific downstream tasks or integrated into larger applications requiring sentiment analysis functionality.

## Bias, Risks, and Limitations

### Bias

The model's predictions may reflect biases present in the training data, including any biases related to primates or sentiment labeling.

### Risks

- Misclassification: The model may misclassify sentiment due to ambiguity or complexity in the text.
- Generalization: The model's performance may vary across different domains or datasets.

### Limitations

- Limited Domain: The model's effectiveness may be limited to text related to primates.
- Cultural Bias: The model's performance may be influenced by cultural nuances present in the training data.

## Recommendations

Users should be cautious when interpreting the model's predictions, considering potential biases and limitations. Fine-tuning on domain-specific data or applying post-processing techniques may help mitigate biases and improve performance.

## How to Get Started with the Model

```python
# Example code for using the sentiment analysis model
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# 1. Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("sbcBI/sentiment_analysis_model")
model = AutoModelForSequenceClassification.from_pretrained("sbcBI/sentiment_analysis_model")
model.eval()  # make sure dropout is disabled for inference

# 2. Tokenize input text (truncate inputs longer than the model's maximum length)
text = "Sample text for sentiment analysis"
encoded_input = tokenizer(text, return_tensors="pt", truncation=True)

# 3. Perform inference without tracking gradients
with torch.no_grad():
    output = model(**encoded_input)
predicted_label = output.logits.argmax(dim=-1).item()

# 4. Interpret prediction (index order: 0 = Negative, 1 = Neutral, 2 = Positive)
sentiment_labels = ["Negative", "Neutral", "Positive"]
print("Predicted Sentiment:", sentiment_labels[predicted_label])
```
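
The example above reports only the top label. When a confidence score is also useful, the logits can be passed through a softmax. The following is a minimal sketch using only the standard library; the helper names `softmax` and `interpret` and the sample logits are hypothetical, and the label order is assumed to match the list in the example above.

```python
import math

# Label order assumed to match the list used in the getting-started example.
SENTIMENT_LABELS = ["Negative", "Neutral", "Positive"]


def softmax(logits):
    """Convert raw logits to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpret(logits, labels=SENTIMENT_LABELS):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical logits for one input, e.g. output.logits[0].tolist()
label, confidence = interpret([-1.2, 0.3, 2.1])
print(f"{label} ({confidence:.2f})")
```

A low top probability can be treated as a signal to review the prediction manually rather than trust the label.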

## Training Details

### Training Data

The training data consists of posts related to primates, annotated with sentiment labels.

### Training Procedure

#### Preprocessing

Text data underwent preprocessing steps including lowercase conversion, punctuation removal, tokenization, stopword removal, and stemming.
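
The steps listed above can be sketched as a small pipeline. This is an illustration only, not the author's code: it uses the standard library, a tiny hand-written stopword list, and a crude suffix stripper as stand-ins for NLTK's English stopword list and a real stemmer such as `PorterStemmer`. The names `STOPWORDS`, `naive_stem`, and `preprocess` are hypothetical.

```python
import string

# Tiny illustrative stopword list (assumption); a real pipeline would
# likely use a full list such as NLTK's English stopwords.
STOPWORDS = {"a", "an", "the", "is", "are", "was", "were", "and", "or", "of", "in", "to"}

# Crude suffix stripper standing in for a real stemmer (assumption).
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, remove punctuation, tokenize, drop stopwords, then stem."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()  # whitespace tokenization
    tokens = [t for t in tokens if t not in STOPWORDS]
    return [naive_stem(t) for t in tokens]


print(preprocess("The monkeys were climbing the branches!"))
# ['monkey', 'climb', 'branch']
```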

#### Training Hyperparameters

- **Training regime:** Fine-tuning of transformer-based pre-trained model
- **Optimizer:** Adam
- **Learning rate:** 5e-5
- **Batch size:** 8
- **Epochs:** 10
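
To show how these values interact, the sketch below collects the learning rate, batch size, and epoch count in a small config object and derives the number of optimizer steps in a full run. The `TrainingConfig` class and the dataset size of 1,000 examples are hypothetical; the card does not state the training set size, and the calculation assumes one optimizer step per batch.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters as listed in the model card."""
    learning_rate: float = 5e-5
    batch_size: int = 8
    epochs: int = 10

    def total_steps(self, num_examples):
        """Optimizer steps for a full run, assuming one step per batch."""
        return math.ceil(num_examples / self.batch_size) * self.epochs


config = TrainingConfig()

# 1,000 examples is a hypothetical dataset size, used only for illustration.
print(config.total_steps(1000))  # 125 batches per epoch * 10 epochs = 1250
```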

### Evaluation

#### Testing Data, Factors & Metrics

- **Testing Data:** Holdout test set
- **Metrics:** Accuracy, Precision, Recall, F1-score

#### Results

- **Accuracy:** 0.79
- **Precision:** 0.74
- **Recall:** 0.77
- **F1-score:** 0.75
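
The card does not say how precision, recall, and F1 were averaged across the sentiment classes. The sketch below assumes macro averaging (an unweighted mean of per-class scores) and shows how the four metrics could be computed from true and predicted labels. The function name and the toy labels are hypothetical and are not the model's actual test data.

```python
def classification_metrics(y_true, y_pred, labels):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels (0 = Negative, 1 = Neutral, 2 = Positive), for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
metrics = classification_metrics(y_true, y_pred, labels=[0, 1, 2])
print({name: round(value, 3) for name, value in metrics.items()})
```

As a quick sanity check on the reported numbers, the harmonic mean of precision 0.74 and recall 0.77 is about 0.755, which is consistent with the reported F1-score of 0.75.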

## Environmental Impact

Carbon emissions were not directly measured for model training. However, users should consider the environmental impact of training and deploying machine learning models, especially on large-scale infrastructure.

## Technical Specifications

### Model Architecture and Objective

The model is a transformer fine-tuned for sequence classification, with sentiment analysis as the training objective.

### Compute Infrastructure

#### Software

- **Framework:** PyTorch
- **Dependencies:** Transformers, NLTK