---
license: mit
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
tags:
- code
---
# Model Card for Sentiment Analysis on Primate Dataset
This model card provides details about a sentiment analysis model trained on a dataset containing posts related to primates. The model predicts sentiment labels for textual data using transformer-based architectures.
## Model Details
### Model Description
The sentiment analysis model classifies text into sentiment categories such as positive, negative, or neutral, using a transformer-based architecture for sequence classification.
- **Developed by:** Jaskaran Singh
- **Model type:** Transformer-based sentiment analysis model
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** Transformer-based pre-trained model
### Model Sources
- **Repository:** https://github.com/JaskaranSingh-01/Sentiment_Analyzer
- **Demo:** https://sentimentanalyzer-f76oxwautwypxpea4lj3wg.streamlit.app/
## Uses
### Direct Use
The model can be directly used for sentiment analysis tasks, particularly on textual data related to primates.
### Downstream Use
The model can be fine-tuned for specific downstream tasks or integrated into larger applications requiring sentiment analysis functionality.
## Bias, Risks, and Limitations
### Bias
The model's predictions may reflect biases present in the training data, including any biases related to primates or sentiment labeling.
### Risks
- Misclassification: The model may misclassify sentiment due to ambiguity or complexity in the text.
- Generalization: The model's performance may vary across different domains or datasets.
### Limitations
- Limited Domain: The model's effectiveness may be limited to text related to primates.
- Cultural Bias: The model's performance may be influenced by cultural nuances present in the training data.
## Recommendations
Users should be cautious when interpreting the model's predictions, considering potential biases and limitations. Fine-tuning on domain-specific data or applying post-processing techniques may help mitigate biases and improve performance.
## How to Get Started with the Model
```python
# Example code for using the sentiment analysis model
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# 1. Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("sbcBI/sentiment_analysis_model")
model = AutoModelForSequenceClassification.from_pretrained("sbcBI/sentiment_analysis_model")

# 2. Tokenize input text
text = "Sample text for sentiment analysis"
encoded_input = tokenizer(text, return_tensors='pt')

# 3. Perform inference (no gradients needed at inference time)
with torch.no_grad():
    output = model(**encoded_input)
predicted_label = output.logits.argmax(dim=-1).item()

# 4. Interpret prediction (verify this ordering against model.config.id2label)
sentiment_labels = ['Negative', 'Neutral', 'Positive']
print("Predicted Sentiment:", sentiment_labels[predicted_label])
```
## Training Details
### Training Data
The training data consists of posts related to primates, annotated with sentiment labels.
### Training Procedure
#### Preprocessing
Text data underwent preprocessing steps including lowercase conversion, punctuation removal, tokenization, stopword removal, and stemming.
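The steps above can be sketched in plain Python. The stopword list and suffix-stripping stemmer below are simplified stand-ins (the original pipeline likely used NLTK's English stopword list and `PorterStemmer`, which the card lists as a dependency):

```python
import string

# Minimal stand-in stopword list; a real pipeline would use NLTK's English list.
STOPWORDS = {"a", "an", "the", "is", "are", "was", "were", "in", "on", "of", "and", "to"}

def crude_stem(token: str) -> str:
    """Very rough suffix stripping; NLTK's PorterStemmer is the usual choice."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text: str) -> list:
    # 1. Lowercase, 2. strip punctuation, 3. whitespace-tokenize,
    # 4. drop stopwords, 5. stem.
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()
    tokens = [t for t in tokens if t not in STOPWORDS]
    return [crude_stem(t) for t in tokens]

print(preprocess("The monkeys were swinging happily in the trees!"))
```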
#### Training Hyperparameters
- **Training regime:** Fine-tuning of transformer-based pre-trained model
- **Optimizer:** Adam optimizer
- **Learning rate:** 5e-5
- **Batch size:** 8
- **Epochs:** 10
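Assuming the Hugging Face `Trainer` API was used (the card does not state the exact setup), these hyperparameters translate roughly to the following configuration fragment; the output directory is a placeholder:

```python
from transformers import TrainingArguments

# Hypothetical configuration mirroring the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="./sentiment_model",   # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    num_train_epochs=10,
    # Trainer's default optimizer is AdamW, a close match to the Adam listed above.
)
```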
### Evaluation
#### Testing Data, Factors & Metrics
- **Testing Data:** Holdout test set
- **Metrics:** Accuracy, Precision, Recall, F1-score
#### Results
- **Accuracy:** 0.79
- **Precision:** 0.74
- **Recall:** 0.77
- **F1-score:** 0.75
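For a three-class problem, the reported precision, recall, and F1 are presumably averaged across classes. As an illustration of how such metrics are computed, here is a self-contained macro-averaging sketch over toy labels (not the actual evaluation data):

```python
def macro_metrics(y_true, y_pred, labels):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Toy example with three sentiment classes (0=Negative, 1=Neutral, 2=Positive):
y_true = [0, 1, 2, 2, 0, 1]
y_pred = [0, 1, 2, 0, 0, 2]
print(macro_metrics(y_true, y_pred, labels=[0, 1, 2]))
```

In practice, `sklearn.metrics.precision_recall_fscore_support` with `average="macro"` computes the same quantities.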
## Environmental Impact
Carbon emissions were not directly measured for model training. However, users should consider the environmental impact of training and deploying machine learning models, especially on large-scale infrastructure.
## Technical Specifications
### Model Architecture and Objective
The model is built on a transformer encoder fine-tuned for sequence classification tasks such as sentiment analysis.
### Compute Infrastructure
#### Software
- **Framework:** PyTorch
- **Dependencies:** Transformers, NLTK