---
license: mit
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
tags:
- code
---

# Model Card for Sentiment Analysis on Primate Dataset

This model card describes a sentiment analysis model trained on a dataset of posts related to primates. The model predicts sentiment labels for text using a transformer-based architecture.

## Model Details

### Model Description

The model classifies text into sentiment categories such as positive, negative, or neutral. It uses a transformer-based architecture for sequence classification.

- **Developed by:** Jaskaran Singh
- **Model type:** Transformer-based sentiment analysis model
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** Transformer-based pre-trained model

### Model Sources

- **Repository:** https://github.com/JaskaranSingh-01/Sentiment_Analyzer
- **Demo:** https://sentimentanalyzer-f76oxwautwypxpea4lj3wg.streamlit.app/

## Uses

### Direct Use

The model can be directly used for sentiment analysis tasks, particularly on textual data related to primates.

### Downstream Use

The model can be fine-tuned for specific downstream tasks or integrated into larger applications requiring sentiment analysis functionality.

## Bias, Risks, and Limitations

### Bias

The model's predictions may reflect biases present in the training data, including any biases related to primates or sentiment labeling.

### Risks

- Misclassification: The model may misclassify sentiment due to ambiguity or complexity in the text.
- Generalization: The model's performance may vary across different domains or datasets.

### Limitations

- Limited Domain: The model's effectiveness may be limited to text related to primates.
- Cultural Bias: The model's performance may be influenced by cultural nuances present in the training data.

## Recommendations

Users should be cautious when interpreting the model's predictions, considering potential biases and limitations. Fine-tuning on domain-specific data or applying post-processing techniques may help mitigate biases and improve performance.

## How to Get Started with the Model

```python
# Example code for using the sentiment analysis model

# 1. Load the model and tokenizer
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "sbcBI/sentiment_analysis_model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# 2. Tokenize input text (truncate inputs longer than the model's limit)
text = "Sample text for sentiment analysis"
encoded_input = tokenizer(text, return_tensors="pt", truncation=True)

# 3. Perform inference (gradients are not needed for prediction)
with torch.no_grad():
    output = model(**encoded_input)
predicted_label = output.logits.argmax(dim=-1).item()

# 4. Interpret prediction (list order must match the model's label indices)
sentiment_labels = ["Negative", "Neutral", "Positive"]
print("Predicted Sentiment:", sentiment_labels[predicted_label])
```
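The `argmax` step in the example above discards how confident the model was. As a minimal sketch, the raw logits can be converted to probabilities with a softmax, and low-confidence predictions can be flagged instead of reported. The logits below are made-up values, and `softmax`, `interpret`, and the 0.6 threshold are hypothetical names and settings chosen for illustration; the label order is taken from the example above.

```python
import math

# Label order as used in the example above.
SENTIMENT_LABELS = ["Negative", "Neutral", "Positive"]


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpret(logits, threshold=0.6):
    """Return (label, probability); the label is "Uncertain" below the threshold.

    The threshold is a hypothetical setting, not a value from this model card.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    label = SENTIMENT_LABELS[best] if probs[best] >= threshold else "Uncertain"
    return label, probs[best]


# Made-up logits standing in for output.logits[0].tolist()
label, prob = interpret([-1.2, 0.3, 2.1])
print(f"{label} ({prob:.2f})")
```

A threshold of this kind is one simple form of the post-processing mentioned under Recommendations; a suitable value would need to be chosen on validation data.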

## Training Details

### Training Data

The training data consists of posts related to primates, annotated with sentiment labels.

### Training Procedure

#### Preprocessing

Text data underwent preprocessing steps including lowercase conversion, punctuation removal, tokenization, stopword removal, and stemming.
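The preprocessing steps listed above can be sketched as a small pipeline. This is an illustration only, not the author's code: the stopword set is a tiny hypothetical sample, and the suffix-stripping `naive_stem` is a crude stand-in for a real stemmer such as NLTK's `PorterStemmer`, used here so the example runs with the standard library alone.

```python
import string

# Tiny hypothetical stopword sample; a real pipeline would use a fuller list
# (for example NLTK's English stopwords).
STOPWORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def naive_stem(token):
    """Crude suffix stripper standing in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text, stem=naive_stem):
    """Lowercase, strip punctuation, tokenize, drop stopwords, then stem."""
    text = text.lower().translate(_PUNCT_TABLE)
    tokens = text.split()  # whitespace tokenization
    return [stem(tok) for tok in tokens if tok not in STOPWORDS]


print(preprocess("The gorillas are playing in the forest!"))
```

The `stem` parameter accepts any callable that maps a token to its stem, so a library stemmer can be swapped in without changing the rest of the function.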

#### Training Hyperparameters

- **Training regime:** Fine-tuning of a transformer-based pre-trained model
- **Optimizer:** Adam
- **Learning rate:** 5e-5
- **Batch size:** 8
- **Epochs:** 10
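The hyperparameters above can be collected in a small configuration object, which also makes the number of optimizer steps easy to estimate. This is a sketch under assumptions: `TrainingConfig` and its field names are hypothetical, and the 1,000-example training set is a placeholder, since the card does not state the dataset size.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters reported in this model card."""

    optimizer: str = "adam"
    learning_rate: float = 5e-5
    batch_size: int = 8
    epochs: int = 10

    def steps_per_epoch(self, num_examples):
        # The last, possibly smaller, batch still counts as one step.
        return math.ceil(num_examples / self.batch_size)

    def total_steps(self, num_examples):
        return self.steps_per_epoch(num_examples) * self.epochs


config = TrainingConfig()
# 1_000 is a placeholder dataset size, not a figure from the card.
print(config.total_steps(1_000))
```

In a PyTorch training loop, the learning rate would typically be passed on as `torch.optim.Adam(model.parameters(), lr=config.learning_rate)`.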

### Evaluation

#### Testing Data, Factors & Metrics

- **Testing Data:** Holdout test set
- **Metrics:** Accuracy, Precision, Recall, F1-score

#### Results

- **Accuracy:** 0.79
- **Precision:** 0.74
- **Recall:** 0.77
- **F1-score:** 0.75
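For reference, the four metrics can be computed from true and predicted labels as sketched below. The card does not say how precision, recall, and F1 were averaged across classes, so macro averaging is an assumption here, and the label lists are made-up toy data rather than the model's actual predictions.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,  # macro average (assumed)
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels: 0 = Negative, 1 = Neutral, 2 = Positive
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
metrics = classification_metrics(y_true, y_pred)
print({name: round(value, 2) for name, value in metrics.items()})
```

With imbalanced classes, macro and weighted averages can differ noticeably, so the averaging method matters when comparing against the reported figures.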

## Environmental Impact

Carbon emissions were not directly measured for model training. However, users should consider the environmental impact of training and deploying machine learning models, especially on large-scale infrastructure.

## Technical Specifications

### Model Architecture and Objective

The model uses a transformer-based architecture configured for sequence classification tasks such as sentiment analysis.

### Compute Infrastructure

#### Software

- **Framework:** PyTorch
- **Dependencies:** Transformers, NLTK