---
license: mit
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
tags:
- code
---

# Model Card for Sentiment Analysis on Primate Dataset

This model card describes a sentiment analysis model trained on a dataset of posts related to primates. The model is a transformer-based sequence classifier that predicts a sentiment label for input text.

## Model Details

### Model Description

The model classifies text into one of three sentiment categories: negative, neutral, or positive. It is fine-tuned from a pre-trained transformer for sequence classification.

- **Developed by:** Jaskaran Singh
- **Model type:** Transformer-based sentiment analysis model
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** Transformer-based pre-trained model

### Model Sources

- **Repository:** https://github.com/JaskaranSingh-01/Sentiment_Analyzer
- **Demo:** https://sentimentanalyzer-f76oxwautwypxpea4lj3wg.streamlit.app/

## Uses

### Direct Use

The model can be directly used for sentiment analysis tasks, particularly on textual data related to primates.

### Downstream Use

The model can be fine-tuned for specific downstream tasks or integrated into larger applications requiring sentiment analysis functionality.

## Bias, Risks, and Limitations

### Bias

The model's predictions may reflect biases present in the training data, including any biases related to primates or sentiment labeling.

### Risks

- Misclassification: The model may misclassify sentiment due to ambiguity or complexity in the text.
- Generalization: The model's performance may vary across different domains or datasets.

### Limitations

- Limited Domain: The model's effectiveness may be limited to text related to primates.
- Cultural Bias: The model's performance may be influenced by cultural nuances present in the training data.

## Recommendations

Users should be cautious when interpreting the model's predictions, considering potential biases and limitations. Fine-tuning on domain-specific data or applying post-processing techniques may help mitigate biases and improve performance.

## How to Get Started with the Model

```python
# Example code for using the sentiment analysis model
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# 1. Load the model and tokenizer
model_name = "sbcBI/sentiment_analysis_model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# 2. Tokenize input text (truncate inputs longer than the model's maximum length)
text = "Sample text for sentiment analysis"
encoded_input = tokenizer(text, return_tensors="pt", truncation=True)

# 3. Perform inference without tracking gradients
with torch.no_grad():
    output = model(**encoded_input)
# Take the highest-scoring class along the label dimension
predicted_label = output.logits.argmax(dim=-1).item()

# 4. Interpret prediction (index order: 0 = Negative, 1 = Neutral, 2 = Positive)
sentiment_labels = ['Negative', 'Neutral', 'Positive']
print("Predicted Sentiment:", sentiment_labels[predicted_label])
```
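
The `argmax` in step 3 returns only the top class. When a confidence score is also useful, the logits can be converted to probabilities with a softmax. The following is a minimal sketch in plain Python; the logit values are made up for illustration and are not real model output, and the label order is the one used in the example above.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

sentiment_labels = ["Negative", "Neutral", "Positive"]
example_logits = [-1.2, 0.3, 2.1]  # hypothetical values, not real model output

probs = softmax(example_logits)
best = max(range(len(probs)), key=probs.__getitem__)  # index of the largest probability
print(sentiment_labels[best], round(probs[best], 3))
```

With PyTorch tensors, `torch.softmax(output.logits, dim=-1)` performs the same conversion.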

## Training Details

### Training Data

The training data consists of posts related to primates, annotated with sentiment labels.

### Training Procedure

#### Preprocessing

Text data underwent preprocessing steps including lowercase conversion, punctuation removal, tokenization, stopword removal, and stemming.
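
The steps listed above can be sketched as a small pipeline. This is an illustration only, not the code used for training: the stopword list is a tiny hypothetical sample, whitespace splitting stands in for the tokenizer, and `naive_stem` is a crude stand-in for a real stemmer. This card lists NLTK as a dependency but does not say which stemmer or stopword list was used.

```python
import string

# Tiny illustrative stopword list; NOT the list used in training.
STOPWORDS = {"a", "an", "the", "is", "are", "was", "were", "and", "or", "of", "to", "in", "on"}

def naive_stem(token):
    """Crude suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text, stem=naive_stem, stopwords=STOPWORDS):
    """Lowercase, strip punctuation, tokenize, drop stopwords, then stem."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()  # whitespace tokenization
    tokens = [t for t in tokens if t not in stopwords]
    return [stem(t) for t in tokens]

print(preprocess("The monkeys were climbing the branches!"))
```

A real stemmer can be swapped in through the `stem` argument, for example `nltk.stem.PorterStemmer().stem`.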

#### Training Hyperparameters

- **Training regime:** Fine-tuning of transformer-based pre-trained model
- **Optimizer:** Adam
- **Learning rate:** 5e-5
- **Batch size:** 8
- **Epochs:** 10
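
The values above can be collected in a small configuration object, which also makes it easy to estimate the length of a training run. In this sketch the `TrainingConfig` class and the dataset size of 800 examples are hypothetical; only the optimizer, learning rate, batch size, and epoch count come from this card. It assumes one optimizer step per batch (no gradient accumulation).

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters as listed in this card (class name is hypothetical)."""
    optimizer: str = "adam"
    learning_rate: float = 5e-5
    batch_size: int = 8
    epochs: int = 10

    def total_steps(self, num_examples: int) -> int:
        """Optimizer steps over the full run, counting a final partial batch."""
        return math.ceil(num_examples / self.batch_size) * self.epochs

config = TrainingConfig()
print(config.total_steps(num_examples=800))  # 800 is a made-up dataset size
```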

### Evaluation

#### Testing Data, Factors & Metrics

- **Testing Data:** Holdout test set
- **Metrics:** Accuracy, Precision, Recall, F1-score

#### Results

- **Accuracy:** 0.79
- **Precision:** 0.74
- **Recall:** 0.77
- **F1-score:** 0.75
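
This card does not state how precision, recall, and F1 were averaged across the three classes. As one possible reading, the sketch below computes accuracy together with macro-averaged precision, recall, and F1 from lists of true and predicted labels. The function name and the toy labels are illustrative and are not the actual test set.

```python
def classification_metrics(y_true, y_pred, labels):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }

# Toy labels (0 = Negative, 1 = Neutral, 2 = Positive); not the real test data.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
metrics = classification_metrics(y_true, y_pred, labels=[0, 1, 2])
print(metrics)
```

Other averaging schemes (micro or weighted) would give different numbers on imbalanced data, so the averaging method should be reported alongside the scores.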

## Environmental Impact

Carbon emissions were not directly measured for model training. However, users should consider the environmental impact of training and deploying machine learning models, especially on large-scale infrastructure.

## Technical Specifications

### Model Architecture and Objective

The model is a transformer with a sequence classification head, fine-tuned to predict a sentiment label for each input text.

### Compute Infrastructure

#### Software

- **Framework:** PyTorch
- **Dependencies:** Transformers, NLTK