---
language:
- en
license: apache-2.0
library_name: setfit
tags:
- setfit
- sentence-transformers
- text-classification
- sentiment-analysis
- few-shot-learning
pipeline_tag: text-classification
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: SetFit Sentiment Analysis
  results:
  - task:
      type: text-classification
      name: Sentiment Analysis
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.88
    - name: F1 (Weighted)
      type: f1
      value: 0.8805050505050506
    - name: Precision (Weighted)
      type: precision
      value: 0.8883333333333333
    - name: Recall (Weighted)
      type: recall
      value: 0.88
---
# SetFit Sentiment Analysis Model
This is a [SetFit](https://github.com/huggingface/setfit) model, fine-tuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5), for five-class sentiment classification of customer feedback.
## Model Description
| Property | Value |
|----------|-------|
| **Base Model** | [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) |
| **Total Parameters** | 109,482,240 |
| **Trainable Parameters** | 109,482,240 |
| **Body Parameters** | 109,482,240 |
| **Head Parameters** | 0 |
| **Model Size** | 417.64 MB |
| **Labels** | [0, 1, 2, 3, 4] |
| **Number of Classes** | 5 |
| **Serialization** | safetensors |
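
The reported model size is consistent with storing every parameter as a 4-byte value. The sketch below checks that arithmetic; the float32 assumption is ours and is not stated in this card. Under that assumption the figure matches when read in mebibytes (MiB) rather than decimal megabytes.

```python
# Parameter count taken from the Model Description table.
total_params = 109_482_240

# Assumption (not stated in the card): weights are stored as float32, 4 bytes each.
bytes_per_param = 4

size_bytes = total_params * bytes_per_param
size_mib = size_bytes / 2**20   # mebibytes (1 MiB = 1,048,576 bytes)
size_mb = size_bytes / 10**6    # decimal megabytes

print(f"{size_bytes:,} bytes")
print(f"{size_mib:.2f} MiB")
print(f"{size_mb:.2f} MB (decimal)")
```
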
## Training Configuration
| Parameter | Value |
|-----------|-------|
| **Batch Size** | 16 |
| **Epochs** | 1 (embedding phase), 16 (classifier phase) |
| **Training Samples** | 540 |
| **Test Samples** | 100 |
| **Loss Function** | CosineSimilarityLoss |
| **Metric for Best Model** | embedding_loss |
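
The table above maps fairly directly onto SetFit's `TrainingArguments`. The following is a minimal sketch, assuming the SetFit 1.x API listed under Environment; the original training script is not part of this card, and `HYPERPARAMS` and `build_training_args` are hypothetical names introduced here for illustration.

```python
# Hyperparameters copied from the Training Configuration table.
HYPERPARAMS = {
    "batch_size": 16,
    # SetFit reads a pair as (embedding phase, classifier phase).
    "num_epochs": (1, 16),
    "metric_for_best_model": "embedding_loss",
}


def build_training_args():
    """Build SetFit TrainingArguments from the values above.

    Imports are done lazily so the module loads even when `setfit`
    and `sentence-transformers` are not installed.
    """
    from sentence_transformers.losses import CosineSimilarityLoss
    from setfit import TrainingArguments

    return TrainingArguments(loss=CosineSimilarityLoss, **HYPERPARAMS)
```

Calling `build_training_args()` returns an object that can be passed as `args` to a SetFit `Trainer` together with your own train and evaluation datasets.
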
### Training Progress
- **Initial Loss:** 0.2366
- **Final Loss:** 0.0893
- **Eval Loss:** 0.0984
- **Training Runtime:** 800.2981 seconds
- **Samples/Second:** 13.4950
## Evaluation Results
| Metric | Score |
|--------|-------|
| **Accuracy** | 0.8800 |
| **F1 (Weighted)** | 0.8805 |
| **F1 (Macro)** | 0.8805 |
| **Precision (Weighted)** | 0.8883 |
| **Precision (Macro)** | 0.8883 |
| **Recall (Weighted)** | 0.8800 |
| **Recall (Macro)** | 0.8800 |
### Per-Class Performance
```
              precision    recall  f1-score   support

           0       0.90      0.90      0.90        20
           1       0.75      0.75      0.75        20
           2       0.79      0.95      0.86        20
           3       1.00      0.80      0.89        20
           4       1.00      1.00      1.00        20

    accuracy                           0.88       100
   macro avg       0.89      0.88      0.88       100
weighted avg       0.89      0.88      0.88       100
```
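
The macro and weighted averages coincide here because every class has the same support (20 samples). The sketch below reproduces the headline metrics from per-class counts. Note that the true-positive, false-positive, and false-negative counts are inferred by us from the rounded report (for example, recall 0.95 on 20 samples implies 19 correct), not read from the actual confusion matrix, so treat them as an assumption.

```python
# (true positives, false positives, false negatives) per class,
# inferred from the rounded precision/recall values and support of 20.
counts = {
    0: (18, 2, 2),
    1: (15, 5, 5),
    2: (19, 5, 1),
    3: (16, 0, 4),
    4: (20, 0, 0),
}


def prf(tp, fp, fn):
    """Return (precision, recall, f1) for one class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return precision, recall, f1


per_class = {label: prf(*c) for label, c in counts.items()}
supports = {label: tp + fn for label, (tp, fp, fn) in counts.items()}
total = sum(supports.values())

accuracy = sum(tp for tp, _, _ in counts.values()) / total
macro_precision = sum(p for p, _, _ in per_class.values()) / len(per_class)
macro_f1 = sum(f for _, _, f in per_class.values()) / len(per_class)
weighted_f1 = sum(per_class[k][2] * supports[k] for k in per_class) / total

print(f"accuracy        = {accuracy:.4f}")
print(f"macro precision = {macro_precision:.4f}")
print(f"macro F1        = {macro_f1:.4f}")
print(f"weighted F1     = {weighted_f1:.4f}")
```
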
## Visualizations
### Evaluation Metrics Overview

<p align="center">
<img src="evaluation_metrics.png" alt="Evaluation Metrics" width="800"/>
</p>

### Confusion Matrix

<p align="center">
<img src="confusion_matrix.png" alt="Confusion Matrix" width="600"/>
</p>

### Training Loss Curve

<p align="center">
<img src="loss_curve.png" alt="Training Loss Curve" width="600"/>
</p>

### Learning Rate Schedule

<p align="center">
<img src="learning_rate.png" alt="Learning Rate Schedule" width="600"/>
</p>

## Usage
```python
from setfit import SetFitModel

# Load the model
model = SetFitModel.from_pretrained("loganh274/nlp-testing-setfit")

# Single prediction
text = "This product exceeded my expectations!"
prediction = model.predict([text])
print(f"Sentiment: {prediction[0]}")

# Batch prediction
texts = [
    "Amazing quality, highly recommend!",
    "It's okay, nothing special.",
    "Terrible experience, very disappointed.",
]
predictions = model.predict(texts)
probabilities = model.predict_proba(texts)

for text, pred, prob in zip(texts, predictions, probabilities):
    print(f"Text: {text}")
    print(f"  Prediction: {pred}, Confidence: {max(prob):.2%}")
```
## Label Mapping
| Label | Sentiment |
|-------|-----------|
| 0 | Negative |
| 1 | Somewhat Negative |
| 2 | Neutral |
| 3 | Somewhat Positive |
| 4 | Positive |
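
To turn the integer predictions into readable names, the table above can be expressed as a small lookup. This is an illustrative sketch; `ID2LABEL` and `to_sentiment` are hypothetical helper names, not part of the model or of SetFit.

```python
# Label mapping copied from the table above.
ID2LABEL = {
    0: "Negative",
    1: "Somewhat Negative",
    2: "Neutral",
    3: "Somewhat Positive",
    4: "Positive",
}


def to_sentiment(predictions):
    """Map an iterable of integer-like class ids to sentiment names."""
    # int() also accepts NumPy integers and zero-dimensional tensors.
    return [ID2LABEL[int(p)] for p in predictions]


print(to_sentiment([4, 2, 0]))
```

The same function can be applied to the output of `model.predict(texts)` from the Usage section.
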
## Environment
| Package | Version |
|---------|---------|
| Python | 3.11.14 |
| SetFit | 1.1.3 |
| PyTorch | 2.9.1 |
| scikit-learn | 1.8.0 |
| Transformers | N/A |
## Citation
If you use this model, please cite the SetFit paper:
```bibtex
@article{tunstall2022efficient,
title={Efficient Few-Shot Learning Without Prompts},
author={Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
journal={arXiv preprint arXiv:2209.11055},
year={2022}
}
```
## License
Apache 2.0