---
language:
- en
license: apache-2.0
library_name: setfit
tags:
- setfit
- sentence-transformers
- text-classification
- sentiment-analysis
- few-shot-learning
pipeline_tag: text-classification
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: SetFit Sentiment Analysis
  results:
  - task:
      type: text-classification
      name: Sentiment Analysis
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9
    - name: F1 (Weighted)
      type: f1
      value: 0.8984430773904458
    - name: Precision (Weighted)
      type: precision
      value: 0.9060606060606061
    - name: Recall (Weighted)
      type: recall
      value: 0.9
---
SetFit Sentiment Analysis Model
This is a SetFit model fine-tuned for five-class sentiment classification of customer feedback, built on the BAAI/bge-base-en-v1.5 embedding model.
Model Description
| Property | Value |
|---|---|
| Base Model | BAAI/bge-base-en-v1.5 |
| Total Parameters | 109,482,240 |
| Trainable Parameters | 109,482,240 |
| Body Parameters | 109,482,240 |
| Head Parameters | 0 |
| Model Size | 417.64 MB |
| Labels | [0, 1, 2, 3, 4] |
| Number of Classes | 5 |
| Serialization | safetensors |
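The reported model size is consistent with the parameter count if the weights are stored as 32-bit floats and the size is measured in binary megabytes (MiB). The sketch below checks that arithmetic; the 4-bytes-per-parameter figure is an assumption, not something the table states.

```python
# Parameter count copied from the Model Description table.
TOTAL_PARAMETERS = 109_482_240

# Assumption: weights are stored as float32 (4 bytes each).
BYTES_PER_PARAMETER = 4

size_bytes = TOTAL_PARAMETERS * BYTES_PER_PARAMETER
size_mib = size_bytes / 2**20   # binary megabytes (MiB)
size_mb = size_bytes / 10**6    # decimal megabytes (MB)

print(f"{size_bytes:,} bytes = {size_mib:.2f} MiB = {size_mb:.2f} MB")
```

Under that assumption the binary figure matches the table, which suggests the "MB" in the table is 1024-based.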
Training Configuration
| Parameter | Value |
|---|---|
| Batch Size | 4 |
| Epochs | [1, 16] (embedding fine-tuning, classifier training) |
| Training Samples | 540 |
| Test Samples | 100 |
| Loss Function | CosineSimilarityLoss |
| Metric for Best Model | embedding_loss |
Training Progress
- Initial Loss: 0.1474
- Final Loss: 0.0648
- Eval Loss: 0.0918
- Training Runtime: 2943.9747 seconds
- Samples/Second: 3.6690
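A few derived figures make these numbers easier to read. The sketch below computes the relative drop in training loss, the gap between eval and final training loss, and the approximate number of training examples processed. It assumes the samples-per-second value is an average over the whole run; what counts as a "sample" here (most likely a sentence pair for the contrastive loss) is not stated in this card.

```python
# Values copied from the Training Progress list.
initial_loss = 0.1474
final_loss = 0.0648
eval_loss = 0.0918
runtime_seconds = 2943.9747
samples_per_second = 3.6690

# Fraction of the initial training loss removed by the end of training.
relative_reduction = (initial_loss - final_loss) / initial_loss

# Positive gap means the eval loss is higher than the final training loss.
eval_gap = eval_loss - final_loss

# Assumption: throughput is averaged over the full runtime.
examples_processed = runtime_seconds * samples_per_second

print(f"Loss reduction: {relative_reduction:.1%}")
print(f"Eval minus train loss: {eval_gap:.4f}")
print(f"Runtime: {runtime_seconds / 60:.1f} min, about {examples_processed:,.0f} examples")
```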
Evaluation Results
| Metric | Score |
|---|---|
| Accuracy | 0.9000 |
| F1 (Weighted) | 0.8984 |
| F1 (Macro) | 0.8984 |
| Precision (Weighted) | 0.9061 |
| Precision (Macro) | 0.9061 |
| Recall (Weighted) | 0.9000 |
| Recall (Macro) | 0.9000 |
Per-Class Performance
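| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| 0 | 0.86 | 0.95 | 0.90 | 20 |
| 1 | 0.83 | 0.75 | 0.79 | 20 |
| 2 | 0.83 | 1.00 | 0.91 | 20 |
| 3 | 1.00 | 0.80 | 0.89 | 20 |
| 4 | 1.00 | 1.00 | 1.00 | 20 |
| Accuracy | | | 0.90 | 100 |
| Macro avg | 0.91 | 0.90 | 0.90 | 100 |
| Weighted avg | 0.91 | 0.90 | 0.90 | 100 |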
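The macro and weighted averages coincide because every class has the same support (20 test samples), so weighting by support changes nothing. The sketch below recomputes both averages from the per-class figures in the report. Since those figures are rounded to two decimals, the recomputed values can differ from the reported ones in the last digit.

```python
# Per-class figures copied from the classification report.
report = {
    0: {"precision": 0.86, "recall": 0.95, "f1": 0.90, "support": 20},
    1: {"precision": 0.83, "recall": 0.75, "f1": 0.79, "support": 20},
    2: {"precision": 0.83, "recall": 1.00, "f1": 0.91, "support": 20},
    3: {"precision": 1.00, "recall": 0.80, "f1": 0.89, "support": 20},
    4: {"precision": 1.00, "recall": 1.00, "f1": 1.00, "support": 20},
}


def macro_avg(metric):
    """Unweighted mean of a metric over all classes."""
    return sum(row[metric] for row in report.values()) / len(report)


def weighted_avg(metric):
    """Mean of a metric over all classes, weighted by class support."""
    total_support = sum(row["support"] for row in report.values())
    return sum(row[metric] * row["support"] for row in report.values()) / total_support


for metric in ("precision", "recall", "f1"):
    print(f"{metric}: macro={macro_avg(metric):.3f}, weighted={weighted_avg(metric):.3f}")
```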
Visualizations
- Figure: Evaluation Metrics Overview
- Figure: Confusion Matrix
- Figure: Training Loss Curve
- Figure: Learning Rate Schedule
Usage
from setfit import SetFitModel

# Load the model
model = SetFitModel.from_pretrained("loganh274/nlp-testing-setfit")

# Single prediction
text = "This product exceeded my expectations!"
prediction = model.predict([text])
print(f"Sentiment: {prediction[0]}")

# Batch prediction
texts = [
    "Amazing quality, highly recommend!",
    "It's okay, nothing special.",
    "Terrible experience, very disappointed.",
]
predictions = model.predict(texts)
probabilities = model.predict_proba(texts)
for text, pred, prob in zip(texts, predictions, probabilities):
    print(f"Text: {text}")
    print(f"  Prediction: {pred}, Confidence: {max(prob):.2%}")
Label Mapping
| Label | Sentiment |
|---|---|
| 0 | Negative |
| 1 | Somewhat Negative |
| 2 | Neutral |
| 3 | Somewhat Positive |
| 4 | Positive |
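To turn a numeric prediction into a readable sentiment, the mapping above can be kept as a small dictionary. The sketch below is a minimal helper, assuming that position `i` in a probability row corresponds to label `i` (the labels are listed as `[0, 1, 2, 3, 4]`). The names `LABEL_NAMES` and `describe` are illustrative and not part of SetFit, and the example probabilities are made up.

```python
# Label ids and names copied from the Label Mapping table.
LABEL_NAMES = {
    0: "Negative",
    1: "Somewhat Negative",
    2: "Neutral",
    3: "Somewhat Positive",
    4: "Positive",
}


def describe(probabilities):
    """Return (sentiment name, confidence) for one row of class probabilities.

    Assumes index i of the row holds the probability of label i.
    """
    label = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return LABEL_NAMES[label], float(probabilities[label])


# Made-up probability row for illustration only.
example_row = [0.02, 0.03, 0.10, 0.25, 0.60]
name, confidence = describe(example_row)
print(f"{name} ({confidence:.0%})")
```

A row returned by `model.predict_proba` could be passed to `describe` in the same way, provided the class ordering assumption holds for the loaded model.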
Environment
| Package | Version |
|---|---|
| Python | 3.11.14 |
| SetFit | 1.1.3 |
| PyTorch | 2.9.1 |
| scikit-learn | 1.8.0 |
| Transformers | N/A |
Citation
If you use this model, please cite the SetFit paper:
@article{tunstall2022efficient,
  title={Efficient Few-Shot Learning Without Prompts},
  author={Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
  journal={arXiv preprint arXiv:2209.11055},
  year={2022}
}
License
Apache 2.0