nlp-testing-setfit / README.md

loganh274



metadata

language:
  - en
license: apache-2.0
library_name: setfit
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - sentiment-analysis
  - few-shot-learning
pipeline_tag: text-classification
metrics:
  - accuracy
  - f1
  - precision
  - recall
model-index:
  - name: SetFit Sentiment Analysis
    results:
      - task:
          type: text-classification
          name: Sentiment Analysis
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.9
          - name: F1 (Weighted)
            type: f1
            value: 0.8984430773904458
          - name: Precision (Weighted)
            type: precision
            value: 0.9060606060606061
          - name: Recall (Weighted)
            type: recall
            value: 0.9

SetFit Sentiment Analysis Model

This is a SetFit model based on BAAI/bge-base-en-v1.5, fine-tuned to classify customer feedback into five sentiment classes, from Negative (0) to Positive (4).

Model Description

Property	Value
Base Model	BAAI/bge-base-en-v1.5
Total Parameters	109,482,240
Trainable Parameters	109,482,240
Body Parameters	109,482,240
Head Parameters	0
Model Size	417.64 MiB
Labels	[0, 1, 2, 3, 4]
Number of Classes	5
Serialization	safetensors
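As a quick consistency check, the model size in the table above matches what one would expect if every parameter is stored as a 4-byte float32 and the total is expressed in binary megabytes (2^20 bytes). The float32 storage format is an assumption inferred from the numbers; the card does not state it.

```python
# Parameter count taken from the Model Description table.
TOTAL_PARAMETERS = 109_482_240

# Assumption: weights are stored as float32 (4 bytes per parameter).
BYTES_PER_PARAMETER = 4

size_bytes = TOTAL_PARAMETERS * BYTES_PER_PARAMETER
size_mib = size_bytes / 2**20   # binary megabytes (MiB)
size_mb = size_bytes / 10**6    # decimal megabytes (MB)

print(f"{size_mib:.2f} MiB, {size_mb:.2f} MB")
```

Under that assumption the binary figure agrees with the reported 417.64, while the decimal figure comes out somewhat larger.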

Training Configuration

Parameter	Value
Batch Size	4
Epochs (embedding phase, classifier phase)	[1, 16]
Training Samples	540
Test Samples	100
Loss Function	CosineSimilarityLoss
Metric for Best Model	embedding_loss
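The configuration above could be expressed with SetFit's `TrainingArguments` roughly as follows. This is a sketch under assumptions, not the training script used for this model: the `TRAINING_CONFIG` dict and the `build_training_args` helper are hypothetical names, and any setting not listed in the table is left at the library default. The imports sit inside the function so the snippet can be read and loaded without SetFit installed.

```python
# Values copied from the Training Configuration table.
# Assumption: the epoch pair is (embedding phase, classifier phase).
TRAINING_CONFIG = {
    "batch_size": 4,
    "num_epochs": (1, 16),
    "metric_for_best_model": "embedding_loss",
}


def build_training_args(config=TRAINING_CONFIG):
    """Hypothetical helper: build SetFit TrainingArguments from the table values."""
    # Imported lazily so that defining this function does not require SetFit.
    from sentence_transformers.losses import CosineSimilarityLoss
    from setfit import TrainingArguments

    return TrainingArguments(loss=CosineSimilarityLoss, **config)
```

Calling `build_training_args()` requires `setfit` and `sentence-transformers` to be installed; the resulting object would then be passed to a SetFit `Trainer` together with the model and datasets.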

Training Progress

Initial Loss: 0.1474
Final Loss: 0.0648
Eval Loss: 0.0918
Training Runtime: 2943.9747 seconds (about 49 minutes)
Samples/Second: 3.6690
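The runtime and throughput figures can be combined to estimate how much data the training loop processed. The sketch below assumes the samples-per-second value is an average over the whole runtime and that the batch size of 4 applied throughout; in SetFit's contrastive phase a "sample" may be a sentence pair and not a single training text, so the result is only a rough estimate.

```python
# Figures copied from the Training Progress and Training Configuration sections.
TRAINING_RUNTIME_SECONDS = 2943.9747
SAMPLES_PER_SECOND = 3.6690
BATCH_SIZE = 4

# Average rate multiplied by elapsed time gives the total samples processed.
samples_processed = TRAINING_RUNTIME_SECONDS * SAMPLES_PER_SECOND

# Assumption: every optimizer step consumed one full batch.
approx_steps = samples_processed / BATCH_SIZE

print(f"~{samples_processed:,.0f} samples, ~{approx_steps:,.0f} steps")
```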

Evaluation Results

Metric	Score
Accuracy	0.9000
F1 (Weighted)	0.8984
F1 (Macro)	0.8984
Precision (Weighted)	0.9061
Precision (Macro)	0.9061
Recall (Weighted)	0.9000
Recall (Macro)	0.9000

Per-Class Performance

              precision    recall  f1-score   support

           0       0.86      0.95      0.90        20
           1       0.83      0.75      0.79        20
           2       0.83      1.00      0.91        20
           3       1.00      0.80      0.89        20
           4       1.00      1.00      1.00        20

    accuracy                           0.90       100
   macro avg       0.91      0.90      0.90       100
weighted avg       0.91      0.90      0.90       100
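The report above also explains why the macro and weighted scores in the Evaluation Results table are identical: every class has the same support (20), so a support-weighted mean reduces to a plain mean. The sketch below recomputes the averages from the rounded per-class values, so results agree with the reported ones only to about two decimal places. The helper names are illustrative.

```python
# Per-class rows copied from the report: (precision, recall, f1, support).
report = {
    0: (0.86, 0.95, 0.90, 20),
    1: (0.83, 0.75, 0.79, 20),
    2: (0.83, 1.00, 0.91, 20),
    3: (1.00, 0.80, 0.89, 20),
    4: (1.00, 1.00, 1.00, 20),
}


def macro_average(values):
    """Unweighted mean over classes."""
    return sum(values) / len(values)


def weighted_average(values, weights):
    """Mean over classes, weighted by each class's support."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


supports = [row[3] for row in report.values()]
recalls = [row[1] for row in report.values()]
f1_scores = [row[2] for row in report.values()]

macro_recall = macro_average(recalls)
weighted_recall = weighted_average(recalls, supports)
macro_f1 = macro_average(f1_scores)
weighted_f1 = weighted_average(f1_scores, supports)

# Recall times support is the number of correctly classified samples per class,
# so summing over classes and dividing by the total recovers accuracy.
correct = sum(round(r * s) for r, s in zip(recalls, supports))
accuracy = correct / sum(supports)

print(f"macro recall={macro_recall:.4f}, weighted recall={weighted_recall:.4f}")
print(f"macro F1={macro_f1:.4f}, weighted F1={weighted_f1:.4f}")
print(f"accuracy={accuracy:.2f} ({correct}/{sum(supports)} correct)")
```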

Visualizations

[Figure: Evaluation Metrics Overview]

[Figure: Confusion Matrix]

[Figure: Training Loss Curve]

[Figure: Learning Rate Schedule]

Usage

from setfit import SetFitModel

# Load the model
model = SetFitModel.from_pretrained("loganh274/nlp-testing-setfit")

# Single prediction: predict() takes a list of texts and returns one label ID (0-4) per text
text = "This product exceeded my expectations!"
prediction = model.predict([text])
print(f"Sentiment: {int(prediction[0])}")

# Batch prediction
texts = [
    "Amazing quality, highly recommend!",
    "It's okay, nothing special.",
    "Terrible experience, very disappointed.",
]
predictions = model.predict(texts)
# predict_proba() returns one row of class probabilities per text
probabilities = model.predict_proba(texts)

for text, pred, prob in zip(texts, predictions, probabilities):
    print(f"Text: {text}")
    # The largest class probability serves as the confidence score
    print(f"  Prediction: {int(pred)}, Confidence: {float(max(prob)):.2%}")
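When the confidence score matters downstream, one option is to treat low-confidence predictions as undecided. The helper below is a hypothetical sketch, not part of SetFit: `top_prediction` and its `min_confidence` default of 0.5 are illustrative choices, the probability rows are made-up values, and the code assumes position `i` in a row corresponds to label `i`.

```python
from typing import Optional, Sequence, Tuple


def top_prediction(
    probabilities: Sequence[float], min_confidence: float = 0.5
) -> Tuple[Optional[int], float]:
    """Return (label, confidence), with label None when confidence is too low."""
    confidence = max(probabilities)
    label = list(probabilities).index(confidence)
    if confidence < min_confidence:
        return None, confidence
    return label, confidence


# Illustrative probability rows (not real model output).
print(top_prediction([0.02, 0.03, 0.05, 0.10, 0.80]))
print(top_prediction([0.25, 0.20, 0.20, 0.20, 0.15]))
```

With real model output, each row from `predict_proba` would first be converted to a plain list, for example with `row.tolist()`.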

Label Mapping

Label	Sentiment
0	Negative
1	Somewhat Negative
2	Neutral
3	Somewhat Positive
4	Positive
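Since the model returns integer label IDs, a small lookup can turn them into the sentiment names from the table above. The names `LABEL_TO_SENTIMENT` and `to_sentiment` are illustrative and do not come from the model repository.

```python
# Mapping copied from the Label Mapping table.
LABEL_TO_SENTIMENT = {
    0: "Negative",
    1: "Somewhat Negative",
    2: "Neutral",
    3: "Somewhat Positive",
    4: "Positive",
}


def to_sentiment(label_id) -> str:
    """Convert a label ID (int, NumPy scalar, or 0-dim tensor) to its name."""
    return LABEL_TO_SENTIMENT[int(label_id)]


print([to_sentiment(label) for label in [4, 2, 0]])
```

Applied to model output, this would look like `[to_sentiment(p) for p in model.predict(texts)]`.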

Environment

Package	Version
Python	3.11.14
SetFit	1.1.3
PyTorch	2.9.1
scikit-learn	1.8.0
Transformers	N/A

Citation

If you use this model, please cite the SetFit paper:

@article{tunstall2022efficient,
  title={Efficient Few-Shot Learning Without Prompts},
  author={Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
  journal={arXiv preprint arXiv:2209.11055},
  year={2022}
}

License

Apache 2.0