---
library_name: setfit
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
base_model: BAAI/bge-small-en-v1.5
metrics:
  - accuracy
widget:
  - text: approach affects entrepreneurship intention
  - text: innovation affects m & a success
  - text: total retail sales affects m & a success
  - text: >-
      stimulation of the sales staff in business organization affects
      entrepreneurship intention
  - text: country-level economy affects ceo pay
pipeline_tag: text-classification
inference: true
model-index:
  - name: SetFit with BAAI/bge-small-en-v1.5
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: Unknown
          type: unknown
          split: test
        metrics:
          - type: accuracy
            value: 0.8117647058823529
            name: Accuracy
---

# SetFit with BAAI/bge-small-en-v1.5

This is a SetFit model that can be used for text classification. It uses BAAI/bge-small-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
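
The second step can be sketched with scikit-learn. This is a minimal, illustrative sketch, not the SetFit internals: the random vectors below stand in for the sentence embeddings that the fine-tuned BAAI/bge-small-en-v1.5 body would produce (its output dimension is 384), and `LogisticRegression` mirrors the classification head this card describes:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-ins for embeddings from the fine-tuned Sentence Transformer body
# (BAAI/bge-small-en-v1.5 outputs 384-dimensional sentence vectors).
rng = np.random.default_rng(42)
X_train = rng.normal(size=(8, 384))
y_train = np.array([0, 1, 0, 1, 0, 1, 0, 1])

# Step 2: fit the classification head on the embedded training examples.
head = LogisticRegression()
head.fit(X_train, y_train)

# At inference time, new texts are embedded the same way and passed to the head.
preds = head.predict(X_train[:2])  # one predicted label per input embedding
```

Because only the lightweight head is trained in this step, it converges quickly even on the few hundred examples listed under Training Set Metrics below.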

## Model Details

### Model Description

### Model Sources

### Model Labels

| Label | Examples |
|:------|:---------|
| 0     | 'oecd affects ceo pay', 'marketing and sales affects entrepreneurship intention', 'australian research affects entrepreneurship intention' |
| 1     | 'academic performance affects entrepreneurship intention', 'collectivism affects entrepreneurship intention', 'responsibility affects ceo pay' |

## Evaluation

### Metrics

| Label | Accuracy |
|:------|:---------|
| all   | 0.8118   |

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference:

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("abehandlerorg/setfit")
# Run inference
preds = model("innovation affects m & a success")
```

## Training Details

### Training Set Metrics

| Training set | Min | Median | Max |
|:-------------|:----|:-------|:----|
| Word count   | 4   | 5.4661 | 13  |

| Label | Training Sample Count |
|:------|:----------------------|
| 0     | 164                   |
| 1     | 175                   |

### Training Hyperparameters

- batch_size: (32, 32)
- num_epochs: (4, 4)
- max_steps: -1
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
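
The `sampling_strategy: oversampling` setting above controls how contrastive training pairs are drawn: same-label pairs get a target similarity of 1.0 and different-label pairs a target of 0.0, with the rarer pair type oversampled until both are equally represented. A rough, pure-Python sketch of that pairing idea (the function name and details are illustrative, not the SetFit implementation):

```python
from itertools import combinations

def make_contrastive_pairs(texts, labels):
    """Pair every two examples: same-label pairs get target similarity 1.0,
    different-label pairs get 0.0; oversample the rarer type to balance."""
    positives, negatives = [], []
    for i, j in combinations(range(len(texts)), 2):
        same = labels[i] == labels[j]
        pair = (texts[i], texts[j], 1.0 if same else 0.0)
        (positives if same else negatives).append(pair)
    # Oversample the smaller list so positives and negatives are balanced.
    if positives and len(positives) < len(negatives):
        k = -(-len(negatives) // len(positives))  # ceiling division
        positives = (positives * k)[: len(negatives)]
    elif negatives and len(negatives) < len(positives):
        k = -(-len(positives) // len(negatives))
        negatives = (negatives * k)[: len(positives)]
    return positives + negatives

pairs = make_contrastive_pairs(
    ["a affects x", "b affects x", "c affects y"], [0, 0, 1]
)
```

These (text, text, target) pairs are what CosineSimilarityLoss consumes when fine-tuning the Sentence Transformer body.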

### Training Results

| Epoch  | Step | Training Loss | Validation Loss |
|:-------|:-----|:--------------|:----------------|
| 0.0006 | 1    | 0.303         | -               |
| 0.0276 | 50   | 0.2825        | -               |
| 0.0553 | 100  | 0.2567        | -               |
| 0.0829 | 150  | 0.2345        | -               |
| 0.1106 | 200  | 0.2347        | -               |
| 0.1382 | 250  | 0.1693        | -               |
| 0.1658 | 300  | 0.0862        | -               |
| 0.1935 | 350  | 0.0184        | -               |
| 0.2211 | 400  | 0.0042        | -               |
| 0.2488 | 450  | 0.0042        | -               |
| 0.2764 | 500  | 0.0263        | -               |
| 0.3040 | 550  | 0.0019        | -               |
| 0.3317 | 600  | 0.0058        | -               |
| 0.3593 | 650  | 0.0095        | -               |
| 0.3870 | 700  | 0.0011        | -               |
| 0.4146 | 750  | 0.0012        | -               |
| 0.4422 | 800  | 0.0009        | -               |
| 0.4699 | 850  | 0.0011        | -               |
| 0.4975 | 900  | 0.001         | -               |
| 0.5252 | 950  | 0.0215        | -               |
| 0.5528 | 1000 | 0.0024        | -               |
| 0.5804 | 1050 | 0.0034        | -               |
| 0.6081 | 1100 | 0.0008        | -               |
| 0.6357 | 1150 | 0.0161        | -               |
| 0.6633 | 1200 | 0.0132        | -               |
| 0.6910 | 1250 | 0.0009        | -               |
| 0.7186 | 1300 | 0.0073        | -               |
| 0.7463 | 1350 | 0.0089        | -               |
| 0.7739 | 1400 | 0.0166        | -               |
| 0.8015 | 1450 | 0.0005        | -               |
| 0.8292 | 1500 | 0.0005        | -               |
| 0.8568 | 1550 | 0.0006        | -               |
| 0.8845 | 1600 | 0.0098        | -               |
| 0.9121 | 1650 | 0.0005        | -               |
| 0.9397 | 1700 | 0.0005        | -               |
| 0.9674 | 1750 | 0.0263        | -               |
| 0.9950 | 1800 | 0.0006        | -               |
| 1.0227 | 1850 | 0.0005        | -               |
| 1.0503 | 1900 | 0.0089        | -               |
| 1.0779 | 1950 | 0.0074        | -               |
| 1.1056 | 2000 | 0.0057        | -               |
| 1.1332 | 2050 | 0.0006        | -               |
| 1.1609 | 2100 | 0.0004        | -               |
| 1.1885 | 2150 | 0.0004        | -               |
| 1.2161 | 2200 | 0.0006        | -               |
| 1.2438 | 2250 | 0.0005        | -               |
| 1.2714 | 2300 | 0.0004        | -               |
| 1.2991 | 2350 | 0.0088        | -               |
| 1.3267 | 2400 | 0.0004        | -               |
| 1.3543 | 2450 | 0.0005        | -               |
| 1.3820 | 2500 | 0.0004        | -               |
| 1.4096 | 2550 | 0.0118        | -               |
| 1.4373 | 2600 | 0.0004        | -               |
| 1.4649 | 2650 | 0.0149        | -               |
| 1.4925 | 2700 | 0.0004        | -               |
| 1.5202 | 2750 | 0.0004        | -               |
| 1.5478 | 2800 | 0.0003        | -               |
| 1.5755 | 2850 | 0.0004        | -               |
| 1.6031 | 2900 | 0.0004        | -               |
| 1.6307 | 2950 | 0.0136        | -               |
| 1.6584 | 3000 | 0.0083        | -               |
| 1.6860 | 3050 | 0.0094        | -               |
| 1.7137 | 3100 | 0.0088        | -               |
| 1.7413 | 3150 | 0.0004        | -               |
| 1.7689 | 3200 | 0.0003        | -               |
| 1.7966 | 3250 | 0.0004        | -               |
| 1.8242 | 3300 | 0.0004        | -               |
| 1.8519 | 3350 | 0.0101        | -               |
| 1.8795 | 3400 | 0.0112        | -               |
| 1.9071 | 3450 | 0.0003        | -               |
| 1.9348 | 3500 | 0.0117        | -               |
| 1.9624 | 3550 | 0.0003        | -               |
| 1.9900 | 3600 | 0.0003        | -               |
| 2.0177 | 3650 | 0.0003        | -               |
| 2.0453 | 3700 | 0.0083        | -               |
| 2.0730 | 3750 | 0.0003        | -               |
| 2.1006 | 3800 | 0.0132        | -               |
| 2.1282 | 3850 | 0.0003        | -               |
| 2.1559 | 3900 | 0.0003        | -               |
| 2.1835 | 3950 | 0.0003        | -               |
| 2.2112 | 4000 | 0.0004        | -               |
| 2.2388 | 4050 | 0.0003        | -               |
| 2.2664 | 4100 | 0.0003        | -               |
| 2.2941 | 4150 | 0.0003        | -               |
| 2.3217 | 4200 | 0.0003        | -               |
| 2.3494 | 4250 | 0.0003        | -               |
| 2.3770 | 4300 | 0.0079        | -               |
| 2.4046 | 4350 | 0.0003        | -               |
| 2.4323 | 4400 | 0.0003        | -               |
| 2.4599 | 4450 | 0.0003        | -               |
| 2.4876 | 4500 | 0.0057        | -               |
| 2.5152 | 4550 | 0.0003        | -               |
| 2.5428 | 4600 | 0.0003        | -               |
| 2.5705 | 4650 | 0.0003        | -               |
| 2.5981 | 4700 | 0.0003        | -               |
| 2.6258 | 4750 | 0.0003        | -               |
| 2.6534 | 4800 | 0.0003        | -               |
| 2.6810 | 4850 | 0.0003        | -               |
| 2.7087 | 4900 | 0.0003        | -               |
| 2.7363 | 4950 | 0.0003        | -               |
| 2.7640 | 5000 | 0.0019        | -               |
| 2.7916 | 5050 | 0.0157        | -               |
| 2.8192 | 5100 | 0.0003        | -               |
| 2.8469 | 5150 | 0.0098        | -               |
| 2.8745 | 5200 | 0.0003        | -               |
| 2.9022 | 5250 | 0.0117        | -               |
| 2.9298 | 5300 | 0.0003        | -               |
| 2.9574 | 5350 | 0.0003        | -               |
| 2.9851 | 5400 | 0.0087        | -               |
| 3.0127 | 5450 | 0.0002        | -               |
| 3.0404 | 5500 | 0.0003        | -               |
| 3.0680 | 5550 | 0.0085        | -               |
| 3.0956 | 5600 | 0.0159        | -               |
| 3.1233 | 5650 | 0.0003        | -               |
| 3.1509 | 5700 | 0.0053        | -               |
| 3.1786 | 5750 | 0.0003        | -               |
| 3.2062 | 5800 | 0.0086        | -               |
| 3.2338 | 5850 | 0.0002        | -               |
| 3.2615 | 5900 | 0.0003        | -               |
| 3.2891 | 5950 | 0.0055        | -               |
| 3.3167 | 6000 | 0.0002        | -               |
| 3.3444 | 6050 | 0.0092        | -               |
| 3.3720 | 6100 | 0.0153        | -               |
| 3.3997 | 6150 | 0.0002        | -               |
| 3.4273 | 6200 | 0.0002        | -               |
| 3.4549 | 6250 | 0.0002        | -               |
| 3.4826 | 6300 | 0.0003        | -               |
| 3.5102 | 6350 | 0.0101        | -               |
| 3.5379 | 6400 | 0.0003        | -               |
| 3.5655 | 6450 | 0.0003        | -               |
| 3.5931 | 6500 | 0.0091        | -               |
| 3.6208 | 6550 | 0.0002        | -               |
| 3.6484 | 6600 | 0.0085        | -               |
| 3.6761 | 6650 | 0.0003        | -               |
| 3.7037 | 6700 | 0.0002        | -               |
| 3.7313 | 6750 | 0.0002        | -               |
| 3.7590 | 6800 | 0.0068        | -               |
| 3.7866 | 6850 | 0.0003        | -               |
| 3.8143 | 6900 | 0.0079        | -               |
| 3.8419 | 6950 | 0.0175        | -               |
| 3.8695 | 7000 | 0.0066        | -               |
| 3.8972 | 7050 | 0.0003        | -               |
| 3.9248 | 7100 | 0.0002        | -               |
| 3.9525 | 7150 | 0.0065        | -               |
| 3.9801 | 7200 | 0.0094        | -               |

## Framework Versions

- Python: 3.10.12
- SetFit: 1.0.3
- Sentence Transformers: 2.7.0
- Transformers: 4.40.1
- PyTorch: 2.2.1+cu121
- Datasets: 2.19.1
- Tokenizers: 0.19.1

## Citation

### BibTeX

```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```