SetFit

This is a SetFit model for text classification. Its classification head is recorded as a NoneType instance.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
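
The contrastive step works on sentence pairs rather than on single sentences. The following is a minimal sketch of how such pairs might be built from a handful of labelled examples. `make_pairs` is a hypothetical helper written for illustration, not the SetFit library's own sampler; the targets 1.0 and 0.0 assume a cosine-similarity style loss, and the texts are toy examples loosely based on the label table in this card.

```python
import random
from itertools import combinations

def make_pairs(examples, seed=42):
    """Build (text_a, text_b, target) triples from (text, label) examples.

    Pairs sharing a label get target 1.0, pairs with different labels 0.0.
    Hypothetical helper for illustration only.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    # Shuffle deterministically so batches mix positive and negative pairs.
    random.Random(seed).shuffle(pairs)
    return pairs

# Toy examples (assumed for illustration, not the actual training data).
examples = [
    ("The pace of the service was outrageously slow.", "service"),
    ("The bartender remembered my preferences.", "service"),
    ("They labeled the dairy-free items clearly.", "dietary"),
    ("It was an acceptable plate of pasta.", "general"),
]
pairs = make_pairs(examples)
```

The sentence encoder is then fine-tuned so that same-label pairs score high similarity and different-label pairs score low; the classification head is trained afterwards on the resulting embeddings.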

Model Details

Model Description

Model Type: SetFit
Classification head: a NoneType instance
Maximum Sequence Length: 512 tokens
Number of Classes: 3 classes

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
general	"Even with the waiter's glowing recommendation, the pasta was just run-of-the-mill." 'It was an acceptable plate of pasta, just lacking that wow factor.' 'There was so much salt on the garlic bread it crunched.'
service	'So impressed by the bartender who remembered my vegan preferences and suggested perfect drinks accordingly.' 'The pace of the service was outrageously slow for a restaurant with no other guests.' 'We got the whole meal for free after telling the manager about the unacceptable wait time.'
dietary	'I ordered gluten-free bread and they brought out the regular kind by mistake.' 'My order went in so fast because the dairy-free options were impossible to miss on the menu.' "What really impressed me wasn't just the food, but how clearly they labeled the dairy-free items."

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub (replace "setfit_model_id" with this model's repository ID)
model = SetFitModel.from_pretrained("setfit_model_id")
# Run inference
preds = model("Way too loud to chat comfortably.")
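
Because the model assigns one label to each text, a common follow-up is to group a batch of texts by predicted label. The sketch below assumes only that `model` is a callable that takes a list of strings and returns one label per string, which is an assumption about how a loaded model behaves on batches. `group_by_label` and the keyword-based `fake_model` are hypothetical stand-ins so the example runs without downloading anything.

```python
from collections import defaultdict

def group_by_label(model, texts):
    """Group texts by the label that `model` predicts for each of them."""
    grouped = defaultdict(list)
    for text, label in zip(texts, model(texts)):
        # str() keeps the keys plain strings whatever type the model returns.
        grouped[str(label)].append(text)
    return dict(grouped)

def fake_model(texts):
    """Keyword-based stand-in for a loaded classifier (illustration only)."""
    labels = []
    for text in texts:
        lowered = text.lower()
        if "gluten" in lowered or "dairy" in lowered:
            labels.append("dietary")
        elif "waiter" in lowered or "service" in lowered:
            labels.append("service")
        else:
            labels.append("general")
    return labels

texts = [
    "Way too loud to chat comfortably.",
    "The waiter forgot our drinks twice.",
    "Plenty of gluten-free choices on the menu.",
]
routed = group_by_label(fake_model, texts)
```

With a real model, `fake_model` would be replaced by the object returned from `SetFitModel.from_pretrained`.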

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	4	12.6582	28

Label	Training Sample Count
dietary	367
service	416
general	399
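
The per-label counts above describe a fairly balanced training set. The short calculation below uses only the counts from the table to get the total and each label's share; the variable names are illustrative.

```python
# Training sample counts copied from the table above.
sample_counts = {"dietary": 367, "service": 416, "general": 399}

total = sum(sample_counts.values())
shares = {label: count / total for label, count in sample_counts.items()}

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
print(f"total: {total}")
```

The total comes to 1182 samples, with each label holding roughly a third of them.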

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (1, 1)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 20
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
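
Several of the values above are tuples. In SetFit's training arguments, a tuple gives one value for the embedding (contrastive) phase and one for the classifier phase. The sketch below makes that reading explicit with a hypothetical `split_phases` helper. It does not call the SetFit library, and the dictionary holds only the tuple-valued hyperparameters from the list above.

```python
def split_phases(value):
    """Return (embedding_phase, classifier_phase) for a scalar or a 2-tuple.

    A scalar is taken to apply to both phases. Hypothetical helper.
    """
    if isinstance(value, tuple):
        embedding, classifier = value
        return embedding, classifier
    return value, value

# Tuple-valued hyperparameters copied from the list above.
hyperparameters = {
    "batch_size": (16, 16),
    "num_epochs": (1, 1),
    "body_learning_rate": (2e-05, 1e-05),
}

by_phase = {name: split_phases(value) for name, value in hyperparameters.items()}
for name, (embedding, classifier) in by_phase.items():
    print(f"{name}: embedding={embedding}, classifier={classifier}")
```

As far as the SetFit documentation describes it, the second body learning rate only applies when a differentiable head is trained end to end, so with end_to_end: False it probably had no effect in this run.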

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0003	1	0.2153	-
0.0169	50	0.1885	-
0.0338	100	0.0788	-
0.0508	150	0.0191	-
0.0677	200	0.0100	-
0.0846	250	0.0057	-
0.1015	300	0.0033	-
0.1184	350	0.0024	-
0.1354	400	0.0020	-
0.1523	450	0.0018	-
0.1692	500	0.0016	-
0.1861	550	0.0016	-
0.2030	600	0.0015	-
0.2200	650	0.0014	-
0.2369	700	0.0014	-
0.2538	750	0.0013	-
0.2707	800	0.0012	-
0.2876	850	0.0012	-
0.3046	900	0.0011	-
0.3215	950	0.0011	-
0.3384	1000	0.0011	-
0.3553	1050	0.0011	-
0.3723	1100	0.0010	-
0.3892	1150	0.0010	-
0.4061	1200	0.0010	-
0.4230	1250	0.0009	-
0.4399	1300	0.0009	-
0.4569	1350	0.0009	-
0.4738	1400	0.0008	-
0.4907	1450	0.0009	-
0.5076	1500	0.0008	-
0.5245	1550	0.0009	-
0.5415	1600	0.0008	-
0.5584	1650	0.0008	-
0.5753	1700	0.0008	-
0.5922	1750	0.0008	-
0.6091	1800	0.0008	-
0.6261	1850	0.0008	-
0.6430	1900	0.0008	-
0.6599	1950	0.0007	-
0.6768	2000	0.0008	-
0.6937	2050	0.0008	-
0.7107	2100	0.0008	-
0.7276	2150	0.0007	-
0.7445	2200	0.0007	-
0.7614	2250	0.0007	-
0.7783	2300	0.0007	-
0.7953	2350	0.0007	-
0.8122	2400	0.0007	-
0.8291	2450	0.0007	-
0.8460	2500	0.0007	-
0.8629	2550	0.0007	-
0.8799	2600	0.0007	-
0.8968	2650	0.0007	-
0.9137	2700	0.0007	-
0.9306	2750	0.0007	-
0.9475	2800	0.0006	-
0.9645	2850	0.0007	-
0.9814	2900	0.0007	-
0.9983	2950	0.0007	-
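
The step counts in this table are consistent with the hyperparameters and sample counts reported earlier. Assuming SetFit generates num_iterations positive and num_iterations negative pairs per training sample (an assumption based on the library's documented behaviour, not something this card states), one epoch has 2 × 20 × 1182 = 47,280 pairs, which is 2,955 steps at batch size 16. The sketch below repeats that arithmetic and compares it with a few rows of the table.

```python
num_samples = 367 + 416 + 399   # training sample counts per label
num_iterations = 20             # from the hyperparameters
batch_size = 16                 # embedding-phase batch size

# Assumption: each sample contributes num_iterations positive and
# num_iterations negative pairs.
num_pairs = 2 * num_iterations * num_samples
steps_per_epoch = num_pairs // batch_size

def epoch_at(step):
    """Fraction of the epoch completed after `step` optimisation steps."""
    return round(step / steps_per_epoch, 4)

for step in (1, 50, 2950):
    print(step, epoch_at(step))
```

The computed fractions for steps 1, 50 and 2950 match the Epoch column (0.0003, 0.0169 and 0.9983), which supports reading the run as a single epoch of about 2,955 contrastive steps.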

Framework Versions

Python: 3.11.15
SetFit: 1.1.3
Sentence Transformers: 5.5.0
Transformers: 5.8.1
PyTorch: 2.11.0+cu130
Datasets: 4.8.5
Tokenizers: 0.22.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Safetensors

Model size: 0.1B params
Tensor type: F32

Paper for ryeyoo/sentimentizer-router: Efficient Few-Shot Learning Without Prompts (arXiv 2209.11055, published Sep 22, 2022)