SetFit with intfloat/multilingual-e5-small

This is a SetFit model for text classification. It uses intfloat/multilingual-e5-small as the Sentence Transformer embedding model and a SetFitHead instance for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
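
The contrastive step works by turning a handful of labeled sentences into many sentence pairs. The following is a minimal sketch of that pair construction in plain Python, assuming the common convention that same-label pairs get a target similarity of 1.0 and different-label pairs get 0.0. The function name `make_contrastive_pairs` and the sample sentences are illustrative and not part of the SetFit API.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples from labeled texts.

    target is 1.0 when both texts share a label, otherwise 0.0.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Hypothetical few-shot data: two examples per class.
texts = [
    "query: What do we do now?",
    "query: Tell me more about that.",
    "query: See you later.",
    "query: I have to go now.",
]
labels = [0, 0, 1, 1]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))  # 6 2 4
```

Four sentences already yield six pairs, which is why the approach can work with few labels. SetFit's own pair sampler may select pairs differently, depending on the configured sampling strategy.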

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: intfloat/multilingual-e5-small
Classification head: a SetFitHead instance
Maximum Sequence Length: 512 tokens
Number of Classes: 2 classes

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
0	'query: Értem. Mit csinálunk most?' 'query: Ola Luca, que tal? Rematache o traballo?' 'query: Lijepo je. Hvala.'
1	'query: Жөнейін, кейін кездесеміз.' 'query: Така, ќе се видиме повторно.' 'query: ठीक है बाद में बात करते हैं मार्क अच्छा दिन'

Evaluation

Metrics

Label	Accuracy
all	0.9333
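
Accuracy is the fraction of predictions that match the reference labels. The sketch below uses made-up labels, since the card does not state the test set size; 14 of the 15 hypothetical predictions are correct.

```python
def accuracy(predictions, references):
    """Return the fraction of predictions equal to their reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Hypothetical labels, for illustration only.
references = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
predictions = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1]

score = accuracy(predictions, references)
print(round(score, 4))  # 14 correct out of 15
```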

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

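from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("thegenerativegeneration/stay_or_go_conversation_classifier_s_v2")
# Run inference; the input carries the same "query: " prefix as the training examples
preds = model("query: Tôi xin lỗi nhưng tôi phải đi")
print(preds)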
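
Every example on this card starts with `query: `, the input prefix that E5 models expect. The following sketch of batch inference adds the prefix when it is missing. The helper names `add_query_prefix` and `classify` are illustrative, and the model ID is supplied by the caller.

```python
QUERY_PREFIX = "query: "


def add_query_prefix(texts):
    """Prepend 'query: ' to each text that does not already start with it."""
    return [t if t.startswith(QUERY_PREFIX) else QUERY_PREFIX + t for t in texts]


def classify(texts, model_id):
    """Load a SetFit model from the Hub and predict a label for each text."""
    # Imported lazily so the prefix helper works without setfit installed.
    from setfit import SetFitModel

    model = SetFitModel.from_pretrained(model_id)
    return model.predict(add_query_prefix(texts))


prepared = add_query_prefix(["Tôi xin lỗi nhưng tôi phải đi", "query: Lijepo je. Hvala."])
print(prepared)
```

Calling `classify(texts, model_id)` returns one prediction per input text.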

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	2	7.2168	25

Label	Training Sample Count
0	346
1	346

Training Hyperparameters

batch_size: (16, 2)
num_epochs: (1, 16)
max_steps: 2500
sampling_strategy: undersampling
body_learning_rate: (1e-06, 1e-06)
head_learning_rate: 0.001
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
run_name: multilingual-e5-small
eval_max_steps: -1
load_best_model_at_end: False
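
Tuple values give separate settings for the two training phases: the first entry applies to the embedding fine-tuning phase, the second to the classifier training phase. The sketch below shows how the listed values could be passed to `setfit.TrainingArguments`. The names `HYPERPARAMETERS` and `build_training_arguments` are illustrative, and the original training script is not part of this card.

```python
# Values copied from the list above. distance_metric is omitted here on the
# assumption that cosine_distance is the library default.
HYPERPARAMETERS = {
    "batch_size": (16, 2),
    "num_epochs": (1, 16),
    "max_steps": 2500,
    "sampling_strategy": "undersampling",
    "body_learning_rate": (1e-06, 1e-06),
    "head_learning_rate": 0.001,
    "margin": 0.25,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "seed": 42,
    "run_name": "multilingual-e5-small",
    "eval_max_steps": -1,
    "load_best_model_at_end": False,
}


def build_training_arguments():
    """Create setfit.TrainingArguments from the values listed on the card."""
    # Lazy imports: only needed when actually training.
    from sentence_transformers.losses import CosineSimilarityLoss
    from setfit import TrainingArguments

    return TrainingArguments(loss=CosineSimilarityLoss, **HYPERPARAMETERS)


embedding_batch, head_batch = HYPERPARAMETERS["batch_size"]
print(embedding_batch, head_batch)  # 16 2
```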

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0002	1	0.3607	-
0.0100	50	0.3634	0.3452
0.0200	100	0.3493	0.3377
0.0300	150	0.3244	0.3234
0.0400	200	0.3244	0.3034
0.0500	250	0.2931	0.2731
0.0600	300	0.2471	0.2398
0.0700	350	0.237	0.2168
0.0800	400	0.1964	0.2082
0.0900	450	0.2319	0.198
0.1000	500	0.2003	0.1968
0.1100	550	0.2014	0.1968
0.1200	600	0.1617	0.1879
0.1300	650	0.2214	0.1798
0.1400	700	0.2498	0.1768
0.1500	750	0.1527	0.1764
0.1600	800	0.1134	0.1733
0.1700	850	0.1393	0.1614
0.1800	900	0.1052	0.1549
0.1900	950	0.1772	0.149
0.2000	1000	0.1065	0.1504
0.2100	1050	0.087	0.1392
0.2200	1100	0.1416	0.1333
0.2300	1150	0.0767	0.1279
0.2400	1200	0.1228	0.1243
0.2500	1250	0.099	0.1128
0.2599	1300	0.1125	0.1106
0.2699	1350	0.1012	0.1156
0.2799	1400	0.0343	0.1022
0.2899	1450	0.0814	0.1012
0.2999	1500	0.0947	0.0965
0.3099	1550	0.0799	0.0964
0.3199	1600	0.113	0.0942
0.3299	1650	0.1125	0.0917
0.3399	1700	0.0507	0.0899
0.3499	1750	0.0986	0.0938
0.3599	1800	0.0885	0.0913
0.3699	1850	0.0712	0.0841
0.3799	1900	0.1131	0.0851
0.3899	1950	0.0701	0.0852
0.3999	2000	0.0805	0.0878
0.4099	2050	0.0375	0.0814
0.4199	2100	0.1236	0.0797
0.4299	2150	0.0532	0.0881
0.4399	2200	0.0265	0.0806
0.4499	2250	0.1268	0.0801
0.4599	2300	0.0557	0.0797
0.4699	2350	0.0956	0.0832
0.4799	2400	0.0671	0.081
0.4899	2450	0.1394	0.0794
0.4999	2500	0.1165	0.0798
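
Validation loss flattens near 0.08 over the last few hundred steps. The sketch below picks the step with the lowest validation loss from the final ten rows of the table above; the variable names are illustrative.

```python
# (step, training_loss, validation_loss) copied from the last ten table rows.
rows = [
    (2050, 0.0375, 0.0814),
    (2100, 0.1236, 0.0797),
    (2150, 0.0532, 0.0881),
    (2200, 0.0265, 0.0806),
    (2250, 0.1268, 0.0801),
    (2300, 0.0557, 0.0797),
    (2350, 0.0956, 0.0832),
    (2400, 0.0671, 0.0810),
    (2450, 0.1394, 0.0794),
    (2500, 0.1165, 0.0798),
]

best_step, _, best_val_loss = min(rows, key=lambda row: row[2])
final_step, _, final_val_loss = rows[-1]

print(best_step, best_val_loss)    # 2450 0.0794
print(final_step, final_val_loss)  # 2500 0.0798
```

Because load_best_model_at_end is False, the saved model should correspond to the final step rather than the lowest-loss step. The difference in validation loss between the two is 0.0004.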

Framework Versions

Python: 3.10.11
SetFit: 1.0.3
Sentence Transformers: 2.7.0
Transformers: 4.39.3
PyTorch: 2.4.0
Datasets: 2.20.0
Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Safetensors

Model ID: thegenerativegeneration/stay_or_go_conversation_classifier_s_v2
Model size: 0.1B params
Tensor type: F32
Base model: intfloat/multilingual-e5-small

Evaluation results

Accuracy on an unknown test set (self-reported): 0.933