SetFit with sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a SetFit model for text classification. It assigns short receipt line items, mostly in Polish, to one of eight spending categories. It uses sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
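
The first step relies on labelled sentence pairs rather than single examples. The sketch below shows one way such pairs could be built from a handful of labelled texts: pairs sharing a label get target 1.0, all others get 0.0. The helper name `make_contrastive_pairs` is hypothetical, and the exhaustive enumeration is an assumption made for clarity; SetFit itself samples pairs rather than listing every combination.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples: 1.0 for same label, 0.0 otherwise."""
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


# Toy data borrowed from the Model Labels table of this card.
texts = ["Mleko 3.2% Łaciate", "Chleb wiejski krojony", "Piwo Tyskie 0.5L", "PIWO ZYWIEC PUSZKA"]
labels = ["Groceries", "Groceries", "Alcohol and stimulants", "Alcohol and stimulants"]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

Even four labelled texts yield six training pairs, which is why this approach can work with only a few examples per class.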

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 128 tokens
Number of Classes: 8

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
Groceries	'Mleko 3.2% Łaciate' 'MLEKO UHT 1,5%' 'Chleb wiejski krojony'
Alcohol and stimulants	'Piwo Tyskie 0.5L' 'PIWO ZYWIEC PUSZKA' 'PIWO DESPERADOS 4PAK'
Household and chemistry	'Domestos 1L' 'WC KRET ZEL' 'Papier toaletowy 8 rolek'
Cosmetics	'Szampon Head&Shoulders' 'SZAMPON DO WLOSOW' 'Żel pod prysznic Nivea'
Entertainment	'Bilet do kina' 'BILET NORMALNY 2D' 'Gra na PS5 FIFA'
Taxes and fees	'Opłata recyklingowa' 'OPL. RECYKLINGOWA' 'Koszt dostawy'
Transport	'Bilet autobusowy 20min' 'BILET MPK ULGOWY' 'Bilet tramwajowy'
Other	'Torba foliowa' 'REKLAMOWKA MALA' 'TORBA PAPIEROWA DUZA'

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Johnyyy123/smart-receipt-categorizer-v1")
# Run inference on a single receipt line
preds = model("Kebab w bułce")
print(preds)
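
A receipt usually has several lines, so it can help to classify them in one batch and group the results. The following is a minimal sketch under stated assumptions: `group_by_category` and `fake_predict` are hypothetical names, and `fake_predict` is a keyword-based stand-in so the example runs without downloading the model.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


def group_by_category(
    items: Sequence[str],
    predict: Callable[[List[str]], Sequence[str]],
) -> Dict[str, List[str]]:
    """Group receipt lines by the label the classifier assigns to each."""
    labels = predict(list(items))
    grouped: Dict[str, List[str]] = defaultdict(list)
    for item, label in zip(items, labels):
        grouped[str(label)].append(item)
    return dict(grouped)


# Stand-in classifier (an assumption, not the real model).
def fake_predict(batch):
    return ["Alcohol and stimulants" if "piwo" in text.lower() else "Other" for text in batch]


receipt = ["Piwo Tyskie 0.5L", "Torba foliowa", "PIWO ZYWIEC PUSZKA"]
print(group_by_category(receipt, fake_predict))
```

With the real model, `model.predict` could presumably be passed as `predict`, assuming it returns one label per input string when given a list.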

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	1	2.6271	4

Label	Training Sample Count
Alcohol and stimulants	23
Cosmetics	20
Entertainment	17
Groceries	33
Household and chemistry	23
Other	29
Taxes and fees	14
Transport	18
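
The counts above can be cross-checked with a little arithmetic. The sketch below sums them, measures the class imbalance, and multiplies the total by the 2.6271 figure from the word-count row. The product lands almost exactly on a whole number of words, which suggests that figure is an arithmetic average over all samples.

```python
counts = {
    "Alcohol and stimulants": 23,
    "Cosmetics": 20,
    "Entertainment": 17,
    "Groceries": 33,
    "Household and chemistry": 23,
    "Other": 29,
    "Taxes and fees": 14,
    "Transport": 18,
}

total = sum(counts.values())
largest = max(counts, key=counts.get)
smallest = min(counts, key=counts.get)
imbalance = counts[largest] / counts[smallest]

# Total word count implied by the 2.6271 word-count figure.
implied_words = 2.6271 * total

print(total, largest, smallest, round(imbalance, 2), round(implied_words))
```

The training set therefore holds 177 samples, with the largest class a little over twice the size of the smallest.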

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (2, 2)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 40
body_learning_rate: (2e-05, 2e-05)
head_learning_rate: 2e-05
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
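
The `loss: CosineSimilarityLoss` setting means the embedding body is trained so that the cosine similarity of each sentence pair approaches its target (1.0 for same-label pairs, 0.0 otherwise), scored by mean squared error. The pure-Python sketch below illustrates that computation on hypothetical 2-D embeddings; the real loss operates on batched tensors from the Sentence Transformer.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and the target of each pair."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


# Hypothetical embeddings, chosen so the errors are easy to read off.
pairs = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # same label, same direction: error 0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # different label, orthogonal: error 0
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # different label, same direction: error 1
]
print(cosine_similarity_loss(pairs))
```

As far as the SetFit documentation describes it, `distance_metric` and `margin` apply to triplet-style losses, so they would presumably have no effect with this loss.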

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0011	1	0.4056	-
0.0565	50	0.2709	-
0.1130	100	0.2476	-
0.1695	150	0.2203	-
0.2260	200	0.1902	-
0.2825	250	0.1536	-
0.3390	300	0.1149	-
0.3955	350	0.0803	-
0.4520	400	0.0546	-
0.5085	450	0.0329	-
0.5650	500	0.0186	-
0.6215	550	0.0080	-
0.6780	600	0.0032	-
0.7345	650	0.0025	-
0.7910	700	0.0020	-
0.8475	750	0.0012	-
0.9040	800	0.0013	-
0.9605	850	0.0011	-
1.0169	900	0.0010	-
1.0734	950	0.0009	-
1.1299	1000	0.0008	-
1.1864	1050	0.0007	-
1.2429	1100	0.0007	-
1.2994	1150	0.0007	-
1.3559	1200	0.0006	-
1.4124	1250	0.0005	-
1.4689	1300	0.0005	-
1.5254	1350	0.0006	-
1.5819	1400	0.0005	-
1.6384	1450	0.0005	-
1.6949	1500	0.0005	-
1.7514	1550	0.0005	-
1.8079	1600	0.0005	-
1.8644	1650	0.0005	-
1.9209	1700	0.0004	-
1.9774	1750	0.0004	-
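
The Epoch and Step columns can be reconciled with the hyperparameters. The sketch below assumes that each of the 40 iterations contributes one positive and one negative pair per training sample, and that the 177 samples are the sum of the per-label counts. Under that assumption the numbers reproduce the epoch fractions in the table.

```python
import math

num_samples = 177    # sum of the per-label training sample counts
num_iterations = 40  # from the hyperparameters
batch_size = 16      # embedding-phase batch size
num_epochs = 2

# Assumption: one positive and one negative pair per sample per iteration.
num_pairs = num_samples * num_iterations * 2
steps_per_epoch = math.ceil(num_pairs / batch_size)
total_steps = steps_per_epoch * num_epochs


def epoch_at(step):
    """Fractional epoch reached after the given number of optimizer steps."""
    return round(step / steps_per_epoch, 4)


print(num_pairs, steps_per_epoch, total_steps)
print(epoch_at(1), epoch_at(50), epoch_at(900), epoch_at(1750))
```

This gives 885 steps per epoch and 1770 steps in total, so the last logged row (step 1750) sits just before the end of the second epoch.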

Framework Versions

Python: 3.10.15
SetFit: 1.1.3
Sentence Transformers: 5.1.2
Transformers: 4.57.3
PyTorch: 2.9.1+cu128
Datasets: 4.4.1
Tokenizers: 0.22.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
