SetFit with mini1013/master_domain

This is a SetFit model for text classification. It uses mini1013/master_domain as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
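The first step above trains on sentence pairs rather than single sentences. The following is a minimal sketch of how such pairs could be built from labeled texts, assuming every pair of training texts is enumerated once; SetFit's own sampler behaves differently (it samples and can oversample pairs), and the helper name make_contrastive_pairs is hypothetical. The three titles are taken from the label examples in this card.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    This exhaustive enumeration is an illustration only, not SetFit's sampler.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


# Two titles with label 8.0 and one with label 1.0, copied from this card
texts = [
    "오뚜기 고등어갈치조림양념120g 제이디(JD)",
    "청우식품 이음식 스지사태조림 200g 푸드뱅크(주)",
    "참 맛좋은 하진 반달 단무지 2.5kg 농업회사법인 봉농주식회사",
]
labels = [8.0, 8.0, 1.0]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(target, "|", text_a, "|", text_b)
```

Pairs with target 1.0 pull the two embeddings together during fine-tuning, and pairs with target 0.0 push them apart. The classification head is then fit on embeddings from the fine-tuned body.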

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: mini1013/master_domain
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 9

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
6.0	'본죽 미니장조림 2박스 70gx5개입x2 셜크' '[본죽]쇠고기 장조림 300g (냉장 소고기 반찬 점심 저녁 도시락 어린이 아기반찬) 순수본 주식회사' '본죽 쇠고기 장조림 170g x 4 5. 비비고 육개장 500g x 5개 감성주머니'
1.0	'일가집 일미 쫄깃 치자 단무지 1kg 두부 날치알 피클 일가집 일미 고추지 1kg 고추절임 고추장아찌 머치바잉' '일가집 일미 쫄깃 치자 단무지 1kg 두부 날치알 피클 일가집 일미 깐마늘 1kg 양파 다진마늘 청양 머치바잉' '참 맛좋은 하진 반달 단무지 2.5kg 농업회사법인 봉농주식회사'
5.0	'진 명이나물(실속형) 10kg 대용량 업소용 식당 반찬 장아찌 05 유림 명이나물 10kg (유) 협동맛사랑식품' '단풍콩잎 500g 양념 장아찌 국내제조 콩잎김치 삭힌 국산 갈치속젓 500g 사계절반찬' '군산 울외장아찌 2kg 나라즈케 나라스케 술지게미 2.무 장아찌 2kg 주식회사 백년부엌'
2.0	'마늘쫑무침 4kg 대용량 식당 업소용 반찬 무침 장아찌 (유) 협동맛사랑식품' '[서울,성남 ] 푸릇푸릇 시금치무침 300g [암사 우리집반찬] 주식회사 프레시멘토' '[주문폭주] 농가살리기 30년 전통 통영할매 원조 생굴무침 330g 생굴무침 330g 1통 주식회사 청년농부들'
8.0	'일본식 반찬대용 츠쿠다니 김조림 180g 서울타임즈' '오뚜기 고등어갈치조림양념120g 제이디(JD)' '청우식품 이음식 스지사태조림 200g 푸드뱅크(주)'
4.0	'[종가집]종가집 오징어채볶음 60g 에스케이스토아주식회사' '[반찬가게 찬장]신선한재료 당일제조 배송 고사리볶음 가정식 반찬 집밥 나물/무침/볶음 배달 밑반찬_건파래무침 주식회사 찬장에프에스대전' '청정원 종가집 견과류 멸치볶음 60G 조은마켓'
7.0	'종가집 옛맛 무말랭이 1kg x 2개 더빈(THE BIN)' '반찬단지 마늘쫑무침 1kg 아삭 마늘장아찌 반찬거리 와이엘플래닛' '가을무를 말려 쫄깃하고 달큰한 국산 무말랭이 1kg 1. 국산 무말랭이 1kg 주식회사 태극인 농업회사법인'
0.0	'씨제이 비비고 오징어채 볶음 55g 아이스박스 포장 (주)씨티케이이비전코리아' '매운 고추부각 튀각 30g 6봉 티각태각 속초 명품 특산물 김부각30g 6봉 엠앤엠컴퍼니' '대구 반고개 무침회 똘똘이식당 납작만두 오징어 회무침 캠핑 밀키트 무침회세트(중)_보통맛 대구 똘똘이 무침회'
3.0	'미자언니네 밑반찬 하얀콩강정 120g 1팩 미자언니네 하얀콩강정 에센셜키친' '[메인반찬 국 찌개 김치 세트] 건강한 반찬 이기는면역찬 메인반찬_계란말이 이기는면역찬(서초점)' '[본죽] 밑반찬 5종 세트(진미채볶음 멸치볶음 깻잎무침 무말랭이 궁채절임) 메가글로벌001'

Evaluation

Metrics

Label	Metric
all	0.9102

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("mini1013/master_cate_fd9")
# Run inference on a single product title and print the predicted label
preds = model("본죽 쇠고기 장조림 170g x 4  마이엘(Maiel)")
print(preds)
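For more than one title, the model's predict method accepts a list of strings. The sketch below pairs each title with its predicted label. The helper names load_model and classify_titles are hypothetical, and the conversion with float() assumes numeric labels such as the 0.0 to 8.0 values listed in this card.

```python
def load_model(model_id="mini1013/master_cate_fd9"):
    """Load the model from the Hub (requires setfit and network access)."""
    # Imported lazily so classify_titles can be used with any object
    # that exposes a predict(list_of_str) method.
    from setfit import SetFitModel

    return SetFitModel.from_pretrained(model_id)


def classify_titles(model, titles):
    """Return a list of (title, predicted_label) tuples."""
    preds = model.predict(titles)
    # Assumption: labels are numeric, as in this card's label table.
    return [(title, float(label)) for title, label in zip(titles, preds)]


# Usage (downloads the model, so it is left commented out here):
# model = load_model()
# for title, label in classify_titles(model, ["본죽 쇠고기 장조림 170g x 4  마이엘(Maiel)"]):
#     print(label, title)
```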

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	3	10.1981	21
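The card does not state how the word counts were computed. The sketch below assumes whitespace-separated tokens and computes the same three statistics for a handful of titles copied from the label examples; word_count_stats is a hypothetical helper.

```python
from statistics import median


def word_count_stats(texts):
    """Return (min, median, max) of whitespace-separated word counts."""
    counts = [len(text.split()) for text in texts]
    return min(counts), median(counts), max(counts)


# Three titles copied from the label examples in this card
titles = [
    "오뚜기 고등어갈치조림양념120g 제이디(JD)",
    "일본식 반찬대용 츠쿠다니 김조림 180g 서울타임즈",
    "청우식품 이음식 스지사태조림 200g 푸드뱅크(주)",
]

print(word_count_stats(titles))  # (3, 5, 6)
```

A median of integer counts is always an integer or a half-integer, so the reported 10.1981 may be a mean rather than a median; the card does not say which.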

Label	Training Sample Count
0.0	50
1.0	42
2.0	22
3.0	50
4.0	50
5.0	50
6.0	50
7.0	50
8.0	50

Training Hyperparameters

batch_size: (512, 512)
num_epochs: (20, 20)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 40
body_learning_rate: (2e-05, 2e-05)
head_learning_rate: 2e-05
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
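The loss listed above, CosineSimilarityLoss, is the Sentence Transformers loss that takes the mean squared error between the cosine similarity of two embeddings and a target similarity. The following is a minimal pure-Python sketch of that computation on toy 2-dimensional vectors (the vectors and function names are illustrative assumptions, not values from training). It does not use the margin or distance_metric settings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and the target.

    pairs is a list of (embedding_a, embedding_b, target) triples, where
    target is 1.0 for a same-label pair and 0.0 for a different-label pair.
    """
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


# Toy embeddings: identical vectors labeled similar, orthogonal ones dissimilar
well_separated = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),
    ([1.0, 0.0], [0.0, 1.0], 0.0),
]
print(cosine_similarity_loss(well_separated))  # 0.0
```

The loss reaches zero when same-label embeddings point in the same direction and different-label embeddings are orthogonal.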

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0154	1	0.4845	-
0.7692	50	0.2975	-
1.5385	100	0.0992	-
2.3077	150	0.0418	-
3.0769	200	0.0246	-
3.8462	250	0.0358	-
4.6154	300	0.0185	-
5.3846	350	0.0123	-
6.1538	400	0.0121	-
6.9231	450	0.0008	-
7.6923	500	0.0003	-
8.4615	550	0.0002	-
9.2308	600	0.0001	-
10.0	650	0.0001	-
10.7692	700	0.0001	-
11.5385	750	0.0002	-
12.3077	800	0.0001	-
13.0769	850	0.0001	-
13.8462	900	0.0001	-
14.6154	950	0.0001	-
15.3846	1000	0.0001	-
16.1538	1050	0.0001	-
16.9231	1100	0.0001	-
17.6923	1150	0.0001	-
18.4615	1200	0.0001	-
19.2308	1250	0.0001	-
20.0	1300	0.0001	-
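The step counts in this table can be cross-checked against the sample counts and hyperparameters listed earlier. The sketch assumes that each of the num_iterations passes produces one positive and one negative pair per training sample; under that assumption the arithmetic reproduces the 65 steps per epoch and 1300 total steps shown in the table.

```python
import math

# Training sample counts per label, from the Training Set Metrics table
sample_counts = {0: 50, 1: 42, 2: 22, 3: 50, 4: 50, 5: 50, 6: 50, 7: 50, 8: 50}
num_samples = sum(sample_counts.values())  # 414

# Hyperparameters from this card
num_iterations = 40
batch_size = 512
num_epochs = 20

# Assumption: two pairs (one positive, one negative) per sample per iteration
num_pairs = 2 * num_iterations * num_samples          # 33120
steps_per_epoch = math.ceil(num_pairs / batch_size)   # 65
total_steps = steps_per_epoch * num_epochs            # 1300

print(num_samples, num_pairs, steps_per_epoch, total_steps)
# Epoch value logged at step 50, as in the table
print(round(50 / steps_per_epoch, 4))  # 0.7692
```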

Framework Versions

Python: 3.10.12
SetFit: 1.1.0.dev0
Sentence Transformers: 3.1.1
Transformers: 4.46.1
PyTorch: 2.4.0+cu121
Datasets: 2.20.0
Tokenizers: 0.20.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model Size and Format

Model size: 0.1B params
Tensor type: F32
Weights format: Safetensors

Model Tree

Base model: klue/roberta-base
Fine-tuned from the base model: mini1013/master_domain
Fine-tuned from mini1013/master_domain: mini1013/master_cate_fd9 (this model)


Evaluation Results

Metric (unnamed) on an unknown dataset, test set, self-reported: 0.910