Commit 97158ec (verified) by Henniina · 1 Parent(s): 9155565

Push model using huggingface_hub.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
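
The config above enables only `pooling_mode_mean_tokens`, so each sentence embedding is the average of the 768-dimensional token vectors. The following is a minimal pure-Python sketch of masked mean pooling, not the library's code: the function name `mean_pool` and the toy 3-dimensional vectors are illustrative assumptions, and the real implementation is the `sentence_transformers.models.Pooling` module referenced in `modules.json`.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask is 1; padding positions (mask 0) are ignored."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    # Guard against an all-padding input to avoid division by zero.
    count = max(len(kept), 1)
    return [sum(vec[i] for vec in kept) / count for i in range(dim)]


# Toy example: three real tokens followed by one padding position.
tokens = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [2.0, 2.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
```

The padding vector is excluded by the mask, so it does not shift the average.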
README.md ADDED
@@ -0,0 +1,264 @@
+ ---
+ tags:
+ - setfit
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ widget:
+ - text: Teemu Karhu menetkö noin vaan takuuseen, ettei sodan johdosta näin käy? Ite
+     en kyllä menis 100% sanomaan mitään mihin liittyy Putin ja Putinin sota
+ - text: Kohta on lisää lapsia sairaalassa koronan vuoksi ☹
+ - text: Pirjo Rajakangas pyöräily sekä kävely ovat hyvää liikuntaa
+ - text: Hanna-Leena Lahti Niin.. Nuo todelliset tartunyamäärät voivat olla ihan mitä
+     tahansa. Mihinkään rajoitustoimiin ei tarvitsisi ryhtyä. Ihmiset voivat itse pitää
+     huolta itsestää, ja valtion tehtävä on pitää huolta siitä että hoitokapasiteetti
+     riittää. Tällä hetkellä meillä ei ole mitään hätää. Koko Suomessa tehohoidossa
+     koronan vuoksi on noin 2p ihmistä. Tehohoitopaikkoja siis riittää vielä vaikka
+     ja kuinka jos tarvetta. Korostan, että edelleenkin ovat turvavälit, hyvä hygienia
+     ja turhien kontaktien välttäminen kaikkein tärkeintä. Mitään ei tarvitsisi rajoittaa,
+     jollei ihmiset olisi niin helvetin tyhmiä, että osaisivat ajatella ihan omilla
+     aivoillaan, eikä valtion tarvitsisi heitä opastaa kädestä pitäen kuten jotain
+     pieniä lapsia.
+ - text: Mika, hallituksella pitää kuitenkin olla jokin pohja johon perustavat päätöksensä.
+     Poikkeustilaa ei voi loputtomiin jatkaa vain mutulla, jolloin heidän on kuunneltava
+     aiheen ammattilaisia.
+ metrics:
+ - metric
+ pipeline_tag: text-classification
+ library_name: setfit
+ inference: true
+ base_model: TurkuNLP/bert-base-finnish-cased-v1
+ model-index:
+ - name: SetFit with TurkuNLP/bert-base-finnish-cased-v1
+   results:
+   - task:
+       type: text-classification
+       name: Text Classification
+     dataset:
+       name: Unknown
+       type: unknown
+       split: test
+     metrics:
+     - type: metric
+       value: 0.7905009479595114
+       name: Metric
+ ---
+
+ # SetFit with TurkuNLP/bert-base-finnish-cased-v1
+
+ This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [TurkuNLP/bert-base-finnish-cased-v1](https://huggingface.co/TurkuNLP/bert-base-finnish-cased-v1) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
+
+ The model has been trained using an efficient few-shot learning technique that involves:
+
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SetFit
+ - **Sentence Transformer body:** [TurkuNLP/bert-base-finnish-cased-v1](https://huggingface.co/TurkuNLP/bert-base-finnish-cased-v1)
+ - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
+ - **Maximum Sequence Length:** 512 tokens
+ - **Number of Classes:** 2 classes
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+
+ ### Model Labels
+ | Label | Examples |
+ |:------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | 0 | <ul><li>'Christoffer Högström miten luulet tilanteen parantuneen kun sairaala- ja tehohoito potilaiden määrä on vain kasvanut silloisesta?\nOlet niin totaalisen puusilmäinen ja hallirusvihan vallassa, että tätä on turha jatkaa pitemmälle. Pysy terveenä ja rauhallista joulua!'</li><li>'"Hylkiö" unionin toimesta johon ei kuulu.'</li><li>'Pirjo-Liisa Almonkari-Kuikka en nyt varsinaisesti pelkästään tuota aihetta tarkoittanutkaan. Sekin on kuitenkin vähintään kyseenalaista, koska kyseessä ei ole valmis tuote, vaan hätämyyntiluvalla käytössä oleva ruiske, ja sen seurauksena on niinikään perusoikeudellinen terveydenhuollon taso turvaamattomalla tasolla.'</li></ul> |
+ | 1 | <ul><li>'Niko Korpela perustuslakia ei ole rikottu niissä asioissa mitä convoypellet väitti, kaikki mitä kaverit väittää ei ole totta .'</li><li>'Mikään ei ole niin varmaa kuin epävarma. KUKAAN ei millään voi tietää mitä tapahtuu koronan tai ylipäätään minkään suhteen. Joka muuta väittää on typerys...'</li><li>'Wallenius, kaupunginvaltuutettu Ei tietenkään. Hyvinhän me voimme itsekin tuottaa maakaasua ja raakaöljyä. Eikös?'</li></ul> |
+
+ ## Evaluation
+
+ ### Metrics
+ | Label | Metric |
+ |:--------|:-------|
+ | **all** | 0.7905 |
+
+ ## Uses
+
+ ### Direct Use for Inference
+
+ First install the SetFit library:
+
+ ```bash
+ pip install setfit
+ ```
+
+ Then you can load this model and run inference.
+
+ ```python
+ from setfit import SetFitModel
+
+ # Download from the 🤗 Hub
+ model = SetFitModel.from_pretrained("Finnish-actions/SetFit-FinBERT1-Avg-challenge")
+ # Run inference
+ preds = model("Kohta on lisää lapsia sairaalassa koronan vuoksi ☹")
+ ```
+
+ <!--
+ ### Downstream Use
+
+ *List how someone could finetune this model on their own dataset.*
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Set Metrics
+ | Training set | Min | Median | Max |
+ |:-------------|:----|:--------|:----|
+ | Word count | 1 | 19.9323 | 213 |
+
+ | Label | Training Sample Count |
+ |:------|:----------------------|
+ | 0 | 754 |
+ | 1 | 88 |
+
+ ### Training Hyperparameters
+ - batch_size: (16, 16)
+ - num_epochs: (4, 4)
+ - max_steps: -1
+ - sampling_strategy: oversampling
+ - num_iterations: 6
+ - body_learning_rate: (2e-05, 1e-05)
+ - head_learning_rate: 0.01
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: False
+ - warmup_proportion: 0.1
+ - l2_weight: 0.01
+ - seed: 42
+ - evaluation_strategy: epoch
+ - eval_max_steps: -1
+ - load_best_model_at_end: False
+
+ ### Training Results
+ | Epoch | Step | Training Loss | Validation Loss |
+ |:------:|:----:|:-------------:|:---------------:|
+ | 0.0016 | 1 | 0.2574 | - |
+ | 0.0791 | 50 | 0.2818 | - |
+ | 0.1582 | 100 | 0.2458 | - |
+ | 0.2373 | 150 | 0.2051 | - |
+ | 0.3165 | 200 | 0.116 | - |
+ | 0.3956 | 250 | 0.0396 | - |
+ | 0.4747 | 300 | 0.0096 | - |
+ | 0.5538 | 350 | 0.0027 | - |
+ | 0.6329 | 400 | 0.0008 | - |
+ | 0.7120 | 450 | 0.0003 | - |
+ | 0.7911 | 500 | 0.0008 | - |
+ | 0.8703 | 550 | 0.0003 | - |
+ | 0.9494 | 600 | 0.0002 | - |
+ | 1.0 | 632 | - | 0.4312 |
+ | 1.0285 | 650 | 0.0001 | - |
+ | 1.1076 | 700 | 0.0001 | - |
+ | 1.1867 | 750 | 0.0001 | - |
+ | 1.2658 | 800 | 0.0001 | - |
+ | 1.3449 | 850 | 0.0001 | - |
+ | 1.4241 | 900 | 0.0001 | - |
+ | 1.5032 | 950 | 0.0001 | - |
+ | 1.5823 | 1000 | 0.0003 | - |
+ | 1.6614 | 1050 | 0.0001 | - |
+ | 1.7405 | 1100 | 0.0001 | - |
+ | 1.8196 | 1150 | 0.0001 | - |
+ | 1.8987 | 1200 | 0.0 | - |
+ | 1.9778 | 1250 | 0.0001 | - |
+ | 2.0 | 1264 | - | 0.4345 |
+ | 2.0570 | 1300 | 0.0 | - |
+ | 2.1361 | 1350 | 0.0 | - |
+ | 2.2152 | 1400 | 0.0 | - |
+ | 2.2943 | 1450 | 0.0 | - |
+ | 2.3734 | 1500 | 0.0 | - |
+ | 2.4525 | 1550 | 0.0 | - |
+ | 2.5316 | 1600 | 0.0 | - |
+ | 2.6108 | 1650 | 0.0 | - |
+ | 2.6899 | 1700 | 0.0001 | - |
+ | 2.7690 | 1750 | 0.0 | - |
+ | 2.8481 | 1800 | 0.0 | - |
+ | 2.9272 | 1850 | 0.0001 | - |
+ | 3.0 | 1896 | - | 0.4053 |
+ | 3.0063 | 1900 | 0.0003 | - |
+ | 3.0854 | 1950 | 0.0 | - |
+ | 3.1646 | 2000 | 0.0 | - |
+ | 3.2437 | 2050 | 0.0 | - |
+ | 3.3228 | 2100 | 0.0 | - |
+ | 3.4019 | 2150 | 0.0 | - |
+ | 3.4810 | 2200 | 0.0 | - |
+ | 3.5601 | 2250 | 0.0 | - |
+ | 3.6392 | 2300 | 0.0 | - |
+ | 3.7184 | 2350 | 0.0 | - |
+ | 3.7975 | 2400 | 0.0 | - |
+ | 3.8766 | 2450 | 0.0 | - |
+ | 3.9557 | 2500 | 0.0 | - |
+ | 4.0 | 2528 | - | 0.4416 |
+
+ ### Framework Versions
+ - Python: 3.11.9
+ - SetFit: 1.1.3
+ - Sentence Transformers: 3.2.0
+ - Transformers: 4.44.0
+ - PyTorch: 2.4.0+cu124
+ - Datasets: 2.21.0
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+     doi = {10.48550/ARXIV.2209.11055},
+     url = {https://arxiv.org/abs/2209.11055},
+     author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+     keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+     title = {Efficient Few-Shot Learning Without Prompts},
+     publisher = {arXiv},
+     year = {2022},
+     copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
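
The Training Results table in the README logs a validation loss at steps 632, 1264, 1896 and 2528. Those step counts can be reproduced from the card's own figures under one assumption the card does not state: that with `num_iterations: 6` every training sample contributes one positive and one negative sentence pair per iteration. A rough check, with variable names chosen here for illustration:

```python
import math

train_samples = 754 + 88   # label 0 + label 1 counts from the README
num_iterations = 6
batch_size = 16            # first entry of batch_size (16, 16), the embedding phase
num_epochs = 4             # first entry of num_epochs (4, 4)

# Assumption: one positive and one negative pair per sample per iteration.
pairs = train_samples * num_iterations * 2
steps_per_epoch = math.ceil(pairs / batch_size)
total_steps = steps_per_epoch * num_epochs

# Fractional epoch reported in the table for a given step, e.g. step 50.
epoch_at_step_50 = round(50 / steps_per_epoch, 4)
```

Under that assumption the numbers line up with the table: 10104 pairs give 632 steps per epoch and 2528 steps overall, and step 50 falls at epoch 0.0791.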
config.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "_name_or_path": "TurkuNLP/bert-base-finnish-cased-v1",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.44.0",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 50105
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.2.0",
+     "transformers": "4.44.0",
+     "pytorch": "2.4.0+cu124"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
config_setfit.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "normalize_embeddings": false,
+   "labels": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f51457a73ca7ff7a2e8a819ae744336d80f5e917e3c064b4edeb5475b3a05e4a
+ size 498110312
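
The `size` in this pointer can be sanity-checked against `config.json`. The sketch below counts the parameters of a standard BERT encoder from the config values and multiplies by 4 bytes for `float32`. It assumes the stored `BertModel` keeps its default pooler layer and that the few remaining bytes are file-format overhead such as the safetensors header; neither point is stated in the commit.

```python
# Values copied from config.json in this commit.
vocab_size = 50105
hidden = 768
intermediate = 3072
layers = 12
max_positions = 512
type_vocab = 2


def linear(n_in: int, n_out: int) -> int:
    """Parameters of a dense layer: weight matrix plus bias."""
    return n_in * n_out + n_out


layer_norm = 2 * hidden  # scale and shift vectors

# Word, position and token-type embeddings plus the embedding LayerNorm.
embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
encoder_layer = (
    4 * linear(hidden, hidden)        # query, key, value, attention output
    + layer_norm
    + linear(hidden, intermediate)    # feed-forward up-projection
    + linear(intermediate, hidden)    # feed-forward down-projection
    + layer_norm
)
pooler = linear(hidden, hidden)       # assumption: default pooler is present

total_params = embeddings + layers * encoder_layer + pooler
tensor_bytes = total_params * 4       # float32, per "torch_dtype"
pointer_size = 498110312              # "size" field of the LFS pointer above
overhead = pointer_size - tensor_bytes
```

The estimate comes to about 124.5 million parameters, leaving roughly 22 KB unaccounted for, which is consistent with the stated assumptions.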
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d950e54928cb7ce04995e8b52c26b84956e0333334e460a6338418f7c08f78a4
+ size 7007
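
Both binary files are committed as Git LFS pointers: three `key value` lines giving the spec version, the content hash and the byte size. A small sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper written for this illustration, not part of `huggingface_hub` or Git LFS.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer text for model_head.pkl, copied from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:d950e54928cb7ce04995e8b52c26b84956e0333334e460a6338418f7c08f78a4
size 7007
"""
info = parse_lfs_pointer(pointer)
```

The parsed size shows that the pickled classification head is only about 7 KB, tiny next to the encoder weights.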
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,55 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_lower_case": false,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
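
The `added_tokens_decoder` block fixes the special-token IDs for this vocabulary: `[PAD]` is 0, `[UNK]` 101, `[CLS]` 102, `[SEP]` 103 and `[MASK]` 104. As a rough illustration of how a BERT-style tokenizer frames a single sequence with them, the sketch below wraps WordPiece IDs as `[CLS] … [SEP]` and pads to a fixed length. The helper `wrap_and_pad` and the three piece IDs are hypothetical; real IDs come from `vocab.txt`, which the diff does not render.

```python
# Special-token IDs from added_tokens_decoder in tokenizer_config.json.
PAD_ID, UNK_ID, CLS_ID, SEP_ID, MASK_ID = 0, 101, 102, 103, 104


def wrap_and_pad(piece_ids, max_len):
    """Return (input_ids, attention_mask) for one sequence: [CLS] pieces [SEP] plus padding."""
    # Reserve two positions for [CLS] and [SEP], truncating the pieces if needed.
    body = list(piece_ids)[: max_len - 2]
    ids = [CLS_ID] + body + [SEP_ID]
    mask = [1] * len(ids)
    padding = max_len - len(ids)
    return ids + [PAD_ID] * padding, mask + [0] * padding


# Hypothetical WordPiece IDs standing in for a short Finnish sentence.
input_ids, attention_mask = wrap_and_pad([2001, 2002, 2003], max_len=8)
```

The attention mask produced here is what lets mean pooling skip the padded positions. With `model_max_length` set to 512, at most 510 positions remain for actual word pieces.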
vocab.txt ADDED
The diff for this file is too large to render. See raw diff