Henniina committed
Commit a448ec8 · verified · 1 Parent(s): 9527cbc

Push model using huggingface_hub.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
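The pooling config above enables only `pooling_mode_mean_tokens`, so each sentence embedding is the average of its token embeddings. A minimal pure-Python sketch of that idea follows; the function name `mean_pool` and the toy vectors are illustrative assumptions, and this is not the Sentence Transformers implementation, which works on batched tensors.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of the tokens whose mask value is 1 (non-padding)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [total / count for total in summed]


# Toy input: three tokens of dimension 2; the last token is padding and is ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
```

In the real model the vectors would have 768 dimensions, matching `word_embedding_dimension`.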
README.md ADDED
@@ -0,0 +1,259 @@
+ ---
+ tags:
+ - setfit
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ widget:
+ - text: Kremlissä sihteeristö on kiireellä kirjoitellut ”da” lappuja, jotka toimittaa
+     perille Nopea Kortteliohjuspalvelu Byroo (NKB). Aika kätevästi Venäjä ”äänestää”
+     itselleen puolustettavaa aluetta.
+ - text: Ville-Aleksi Alatalo tai Keisarillista Venäjää johon Suomi kuului.
+ - text: Mika Nieminen 🙋‍♀️
+ - text: No jos haluaa lapsensa kotiin jättää, niin luultavasti se lupa joko lomaan
+     tai kotiopetukseen tulee koululta. Oman lapsen kohdalla riitti rehtorille ilmoitus
+     ja kaikki oli kunnossa ettei nuori tule kouluun. Tehtävät tulee kotiin, niin kuin
+     esim. lomalle jäädessä.
+ - text: Markus Juuti Kyllä, se vaatii rohkeutta mutta hintansa sillä on. Kaikkia kansalaisia
+     ei voi salamurhata. Venäläiset ovat kautta historian pidetty hiljaisina votkalla
+     ja leivällä.Nyt olis aika pyrkiä Oikeaan Demokratiaan.
+ metrics:
+ - metric
+ pipeline_tag: text-classification
+ library_name: setfit
+ inference: true
+ base_model: TurkuNLP/bert-base-finnish-cased-v1
+ model-index:
+ - name: SetFit with TurkuNLP/bert-base-finnish-cased-v1
+   results:
+   - task:
+       type: text-classification
+       name: Text Classification
+     dataset:
+       name: Unknown
+       type: unknown
+       split: test
+     metrics:
+     - type: metric
+       value: 0.9110831544498011
+       name: Metric
+ ---
+
+ # SetFit with TurkuNLP/bert-base-finnish-cased-v1
+
+ This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [TurkuNLP/bert-base-finnish-cased-v1](https://huggingface.co/TurkuNLP/bert-base-finnish-cased-v1) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
+
+ The model has been trained using an efficient few-shot learning technique that involves:
+
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
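The first step above can be pictured with a small sketch: labelled texts are turned into pairs, with a target of 1.0 when both texts share a label and 0.0 otherwise, and the embedding model is trained so that the cosine similarity of each pair approaches its target. The README lists `CosineSimilarityLoss`, which by default in Sentence Transformers is a squared error between the cosine similarity and the target. The helper names below (`make_pairs`, `pair_loss`) and the placeholder data are assumptions for illustration; SetFit samples pairs rather than enumerating all of them.

```python
import itertools
import math
import random


def make_pairs(texts, labels, seed=42):
    """Build (text_a, text_b, target) triples: 1.0 for same label, 0.0 otherwise."""
    examples = list(zip(texts, labels))
    pairs = [
        (a, b, 1.0 if label_a == label_b else 0.0)
        for (a, label_a), (b, label_b) in itertools.combinations(examples, 2)
    ]
    random.Random(seed).shuffle(pairs)
    return pairs


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def pair_loss(u, v, target):
    """Squared error between the cosine similarity and the 0/1 target."""
    return (cosine_similarity(u, v) - target) ** 2


# Placeholder texts and labels, not taken from the training data.
pairs = make_pairs(["a0", "a1", "b0"], [0, 0, 1])
# Identical vectors with target 1.0, orthogonal vectors with target 0.0.
loss_same = pair_loss([1.0, 0.0], [1.0, 0.0], 1.0)
loss_diff = pair_loss([1.0, 0.0], [0.0, 1.0], 0.0)
```

The second step then fits the LogisticRegression head on the embeddings produced by the fine-tuned body.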
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SetFit
+ - **Sentence Transformer body:** [TurkuNLP/bert-base-finnish-cased-v1](https://huggingface.co/TurkuNLP/bert-base-finnish-cased-v1)
+ - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
+ - **Maximum Sequence Length:** 512 tokens
+ - **Number of Classes:** 2 classes
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+
+ ### Model Labels
+ | Label | Examples |
+ |:------|:---------|
+ | 0 | <ul><li>'Luulen, että Halla-aho on vaiti.'</li><li>'Anne Nousiainen paikalla on pelastakaa lapset ry , SPR ja monia muita kriisityön osaajia jotka saavat siihen koulutuksen ja tuen ,kaikki vapaaehtoiset tarvitaan mutta kotimaasta käsin voi auttaa turvallisesti ja se on yhtä arvokasta ja tärkeää apua !meidän Ammattilaisten on pidettävä huolta myös maallikko auttajista !kriisityötä 30 v tehneenä yhdyn Anna-Liisa Sallinen kommentteihin jolla myös kokemusta ja osaamista aiheesta .'</li><li>'Anna-Liisa Sallinen Mutta miten sen varmistaa, että rahat menevät oikeaan kohteeseen? Kaikenlaisia huijaushuhuja pyörii esim. SPR:n aiemmissa keräyksissä. Kun nyt Suomi saisi edes niitä aseita liikkeelle. Monta onnistunutta lähetystä on jo mennyt perille yksityisten ihmisten ansiosta. Mä nostan kyllä hattua! Yksikin pelastettu lapsi on vaivan arvoista. 🥲'</li></ul> |
+ | 1 | <ul><li>'Pia Rouhiainen sama havainto parin viikon takaa'</li><li>'Helena Miettinen Kyllä.'</li><li>'Maaria Mettovaara todellakin! Antaa noitten naurajien murjottaa eristyksissä. Aika lyhyt tää elämä elää pelossa😤'</li></ul> |
+
+ ## Evaluation
+
+ ### Metrics
+ | Label | Metric |
+ |:--------|:-------|
+ | **all** | 0.9111 |
+
+ ## Uses
+
+ ### Direct Use for Inference
+
+ First install the SetFit library:
+
+ ```bash
+ pip install setfit
+ ```
+
+ Then you can load this model and run inference.
+
+ ```python
+ from setfit import SetFitModel
+
+ # Download from the 🤗 Hub
+ model = SetFitModel.from_pretrained("Finnish-actions/SetFit-FinBERT1-Avg-acceptance")
+ # Run inference
+ preds = model("Mika Nieminen 🙋‍♀️")
+ ```
+
+ <!--
+ ### Downstream Use
+
+ *List how someone could finetune this model on their own dataset.*
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Set Metrics
+ | Training set | Min | Median | Max |
+ |:-------------|:----|:--------|:----|
+ | Word count | 1 | 20.3800 | 213 |
+
+ | Label | Training Sample Count |
+ |:------|:----------------------|
+ | 0 | 765 |
+ | 1 | 77 |
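The label counts above are heavily imbalanced, which matters when reading the single unnamed metric of 0.9111. The short calculation below works out the class shares and the score a constant "always predict 0" classifier would reach on data with the training distribution. This is only a sanity-check sketch: the README names neither the metric nor the test-set label distribution, so treating the metric as accuracy on similarly distributed data is an assumption.

```python
# Training sample counts copied from the table above.
counts = {0: 765, 1: 77}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

# A constant classifier that always predicts the majority label.
majority_label = max(counts, key=counts.get)
majority_share = shares[majority_label]

# How many label-0 samples there are per label-1 sample.
imbalance_ratio = counts[0] / counts[1]

print(f"total={total}, majority share={majority_share:.4f}, ratio={imbalance_ratio:.2f}")
```

If the test split has a similar distribution and the metric is accuracy, 0.9111 would sit only slightly above the majority-class share, so per-class precision and recall would be worth checking.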
+
+ ### Training Hyperparameters
+ - batch_size: (16, 16)
+ - num_epochs: (4, 4)
+ - max_steps: -1
+ - sampling_strategy: oversampling
+ - num_iterations: 6
+ - body_learning_rate: (2e-05, 1e-05)
+ - head_learning_rate: 0.01
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: False
+ - warmup_proportion: 0.1
+ - l2_weight: 0.01
+ - seed: 42
+ - evaluation_strategy: epoch
+ - eval_max_steps: -1
+ - load_best_model_at_end: False
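The hyperparameters above can be tied to the step counts logged in the Training Results table with a little arithmetic. The sketch assumes that, when `num_iterations` is set, each epoch contains `2 * num_iterations * n_samples` sentence pairs; that formula is an assumption about how the pairs are generated, not something stated in this README, but it reproduces the logged 632 steps per epoch and 2528 total steps.

```python
import math

# Values taken from the README: label counts 765 + 77, and the hyperparameter list.
n_samples = 765 + 77
num_iterations = 6
batch_size = 16
num_epochs = 4

# Assumed pair count per epoch (one positive and one negative pair
# per sample per iteration).
pairs_per_epoch = 2 * num_iterations * n_samples

# The last, partially filled batch still counts as a step.
steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)
total_steps = steps_per_epoch * num_epochs

print(pairs_per_epoch, steps_per_epoch, total_steps)
```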
+
+ ### Training Results
+ | Epoch | Step | Training Loss | Validation Loss |
+ |:------:|:----:|:-------------:|:---------------:|
+ | 0.0016 | 1 | 0.2193 | - |
+ | 0.0791 | 50 | 0.2601 | - |
+ | 0.1582 | 100 | 0.23 | - |
+ | 0.2373 | 150 | 0.1719 | - |
+ | 0.3165 | 200 | 0.0702 | - |
+ | 0.3956 | 250 | 0.0239 | - |
+ | 0.4747 | 300 | 0.0099 | - |
+ | 0.5538 | 350 | 0.0027 | - |
+ | 0.6329 | 400 | 0.0011 | - |
+ | 0.7120 | 450 | 0.0009 | - |
+ | 0.7911 | 500 | 0.0005 | - |
+ | 0.8703 | 550 | 0.0004 | - |
+ | 0.9494 | 600 | 0.0002 | - |
+ | 1.0 | 632 | - | 0.2560 |
+ | 1.0285 | 650 | 0.0003 | - |
+ | 1.1076 | 700 | 0.0002 | - |
+ | 1.1867 | 750 | 0.0002 | - |
+ | 1.2658 | 800 | 0.0002 | - |
+ | 1.3449 | 850 | 0.0001 | - |
+ | 1.4241 | 900 | 0.0001 | - |
+ | 1.5032 | 950 | 0.0001 | - |
+ | 1.5823 | 1000 | 0.0001 | - |
+ | 1.6614 | 1050 | 0.0001 | - |
+ | 1.7405 | 1100 | 0.0001 | - |
+ | 1.8196 | 1150 | 0.0001 | - |
+ | 1.8987 | 1200 | 0.0001 | - |
+ | 1.9778 | 1250 | 0.0001 | - |
+ | 2.0 | 1264 | - | 0.2603 |
+ | 2.0570 | 1300 | 0.0001 | - |
+ | 2.1361 | 1350 | 0.0001 | - |
+ | 2.2152 | 1400 | 0.0001 | - |
+ | 2.2943 | 1450 | 0.0016 | - |
+ | 2.3734 | 1500 | 0.0001 | - |
+ | 2.4525 | 1550 | 0.0001 | - |
+ | 2.5316 | 1600 | 0.0001 | - |
+ | 2.6108 | 1650 | 0.0001 | - |
+ | 2.6899 | 1700 | 0.0001 | - |
+ | 2.7690 | 1750 | 0.0001 | - |
+ | 2.8481 | 1800 | 0.0001 | - |
+ | 2.9272 | 1850 | 0.0001 | - |
+ | 3.0 | 1896 | - | 0.2602 |
+ | 3.0063 | 1900 | 0.0001 | - |
+ | 3.0854 | 1950 | 0.0001 | - |
+ | 3.1646 | 2000 | 0.0001 | - |
+ | 3.2437 | 2050 | 0.0001 | - |
+ | 3.3228 | 2100 | 0.0001 | - |
+ | 3.4019 | 2150 | 0.0001 | - |
+ | 3.4810 | 2200 | 0.0001 | - |
+ | 3.5601 | 2250 | 0.0001 | - |
+ | 3.6392 | 2300 | 0.0 | - |
+ | 3.7184 | 2350 | 0.0001 | - |
+ | 3.7975 | 2400 | 0.0 | - |
+ | 3.8766 | 2450 | 0.0 | - |
+ | 3.9557 | 2500 | 0.0001 | - |
+ | 4.0 | 2528 | - | 0.2601 |
+
+ ### Framework Versions
+ - Python: 3.11.9
+ - SetFit: 1.1.3
+ - Sentence Transformers: 3.2.0
+ - Transformers: 4.44.0
+ - PyTorch: 2.4.0+cu124
+ - Datasets: 2.21.0
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+     doi = {10.48550/ARXIV.2209.11055},
+     url = {https://arxiv.org/abs/2209.11055},
+     author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+     keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+     title = {Efficient Few-Shot Learning Without Prompts},
+     publisher = {arXiv},
+     year = {2022},
+     copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "_name_or_path": "TurkuNLP/bert-base-finnish-cased-v1",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.44.0",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 50105
+ }
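As a consistency check, the values in this config are enough to estimate the number of weights and compare the result with the size of `model.safetensors` reported later in the diff (498110312 bytes). The sketch below counts parameters for a standard BERT encoder with a pooler layer and float32 weights; the function name `bert_parameter_count` is hypothetical, and the presence of the pooler in the saved checkpoint is an assumption.

```python
def bert_parameter_count(vocab_size, hidden_size, num_layers,
                         intermediate_size, max_positions, type_vocab_size):
    """Count weights and biases of a standard BERT encoder with a pooler."""
    layer_norm = 2 * hidden_size  # scale and bias
    # Word, position and token-type embeddings plus their LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab_size) * hidden_size + layer_norm
    # Query, key, value and output projections plus the attention LayerNorm.
    attention = 4 * (hidden_size * hidden_size + hidden_size) + layer_norm
    # Two dense layers of the feed-forward block plus its LayerNorm.
    feed_forward = (
        (hidden_size * intermediate_size + intermediate_size)
        + (intermediate_size * hidden_size + hidden_size)
        + layer_norm
    )
    # Dense layer applied to the [CLS] token (assumed to be saved).
    pooler = hidden_size * hidden_size + hidden_size
    return embeddings + num_layers * (attention + feed_forward) + pooler


# Values copied from config.json above.
params = bert_parameter_count(
    vocab_size=50105, hidden_size=768, num_layers=12,
    intermediate_size=3072, max_positions=512, type_vocab_size=2,
)
weight_bytes = params * 4  # float32, per "torch_dtype"
lfs_size = 498110312  # size recorded in the model.safetensors pointer
overhead = lfs_size - weight_bytes  # remaining bytes, e.g. the file header

print(params, weight_bytes, overhead)
```

Under these assumptions the estimate comes to roughly 124.5 million parameters, and the few kilobytes left over are plausibly the safetensors header.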
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.2.0",
+     "transformers": "4.44.0",
+     "pytorch": "2.4.0+cu124"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
config_setfit.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "normalize_embeddings": false,
+   "labels": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f3b9b7eabf7932bfed1347ad24e9c858759f9cee61802d7862b1d15f963c6b41
+ size 498110312
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ca5747f0f37afacb5a4923f8da518a8aecfb07f98ce1c6965207d91cc9b359cc
+ size 7007
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,55 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_lower_case": false,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff