manuel-couto-pintos committed on
Commit be8fc22 · verified · 1 parent: e4f7fa9

Add new SentenceTransformer model.
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
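The config above enables a single pooling mode: a masked mean over the transformer's token embeddings. For reference, a minimal PyTorch sketch of what that operation computes, with illustrative shapes and values:

```python
import torch

def mean_pooling(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings, counting only non-padding positions."""
    # Expand the mask to the embedding dimension: (batch, seq_len, 1)
    mask = attention_mask.unsqueeze(-1).float()
    summed = (token_embeddings * mask).sum(dim=1)  # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1e-9)       # avoid division by zero on empty rows
    return summed / counts                         # (batch, hidden)

# Illustrative shapes: batch of 2, sequence length 6, hidden size 768
emb = torch.randn(2, 6, 768)
mask = torch.tensor([[1, 1, 1, 1, 0, 0], [1, 1, 1, 1, 1, 1]])
print(mean_pooling(emb, mask).shape)  # torch.Size([2, 768])
```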
README.md ADDED
@@ -0,0 +1,460 @@
+ ---
+ base_model: manuel-couto-pintos/roberta_erisk
+ datasets: []
+ language: []
+ library_name: sentence-transformers
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:30288
+ - loss:MultipleNegativesRankingLoss
+ widget:
+ - source_sentence: 'An inside look at Martha Stewart''s Stable '
+   sentences:
+   - I feel like Im suffocating, Im empty and I feel like my heart is drowning. I try
+     everyday to be okay but its just so hard
+   - How did the Church and the YMCA organization feel about the YMCA being portrayed
+     in the song as a gay-friendly place, and the song becoming a gay anthem?
+   - 'An inside look at Martha Stewart''s Stable '
+ - source_sentence: 'Chemicals in marijuana may fight MRSA infections. '
+   sentences:
+   - 'Our new opening sequence for The Walking Dead Please let me know what you think.
+     This was shot as a class project for an undergrad TV production class. The professor
+     really enjoyed it and called it better than the sequence on TV now (which I think
+     is shot better than ours, but much more boring).
+
+
+
+
+     EDIT: not sure if the link attached properly. It''s [here](http://vimeo.com/17275877) '
+   - 'Greek central bank warns of ''painful'' euro and EU exit '
+   - 'Chemicals in marijuana may fight MRSA infections. '
+ - source_sentence: 'Pluggin Opiates.Ive been into opiates for years now, recently
+     I read from the FAQ about plugging and been trying it out for about a month or
+     so. Currently my DoC is Morphine. Im curious to know if there are many people
+     who use that ROA and Im curious in others experience with it. Ive never shot up,
+     only oral or plugging. For those who blissfully plug, what is your experience
+     like? Does your high last longer or is it less productive as other ROAs? I get
+     the best high from plugging and its essentially instant. Id appreciate some advice
+     or for you to share your experience with me? Thanks, stay high and mellow my opioid
+     lovers Side note: if you havent tried plugging I recommend it 100%, no shame,
+     just safe dosing.'
+   sentences:
+   - New Members IntroIf youre new to the community, introduce yourself!
+   - I found a Nat King Cole signed record at an antique mall for $12.50.
+   - 'Pluggin Opiates.Ive been into opiates for years now, recently I read from the
+     FAQ about plugging and been trying it out for about a month or so. Currently my
+     DoC is Morphine. Im curious to know if there are many people who use that ROA
+     and Im curious in others experience with it. Ive never shot up, only oral or plugging.
+     For those who blissfully plug, what is your experience like? Does your high last
+     longer or is it less productive as other ROAs? I get the best high from plugging
+     and its essentially instant. Id appreciate some advice or for you to share your
+     experience with me? Thanks, stay high and mellow my opioid lovers Side note: if
+     you havent tried plugging I recommend it 100%, no shame, just safe dosing.'
+ - source_sentence: 'what can i do to be a likeable person? what do people look for
+     in friends? what determines our worth as a person? I realized that a lot of my
+     problems come from trying to impress people in order for them to like me and possibly
+     become friends.
+
+
+
+
+     but do people really look at all the things you''ve accomplished and all the things
+     you''ve done to determine if your worthy of being a friend? apparently that seems
+     to be my mindset, and that''s the reason I do things just to impress people
+
+
+
+
+     so what do people look for in others when determining whether they can be a good
+     friend or not. or another way to think of it, what determines our worth as a person? '
+   sentences:
+   - 'My humble reaction after I win an argument with my SO. '
+   - When do i get to use the super troop potion[ask]
+   - 'what can i do to be a likeable person? what do people look for in friends? what
+     determines our worth as a person? I realized that a lot of my problems come from
+     trying to impress people in order for them to like me and possibly become friends.
+
+
+
+
+     but do people really look at all the things you''ve accomplished and all the things
+     you''ve done to determine if your worthy of being a friend? apparently that seems
+     to be my mindset, and that''s the reason I do things just to impress people
+
+
+
+
+     so what do people look for in others when determining whether they can be a good
+     friend or not. or another way to think of it, what determines our worth as a person? '
+ - source_sentence: 'Goodnight, Texas - The Horse Accident (In Which A Girl Was All
+     But Killed) '
+   sentences:
+   - Getting rid of ants - methods used in SL households?We have an ants problem in
+     our home. The small red ants, to be specific. It didnt always use to be this serious
+     but now its annoying that we cannot keep any unpacked food (bread, sugar, buns
+     etc.) outside the refrigerator for more than one day (with the lockdown weve stacked
+     up supplies enough for a week). What methods that you guys use to get rid or control
+     ants in your home? Its better if theres a way to completely get rid of them once
+     and for all but controlling methods are ok too. Better if theyre easy to find
+     in SL as most of the videos Ive found require separate chemicals and ingredients.
+   - 'Goodnight, Texas - The Horse Accident (In Which A Girl Was All But Killed) '
+   - 'Creating Transit Gateway VPC attachmentWhen you go to create a TGW VPC attachment
+     it asks you at the bottom for subnet ID''s. AWS gives the below description for
+     this, but I''m having trouble understanding the significance of this option. Does
+     this mean I would have to create multiple TGW attachments per subnet per AZ for
+     a single UPC with multiple AZ-A subnets? Or as long as the subnets I chose share
+     a routing table with other subnets in the same AZ I am good? For example I have
+     a VPC with: MGMT Subnet AZ-A using MGMT route table MGMT Subnet AZ-B using MGMT
+     route table Data Subnet in AZ-A using DATA route table Data Subnet in AZ-B using
+     DATA route table Would I need two TGW attachments, one for data subnets and one
+     for MGMT subnets? " For **Subnet IDs**, select one subnet for each Availability
+     Zone to be used by the transit gateway to route traffic. You must select at least
+     one subnet. You can select only one subnet per Availability Zone. "'
+ ---
+
+ # SentenceTransformer based on manuel-couto-pintos/roberta_erisk
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [manuel-couto-pintos/roberta_erisk](https://huggingface.co/manuel-couto-pintos/roberta_erisk). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [manuel-couto-pintos/roberta_erisk](https://huggingface.co/manuel-couto-pintos/roberta_erisk) <!-- at revision 9aa8180ee595fe69a8d23c06dc5ee405f4f5d5ac -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("manuel-couto-pintos/roberta_erisk_simcse")
+ # Run inference
+ sentences = [
+     'Goodnight, Texas - The Horse Accident (In Which A Girl Was All But Killed) ',
+     'Goodnight, Texas - The Horse Accident (In Which A Girl Was All But Killed) ',
+     'Creating Transit Gateway VPC attachmentWhen you go to create a TGW VPC attachment it asks you at the bottom for subnet ID\'s. AWS gives the below description for this, but I\'m having trouble understanding the significance of this option. Does this mean I would have to create multiple TGW attachments per subnet per AZ for a single UPC with multiple AZ-A subnets? Or as long as the subnets I chose share a routing table with other subnets in the same AZ I am good? For example I have a VPC with: MGMT Subnet AZ-A using MGMT route table MGMT Subnet AZ-B using MGMT route table Data Subnet in AZ-A using DATA route table Data Subnet in AZ-B using DATA route table Would I need two TGW attachments, one for data subnets and one for MGMT subnets? " For **Subnet IDs**, select one subnet for each Availability Zone to be used by the transit gateway to route traffic. You must select at least one subnet. You can select only one subnet per Availability Zone. "',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 30,288 training samples
+ * Columns: <code>sentence_0</code> and <code>sentence_1</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | sentence_0 | sentence_1 |
+   |:--------|:-----------|:-----------|
+   | type    | string     | string     |
+   | details | <ul><li>min: 9 tokens</li><li>mean: 87.71 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 9 tokens</li><li>mean: 87.71 tokens</li><li>max: 512 tokens</li></ul> |
+ * Samples:
+   | sentence_0 | sentence_1 |
+   |:-----------|:-----------|
+   | <code>Any teens want to talk about Rush? Being 15, it's hell on earth trying to find others around my age to talk about Rush or (thanks to my interests) anyone to talk about anything. Let me know if you're around 15 (preferably 13-18) and, I dunno, maybe we could make a Skype group/kik group/whatever.<br><br><br><br>Ghost of an edit: no personal info will be shared, don't worry. Felt I should clarify that. </code> | <code>Any teens want to talk about Rush? Being 15, it's hell on earth trying to find others around my age to talk about Rush or (thanks to my interests) anyone to talk about anything. Let me know if you're around 15 (preferably 13-18) and, I dunno, maybe we could make a Skype group/kik group/whatever.<br><br><br><br>Ghost of an edit: no personal info will be shared, don't worry. Felt I should clarify that. </code> |
+   | <code>Interesting video about racial inequality in the prison system. </code> | <code>Interesting video about racial inequality in the prison system. </code> |
+   | <code>[Intro] 30, F, dweebHi! My names Liz. Nice to meet you all. I love: - drawing - animation (went to animation school but dropped out) - video games (PC, Nintendo Switch, 3DS) - horror - cryptozoology - mystery science theater 3000 - documentaries - true crime - collecting steelbooks - amiibos It was my birthday this month and I just turned 30. I have chronic illness and depression but I try my best every day to stay positive. If you want I can do a little doodle of you if you want to share a picture! Looking forward to meeting everyone</code> | <code>[Intro] 30, F, dweebHi! My names Liz. Nice to meet you all. I love: - drawing - animation (went to animation school but dropped out) - video games (PC, Nintendo Switch, 3DS) - horror - cryptozoology - mystery science theater 3000 - documentaries - true crime - collecting steelbooks - amiibos It was my birthday this month and I just turned 30. I have chronic illness and depression but I try my best every day to stay positive. If you want I can do a little doodle of you if you want to share a picture! Looking forward to meeting everyone</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `per_device_train_batch_size`: 10
+ - `per_device_eval_batch_size`: 10
+ - `num_train_epochs`: 5
+ - `multi_dataset_batch_sampler`: round_robin
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: no
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 10
+ - `per_device_eval_batch_size`: 10
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1
+ - `num_train_epochs`: 5
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `eval_use_gather_object`: False
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: round_robin
+
+ </details>
+
+ ### Training Logs
+ | Epoch  | Step  | Training Loss |
+ |:------:|:-----:|:-------------:|
+ | 0.1651 | 500   | 0.8492        |
+ | 0.3301 | 1000  | 0.0013        |
+ | 0.4952 | 1500  | 0.0007        |
+ | 0.6603 | 2000  | 0.0007        |
+ | 0.8254 | 2500  | 0.0003        |
+ | 0.9904 | 3000  | 0.0           |
+ | 1.1555 | 3500  | 0.0           |
+ | 1.3206 | 4000  | 0.0           |
+ | 1.4856 | 4500  | 0.0002        |
+ | 1.6507 | 5000  | 0.0003        |
+ | 1.8158 | 5500  | 0.0003        |
+ | 1.9809 | 6000  | 0.0           |
+ | 2.1459 | 6500  | 0.0           |
+ | 2.3110 | 7000  | 0.0           |
+ | 2.4761 | 7500  | 0.0           |
+ | 2.6411 | 8000  | 0.0003        |
+ | 2.8062 | 8500  | 0.0003        |
+ | 2.9713 | 9000  | 0.0           |
+ | 3.1363 | 9500  | 0.0           |
+ | 3.3014 | 10000 | 0.0           |
+ | 3.4665 | 10500 | 0.0002        |
+ | 3.6316 | 11000 | 0.0003        |
+ | 3.7966 | 11500 | 0.0003        |
+ | 3.9617 | 12000 | 0.0           |
+ | 4.1268 | 12500 | 0.0           |
+ | 4.2918 | 13000 | 0.0           |
+ | 4.4569 | 13500 | 0.0           |
+ | 4.6220 | 14000 | 0.0003        |
+ | 4.7871 | 14500 | 0.0003        |
+ | 4.9521 | 15000 | 0.0           |
+
+
+ ### Framework Versions
+ - Python: 3.10.14
+ - Sentence Transformers: 3.0.1
+ - Transformers: 4.44.2
+ - PyTorch: 2.0.1+cu117
+ - Accelerate: 0.32.0
+ - Datasets: 2.20.0
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
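The card above reports MultipleNegativesRankingLoss with scale 20.0 and cosine similarity, batch size 10, and 5 epochs. A minimal sketch of how such a run could look with the Sentence Transformers 3.0 trainer API; the one-pair dataset here is a stand-in for the unnamed 30,288-pair training set:

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Base checkpoint named in the card
model = SentenceTransformer("manuel-couto-pintos/roberta_erisk")

# Stand-in for the unnamed (sentence_0, sentence_1) pair dataset
train_dataset = Dataset.from_dict({
    "sentence_0": ["An inside look at Martha Stewart's Stable "],
    "sentence_1": ["An inside look at Martha Stewart's Stable "],
})

# scale=20.0 and cosine similarity are the loss defaults reported above
loss = MultipleNegativesRankingLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="roberta_erisk_simcse",
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    num_train_epochs=5,
)

trainer = SentenceTransformerTrainer(
    model=model, args=args, train_dataset=train_dataset, loss=loss
)
trainer.train()
```

With identical sentence_0/sentence_1 pairs, as in the samples shown above, the in-batch negatives make this a SimCSE-style contrastive setup, which is consistent with the roberta_erisk_simcse model name.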
config.json ADDED
@@ -0,0 +1,28 @@
+ {
+   "_name_or_path": "manuel-couto-pintos/roberta_erisk",
+   "architectures": [
+     "RobertaModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "classifier_dropout": null,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "roberta",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "position_embedding_type": "absolute",
+   "tokenizer_class": "RobertaTokenizerFast",
+   "torch_dtype": "float32",
+   "transformers_version": "4.44.2",
+   "type_vocab_size": 1,
+   "use_cache": true,
+   "vocab_size": 50265
+ }
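As a quick cross-check, this backbone config agrees with the pooling config at the top of the commit. A small sketch of that sanity check (the repo id is the one the README usage example publishes):

```python
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("manuel-couto-pintos/roberta_erisk_simcse")

assert cfg.model_type == "roberta"
assert cfg.hidden_size == 768              # matches word_embedding_dimension in 1_Pooling/config.json
assert cfg.max_position_embeddings == 514  # RoBERTa reserves 2 positions, hence max_seq_length 512
print(cfg.num_hidden_layers, cfg.num_attention_heads)  # 12 12
```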
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.0.1",
+     "transformers": "4.44.2",
+     "pytorch": "2.0.1+cu117"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:44fc902cab35bec2eefc869e403decd1a8bd4a837542904de24a8d3e162abbb7
+ size 498604904
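This is a Git LFS pointer rather than the weights themselves. A small sketch for verifying a downloaded model.safetensors against the pointer's oid and size (assumes a local copy of the file):

```python
import hashlib
import os

path = "model.safetensors"  # local copy pulled via git lfs or hf_hub_download
expected_oid = "44fc902cab35bec2eefc869e403decd1a8bd4a837542904de24a8d3e162abbb7"
expected_size = 498604904

assert os.path.getsize(path) == expected_size

# The LFS oid is the SHA-256 of the file contents; hash it in 1 MiB chunks
sha = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        sha.update(chunk)
assert sha.hexdigest() == expected_oid
print("model.safetensors matches the LFS pointer")
```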
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
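modules.json chains module 0 (the RoBERTa transformer at the repo root) into module 1 (the mean-pooling head in 1_Pooling/). A minimal sketch of the equivalent manual composition with the sentence-transformers models API:

```python
from sentence_transformers import SentenceTransformer, models

# Module 0: the transformer backbone (max_seq_length from sentence_bert_config.json)
word_embedding_model = models.Transformer(
    "manuel-couto-pintos/roberta_erisk", max_seq_length=512
)

# Module 1: mean pooling, as in 1_Pooling/config.json
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),  # 768
    pooling_mode="mean",
)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
print(model)  # mirrors the architecture printed in the README
```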
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "bos_token": "<s>",
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "unk_token": "<unk>"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "add_prefix_space": false,
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50264": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "errors": "replace",
+   "mask_token": "<mask>",
+   "model_max_length": 512,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "tokenizer_class": "RobertaTokenizer",
+   "trim_offsets": true,
+   "unk_token": "<unk>"
+ }
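A short sketch of how these settings surface when the tokenizer is loaded through transformers (repo id taken from the README usage example):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("manuel-couto-pintos/roberta_erisk_simcse")

print(tok.cls_token, tok.sep_token, tok.pad_token, tok.mask_token)  # <s> </s> <pad> <mask>
print(tok.model_max_length)  # 512

# Inputs longer than the limit are cut off when truncation is enabled
ids = tok("example sentence", truncation=True, max_length=512)["input_ids"]
print(ids[0])  # 0, the <s> (bos/cls) token id from added_tokens_decoder
```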
vocab.json ADDED
The diff for this file is too large to render. See raw diff