AkshaySandbox committed on
Commit 1a5d9be · verified · 1 Parent(s): 6645c1a

Update README.md

Files changed (1)
  1. README.md +378 -22
README.md CHANGED
@@ -1,38 +1,394 @@
- # Pregnancy Knowledge Base Embedding Model
-
- This is a fine-tuned version of sentence-transformers/all-mpnet-base-v2 specifically optimized for pregnancy, childbirth, and parenting information with a focus on Canadian healthcare context.
-
- ## Model Description
-
- - **Base Model:** sentence-transformers/all-mpnet-base-v2
- - **Fine-tuned on:** Question-answer pairs related to pregnancy, childbirth, and parenting in Canada
- - **Embedding Dimensions:** 768
- - **Max Sequence Length:** 384
- - **Training Data:** Synthetic question-answer pairs generated from Canadian pregnancy and parenting resources
-
  ## Usage
-
  ```python
  from sentence_transformers import SentenceTransformer
-
- # Load the model
- model = SentenceTransformer('AkshaySandbox/pregnancy-mpnet-embeddings')
-
- # Generate embeddings
- embeddings = model.encode(['What are the signs of preeclampsia?',
-                            'How much parental leave can I get in Canada?'])
  ```
-
- ## Intended Use
-
- This model is designed for retrieval-augmented generation (RAG) systems focused on providing pregnancy and parenting information in the Canadian healthcare context.
-
- ## Training
-
- The model was fine-tuned using contrastive learning on question-answer pairs with the following parameters:
- - Batch size: 16
- - Learning rate: 2e-05
- - Epochs: 3
- - Loss function: CosineSimilarityLoss
-
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:1602
+ - loss:CosineSimilarityLoss
+ base_model: sentence-transformers/all-mpnet-base-v2
+ widget:
+ - source_sentence: Has there been any recent discussion on the trend of women choosing
+     to become mothers later in life?
+   sentences:
+   - British Columbia, Ontario, New Brunswick, Nova Scotia, and Prince Edward Island
+     have fully implemented universal hearing screening programs.
+   - If the first readings exceed the maximum allowable difference, measurements are
+     taken for a second and, if necessary, a third time.
+   - In recent years, practices have shifted and these professionals are now able to
+     observe, assess, and consult on the child’s program at the centre rather than
+     in an office visit.
+ - source_sentence: Where can I find more information on facilitating extra-provincial
+     ward adoptions in British Columbia?
+   sentences:
+   - 'You can refer to Practice Directive #2021-01 for more information on facilitating
+     extra-provincial ward adoptions in British Columbia.'
+   - No, a Care Plan is not required if the child/youth has no special service needs.
+   - Licensed ECEC programs may fall under the responsibility of one or more ministries
+     and departments, including education, health, family, and/or social services.
+ - source_sentence: What should be done if there are minor differences in openness
+     requests?
+   sentences:
+   - If there are minor differences, it is advised to try to reach an acceptable compromise
+     in a meeting.
+   - Yes, the adoption can be completed in B.C. even if the child is from another province.
+     However, the originating provincial or territorial child welfare authority is
+     responsible for finalizing the adoption.
+   - The new standards establish the breastfed child as the normative model for child
+     growth and development.
+ - source_sentence: Does the federal government in Canada manage Early Childhood Education
+     and Care (ECEC)?
+   sentences:
+   - You can call a friend or relative to ask for help.
+   - You can start introducing common food allergens to your baby as they begin eating
+     solid foods. It's best to introduce them one at a time.
+   - A search of the Parents' Registry should be requested at the time the child or
+     youth is registered with the Adoption and Permanency Branch.
+ - source_sentence: How can I order a birth certificate in British Columbia?
+   sentences:
+   - The Hague Convention is an international treaty that sets standards to ensure
+     that the best interests of children and youth are protected.
+   - Only the consents of the Director of Adoption and the child/youth aged 12 or over
+     are required.
+   - The L value is -0.4488, the M value is 15.2759, and the S value is 0.08380.
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ metrics:
+ - pearson_cosine
+ - spearman_cosine
+ model-index:
+ - name: SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
+   results:
+   - task:
+       type: semantic-similarity
+       name: Semantic Similarity
+     dataset:
+       name: pregnancy val
+       type: pregnancy_val
+     metrics:
+     - type: pearson_cosine
+       value: 0.9454219117248748
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.8647267521805166
+       name: Spearman Cosine
+ ---
+
+ # SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) <!-- at revision 9a3225965996d404b775526de6dbfe85d3368642 -->
+ - **Maximum Sequence Length:** 384 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
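The Pooling and Normalize modules above can be sketched in plain NumPy (a minimal illustration on random data, not the library's actual implementation): mean-pool token embeddings over non-padding positions, then L2-normalize so every output embedding has unit norm.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Mean-pool token embeddings over non-padding positions, then L2-normalize."""
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid divide-by-zero
    pooled = summed / counts                                         # the Pooling module
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)    # the Normalize module

# Toy batch: 2 sentences, 4 token positions, 768 dims (the card's output dimensionality)
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 4, 768))
mask = np.array([[1, 1, 1, 0], [1, 1, 0, 0]])  # second sentence has only 2 real tokens
emb = mean_pool_and_normalize(tokens, mask)
print(emb.shape)                     # (2, 768)
print(np.linalg.norm(emb, axis=1))   # each ~1.0 because of the Normalize step
```

Because of the final Normalize step, cosine similarity between two embeddings reduces to a plain dot product.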
  ## Usage

+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
  ```python
  from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("AkshaySandbox/pregnancy-mpnet-embeddings")
+ # Run inference
+ sentences = [
+     'How can I order a birth certificate in British Columbia?',
+     'Only the consents of the Director of Adoption and the child/youth aged 12 or over are required.',
+     'The L value is -0.4488, the M value is 15.2759, and the S value is 0.08380.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
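The original card describes this model as intended for retrieval-augmented generation (RAG). A retrieval step over those similarity scores can be sketched with plain NumPy (toy unit vectors stand in for real `model.encode` output, which is 768-dimensional; since the model's final module is Normalize, a dot product equals cosine similarity):

```python
import numpy as np

def unit(v):
    """L2-normalize a vector, mimicking the model's Normalize output."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def top_k(query_emb, corpus_embs, k=2):
    """Rank corpus embeddings by dot product (== cosine similarity for unit vectors)."""
    scores = corpus_embs @ query_emb
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]

# Toy stand-ins for encoded knowledge-base passages and a user question
corpus = np.stack([unit([1.0, 0.0, 0.2]), unit([0.0, 1.0, 0.0]), unit([0.9, 0.1, 0.1])])
query = unit([1.0, 0.0, 0.1])
print(top_k(query, corpus))  # index 0 ranks first, then index 2
```

In a real RAG pipeline the top-ranked passages would be passed to a generator as context.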
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Semantic Similarity
+
+ * Dataset: `pregnancy_val`
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | pearson_cosine      | 0.9454     |
+ | **spearman_cosine** | **0.8647** |
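The two reported metrics can be illustrated on toy data (a hedged sketch, not the evaluator's code): Pearson correlates the raw similarity scores with the gold labels, while Spearman correlates their ranks, which is why the two values can differ.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation of two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank-transformed values
    (valid here because the toy data has no ties)."""
    rank = lambda a: np.argsort(np.argsort(a)).astype(float)
    return pearson(rank(x), rank(y))

# Toy gold labels vs. predicted cosine similarities
gold = np.array([0.0, 0.2, 0.5, 0.8, 1.0])
pred = np.array([0.1, 0.3, 0.4, 0.9, 0.95])
print(pearson(gold, pred))   # high but below 1: scores are not a perfect linear fit
print(spearman(gold, pred))  # 1.0: the ranking is perfectly preserved
```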
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 1,602 training samples
+ * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | sentence_0 | sentence_1 | label |
+   |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
+   | type    | string | string | float |
+   | details | <ul><li>min: 7 tokens</li><li>mean: 16.16 tokens</li><li>max: 34 tokens</li></ul> | <ul><li>min: 9 tokens</li><li>mean: 28.61 tokens</li><li>max: 75 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.51</li><li>max: 1.0</li></ul> |
+ * Samples:
+   | sentence_0 | sentence_1 | label |
+   |:----------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
+   | <code>What kind of hearing screening programs do other provinces and territories in Canada have?</code> | <code>Unless an adoption placement has already been secured and a brief interim placement with a caregiver is required, the child should be 6 months of age or younger.</code> | <code>0.0</code> |
+   | <code>Are there resources available for children with learning disabilities in early childhood programs?</code> | <code>Yes, most PTs dedicate resources, programs or staff to support children with learning disabilities and other special needs.</code> | <code>1.0</code> |
+   | <code>What is parental leave?</code> | <code>Parental leave is a type of benefit that allows parents to take time off work after the birth or adoption of a child. The text mentions it but does not provide specific details about the duration or requirements in Canada.</code> | <code>1.0</code> |
+ * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
+   ```json
+   {
+       "loss_fct": "torch.nn.modules.loss.MSELoss"
+   }
+   ```
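As the parameters above indicate, CosineSimilarityLoss computes the cosine similarity of each sentence pair and regresses it onto the 0.0/1.0 label with MSE. A minimal NumPy sketch of that objective (an illustration, not the library's implementation):

```python
import numpy as np

def cosine_similarity_loss(emb_a, emb_b, labels):
    """MSE between per-pair cosine similarities and gold labels (the MSELoss loss_fct)."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    cos = (a * b).sum(axis=1)                 # cosine similarity per pair
    return float(np.mean((cos - labels) ** 2))

# Two toy pairs: one similar (label 1.0), one dissimilar (label 0.0)
a = np.array([[1.0, 0.0], [1.0, 0.0]])
b = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = np.array([1.0, 0.0])
print(cosine_similarity_loss(a, b, labels))  # 0.0 — both pairs already match their labels
```

Minimizing this loss pushes embeddings of labeled-similar pairs together and labeled-dissimilar pairs apart.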
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `multi_dataset_batch_sampler`: round_robin
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1
+ - `num_train_epochs`: 3
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: round_robin
+
+ </details>
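With `lr_scheduler_type: linear` and zero warmup, the learning rate decays linearly from its initial value to zero over training. A small sketch of that schedule; the 303-step total is an assumption derived from 3 epochs at 101 steps per epoch (the training log records step 101 at epoch 1.0):

```python
def linear_lr(step, base_lr=5e-05, total_steps=303, warmup_steps=0):
    """Linear schedule: ramp up to base_lr over warmup_steps, then decay linearly to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

print(linear_lr(0))    # 5e-05 at the start (no warmup phase)
print(linear_lr(303))  # 0.0 at the end of training
```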
+
+ ### Training Logs
+ | Epoch | Step | pregnancy_val_spearman_cosine |
+ |:-----:|:----:|:-----------------------------:|
+ | 1.0   | 101  | 0.8647                        |
+
+
+ ### Framework Versions
+ - Python: 3.13.1
+ - Sentence Transformers: 3.4.1
+ - Transformers: 4.49.0
+ - PyTorch: 2.6.0
+ - Accelerate: 1.4.0
+ - Datasets: 3.3.2
+ - Tokenizers: 0.21.0
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
  ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->