	---
	tags:
	- sentence-transformers
	- sentence-similarity
	- feature-extraction
	- generated_from_trainer
	- dataset_size:267
	- loss:ContrastiveLoss
	base_model: sentence-transformers/all-MiniLM-L6-v2
	widget:
	- source_sentence: 'hypertension

	The patient''s primary diagnosis is hypertension, as stated in the visit note.

	BP medications

	The patient is on BP medications which are used to treat hypertension.

	BP management

	The visit note mentions follow-up on BP management, indicating ongoing treatment
	for hypertension.

	HTN

	HTN is the abbreviation for hypertension, which is the patient''s diagnosed condition.

	BP was measured at 138/90

	This blood pressure reading supports the diagnosis of hypertension as it is elevated.

	monthly bp at home have been around that number or higher

	Consistently high blood pressure readings confirm the presence of hypertension.

	most likely diagnosis for this patient is hypertension

	The visit note explicitly states that hypertension is the most likely diagnosis.'
	sentences:
	- Anemia, Unspecified
	- Essential (Primary) Hypertension
	- Dehydration
	- source_sentence: 'BMI ABOVE NORMAL PARAM F/U DOCUMENTED

	This phrase indicates that the patient''s BMI is above normal parameters and requires
	follow-up, which is a key indicator for obesity classification.

	34.11

	The specific BMI value of 34.11 falls within the range for Class 1 obesity (30.0-34.9),
	providing numerical confirmation of the diagnosis.

	Class 1 obesity

	This is the explicit statement of the patient''s condition, directly aligning
	with the ICD code E66.811 for Class 1 obesity.'
	sentences:
	- Obesity, Class 1
	- Hypothyroidism, Unspecified
	- Overweight
	- source_sentence: 'anxious and uses food for comfort

	This phrase indicates the presence of anxiety symptoms, specifically using food
	as a coping mechanism, which aligns with an unspecified anxiety disorder.'
	sentences:
	- Essential (Primary) Hypertension
	- Essential (Primary) Hypertension
	- Anxiety Disorder, Unspecified
	- source_sentence: 'compression stockings

	Compression stockings are a treatment for venous insufficiency, which can cause
	localized edema.

	venous insufficiency

	Venous insufficiency is a condition that leads to leg edema, which is a type of
	localized edema.

	Leg edema

	Leg edema is a direct symptom of localized edema.

	edema

	Edema refers to swelling caused by fluid retention, which aligns with the ICD
	code R60.0 for Localized Edema.'
	sentences:
	- Nasal Congestion
	- Localized Edema
	- Essential (Primary) Hypertension
	- source_sentence: 'Had lithotripsy and passed an 8x5 mm stone on L.

	This phrase indicates a history of urinary calculi as evidenced by the treatment
	(lithotripsy) for kidney stones.'
	sentences:
	- Pure Hypercholesterolemia, Unspecified
	- Personal History Of Urinary Calculi
	- Menopausal And Female Climacteric States
	pipeline_tag: sentence-similarity
	library_name: sentence-transformers
	---

	# SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

	This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2). It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

	## Model Details

	### Model Description
	- Model Type: Sentence Transformer
	- Base model: [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) <!-- at revision c9745ed1d9f207416be6d2e6f8de32d1f16199bf -->
	- Maximum Sequence Length: 256 tokens
	- Output Dimensionality: 384 dimensions
	- Similarity Function: Cosine Similarity
	<!-- - Training Dataset: Unknown -->
	<!-- - Language: Unknown -->
	<!-- - License: Unknown -->

	### Model Sources

	- Documentation: [Sentence Transformers Documentation](https://sbert.net)
	- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
	- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

	### Full Model Architecture

	```
	SentenceTransformer(
	  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
	  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	  (2): Normalize()
	)
	```
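To make the Pooling and Normalize stages above concrete, here is a minimal pure-Python sketch of masked mean pooling followed by L2 normalization. It is an illustration only, not the library's implementation: the helper names `mean_pool` and `l2_normalize` and the 2-dimensional toy vectors are assumptions chosen for readability (the real model produces 384-dimensional token vectors).

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                summed[i] += value
    # Guard against an all-padding input to avoid division by zero.
    return [s / max(count, 1) for s in summed]

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

# Toy input (hypothetical values): three token vectors, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 2.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)            # mean of the first two vectors only
sentence_embedding = l2_normalize(pooled)   # unit-length sentence vector
print(pooled, sentence_embedding)
```

Because the output is unit length, the cosine similarity of two embeddings equals their dot product.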

	## Usage

	### Direct Usage (Sentence Transformers)

	First install the Sentence Transformers library:

	```bash
	pip install -U sentence-transformers
	```

	Then you can load this model and run inference.
	```python
	from sentence_transformers import SentenceTransformer

	# Download from the 🤗 Hub
	model = SentenceTransformer("sentence_transformers_model_id")
	# Run inference
	sentences = [
	    'Had lithotripsy and passed an 8x5 mm stone on L.\nThis phrase indicates a history of urinary calculi as evidenced by the treatment (lithotripsy) for kidney stones.',
	    'Personal History Of Urinary Calculi',
	    'Pure Hypercholesterolemia, Unspecified',
	]
	embeddings = model.encode(sentences)
	print(embeddings.shape)
	# (3, 384)

	# Get the similarity scores for the embeddings
	similarities = model.similarity(embeddings, embeddings)
	print(similarities.shape)
	# torch.Size([3, 3])
	```
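A typical use of the similarity scores is to rank candidate diagnosis descriptions against a note excerpt. The sketch below shows only that ranking step in plain Python. The vectors are made-up 2-dimensional stand-ins, not output of this model, and `cosine` and `rank_candidates` are hypothetical helpers; in practice the vectors would come from `model.encode(...)`.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_candidates(query_vec, candidates):
    """Return (label, score) pairs sorted from most to least similar."""
    scored = [(label, cosine(query_vec, vec)) for label, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical embeddings standing in for model.encode(...) output.
query = [0.6, 0.8]  # the note excerpt
candidates = {
    "Personal History Of Urinary Calculi": [0.8, 0.6],
    "Pure Hypercholesterolemia, Unspecified": [-0.6, 0.8],
}

ranking = rank_candidates(query, candidates)
best_label, best_score = ranking[0]
print(best_label, round(best_score, 2))
```

With real embeddings, the same sort could be applied to one row of the matrix returned by `model.similarity`.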

	<!--
	### Direct Usage (Transformers)

	<details><summary>Click to see the direct usage in Transformers</summary>

	</details>
	-->

	<!--
	### Downstream Usage (Sentence Transformers)

	You can finetune this model on your own dataset.

	<details><summary>Click to expand</summary>

	</details>
	-->

	<!--
	### Out-of-Scope Use

	List how the model may foreseeably be misused and address what users ought not to do with the model.
	-->

	<!--
	## Bias, Risks and Limitations

	What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.
	-->

	<!--
	### Recommendations

	What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.
	-->

	## Training Details

	### Training Dataset

	#### Unnamed Dataset

	* Size: 267 training samples
	* Columns: <code>anchor</code>, <code>positive</code>, and <code>label</code>
	* Approximate statistics based on the first 267 samples:
	|         | anchor | positive | label |
	|:--------|:-------|:---------|:------|
	| type    | string | string | float |
	| details | <ul><li>min: 12 tokens</li><li>mean: 94.12 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 9.77 tokens</li><li>max: 23 tokens</li></ul> | <ul><li>min: 1.0</li><li>mean: 1.0</li><li>max: 1.0</li></ul> |
	* Samples:
	| anchor | positive | label |
	|:-------|:---------|:------|
	| <code>T2DM<br>Directly indicates the diagnosis of Type 2 Diabetes Mellitus without complications as stated in the Problem/Dx section.</code> | <code>Type 2 Diabetes Mellitus Without Complications</code> | <code>1.0</code> |
	| <code>Atorvastatin<br>Atorvastatin is a statin medication prescribed to lower cholesterol levels, directly addressing hypercholesterolemia.<br>Hyperlipidemia<br>Hyperlipidemia is a broader term that includes high cholesterol (hypercholesterolemia), which is explicitly mentioned in the assessment.<br>statin therapy<br>Statin therapy, including Atorvastatin, is specifically noted as part of the treatment plan for managing high cholesterol.<br>Hypercholesterolemia<br>Explicitly listed under assessment as a condition being managed, aligning with the ICD code E78.00.</code> | <code>Pure Hypercholesterolemia, Unspecified</code> | <code>1.0</code> |
	| <code>Encounter for immunization (Z23)<br>This phrase directly indicates the ICD code Z23 and its description as the reason for the encounter.</code> | <code>Encounter For Immunization</code> | <code>1.0</code> |
	* Loss: [<code>ContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastiveloss) with these parameters:
	```json
	{
	    "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
	    "margin": 0.5,
	    "size_average": true
	}
	```
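As a rough illustration of what these three parameters control, the sketch below writes out a contrastive loss in plain Python, assuming the usual formulation from Hadsell et al. (2006): `0.5 * (y * d^2 + (1 - y) * max(0, margin - d)^2)`, with `d` taken as cosine distance (1 minus cosine similarity). The function names are hypothetical and this is not the library's code.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity, so identical directions give 0."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def contrastive_loss(pairs, margin=0.5, size_average=True):
    """pairs: iterable of (vec_a, vec_b, label), label 1.0 = similar, 0.0 = dissimilar."""
    losses = []
    for a, b, label in pairs:
        d = cosine_distance(a, b)
        pull = label * d ** 2                               # similar pairs: shrink distance
        push = (1.0 - label) * max(0.0, margin - d) ** 2    # dissimilar pairs: enforce margin
        losses.append(0.5 * (pull + push))
    total = sum(losses)
    return total / len(losses) if size_average else total

# Toy pairs (hypothetical vectors).
positive_far = ([1.0, 0.0], [0.0, 1.0], 1.0)    # similar pair, distance 1
negative_near = ([1.0, 0.0], [1.0, 0.0], 0.0)   # dissimilar pair, distance 0
negative_far = ([1.0, 0.0], [0.0, 1.0], 0.0)    # dissimilar pair beyond the margin

print(contrastive_loss([positive_far]))
print(contrastive_loss([negative_near]))
print(contrastive_loss([negative_far]))
```

Under this formulation, a pair with label 1.0 contributes only the `pull` term, so the margin affects the loss only for pairs labeled 0.0.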

	### Training Hyperparameters
	#### Non-Default Hyperparameters

	- `per_device_train_batch_size`: 16
	- `per_device_eval_batch_size`: 16
	- `learning_rate`: 2e-05
	- `num_train_epochs`: 1
	- `warmup_ratio`: 0.1
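
The settings above, together with the 267 training samples, determine the number of optimizer steps and the learning-rate curve. The following sketch works through that arithmetic. It assumes a single device, no gradient accumulation, that the final partial batch is kept, that the warmup step count is rounded up, and the default `linear` scheduler; `lr_at` is a hypothetical helper, not a library function.

```python
import math

num_samples = 267
batch_size = 16        # per_device_train_batch_size
epochs = 1             # num_train_epochs
learning_rate = 2e-05
warmup_ratio = 0.1

# The last, partially filled batch still counts as a step.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * epochs
# Assumption: the warmup step count is the ratio times total steps, rounded up.
warmup_steps = math.ceil(total_steps * warmup_ratio)

def lr_at(step):
    """Learning rate after `step` optimizer steps: linear warmup, then linear decay."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = total_steps - step
    return learning_rate * max(0.0, remaining / max(1, total_steps - warmup_steps))

print(steps_per_epoch, warmup_steps)
for step in (0, 1, 2, 10, 17):
    print(step, lr_at(step))
```

Under these assumptions, the run has 17 steps, which is consistent with the 17 rows in the training log.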

	#### All Hyperparameters
	<details><summary>Click to expand</summary>

	- `overwrite_output_dir`: False
	- `do_predict`: False
	- `eval_strategy`: no
	- `prediction_loss_only`: True
	- `per_device_train_batch_size`: 16
	- `per_device_eval_batch_size`: 16
	- `per_gpu_train_batch_size`: None
	- `per_gpu_eval_batch_size`: None
	- `gradient_accumulation_steps`: 1
	- `eval_accumulation_steps`: None
	- `torch_empty_cache_steps`: None
	- `learning_rate`: 2e-05
	- `weight_decay`: 0.0
	- `adam_beta1`: 0.9
	- `adam_beta2`: 0.999
	- `adam_epsilon`: 1e-08
	- `max_grad_norm`: 1.0
	- `num_train_epochs`: 1
	- `max_steps`: -1
	- `lr_scheduler_type`: linear
	- `lr_scheduler_kwargs`: {}
	- `warmup_ratio`: 0.1
	- `warmup_steps`: 0
	- `log_level`: passive
	- `log_level_replica`: warning
	- `log_on_each_node`: True
	- `logging_nan_inf_filter`: True
	- `save_safetensors`: True
	- `save_on_each_node`: False
	- `save_only_model`: False
	- `restore_callback_states_from_checkpoint`: False
	- `no_cuda`: False
	- `use_cpu`: False
	- `use_mps_device`: False
	- `seed`: 42
	- `data_seed`: None
	- `jit_mode_eval`: False
	- `use_ipex`: False
	- `bf16`: False
	- `fp16`: False
	- `fp16_opt_level`: O1
	- `half_precision_backend`: auto
	- `bf16_full_eval`: False
	- `fp16_full_eval`: False
	- `tf32`: None
	- `local_rank`: 0
	- `ddp_backend`: None
	- `tpu_num_cores`: None
	- `tpu_metrics_debug`: False
	- `debug`: []
	- `dataloader_drop_last`: False
	- `dataloader_num_workers`: 0
	- `dataloader_prefetch_factor`: None
	- `past_index`: -1
	- `disable_tqdm`: False
	- `remove_unused_columns`: True
	- `label_names`: None
	- `load_best_model_at_end`: False
	- `ignore_data_skip`: False
	- `fsdp`: []
	- `fsdp_min_num_params`: 0
	- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
	- `tp_size`: 0
	- `fsdp_transformer_layer_cls_to_wrap`: None
	- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
	- `deepspeed`: None
	- `label_smoothing_factor`: 0.0
	- `optim`: adamw_torch
	- `optim_args`: None
	- `adafactor`: False
	- `group_by_length`: False
	- `length_column_name`: length
	- `ddp_find_unused_parameters`: None
	- `ddp_bucket_cap_mb`: None
	- `ddp_broadcast_buffers`: False
	- `dataloader_pin_memory`: True
	- `dataloader_persistent_workers`: False
	- `skip_memory_metrics`: True
	- `use_legacy_prediction_loop`: False
	- `push_to_hub`: False
	- `resume_from_checkpoint`: None
	- `hub_model_id`: None
	- `hub_strategy`: every_save
	- `hub_private_repo`: None
	- `hub_always_push`: False
	- `gradient_checkpointing`: False
	- `gradient_checkpointing_kwargs`: None
	- `include_inputs_for_metrics`: False
	- `include_for_metrics`: []
	- `eval_do_concat_batches`: True
	- `fp16_backend`: auto
	- `push_to_hub_model_id`: None
	- `push_to_hub_organization`: None
	- `mp_parameters`:
	- `auto_find_batch_size`: False
	- `full_determinism`: False
	- `torchdynamo`: None
	- `ray_scope`: last
	- `ddp_timeout`: 1800
	- `torch_compile`: False
	- `torch_compile_backend`: None
	- `torch_compile_mode`: None
	- `include_tokens_per_second`: False
	- `include_num_input_tokens_seen`: False
	- `neftune_noise_alpha`: None
	- `optim_target_modules`: None
	- `batch_eval_metrics`: False
	- `eval_on_start`: False
	- `use_liger_kernel`: False
	- `eval_use_gather_object`: False
	- `average_tokens_across_devices`: False
	- `prompts`: None
	- `batch_sampler`: batch_sampler
	- `multi_dataset_batch_sampler`: proportional

	</details>

	### Training Logs
	| Epoch  | Step | Training Loss |
	|:------:|:----:|:-------------:|
	| 0.0588 | 1    | 0.1007        |
	| 0.1176 | 2    | 0.1131        |
	| 0.1765 | 3    | 0.099         |
	| 0.2353 | 4    | 0.0867        |
	| 0.2941 | 5    | 0.0682        |
	| 0.3529 | 6    | 0.1019        |
	| 0.4118 | 7    | 0.0618        |
	| 0.4706 | 8    | 0.0623        |
	| 0.5294 | 9    | 0.0564        |
	| 0.5882 | 10   | 0.0521        |
	| 0.6471 | 11   | 0.0545        |
	| 0.7059 | 12   | 0.0335        |
	| 0.7647 | 13   | 0.0593        |
	| 0.8235 | 14   | 0.0381        |
	| 0.8824 | 15   | 0.0308        |
	| 0.9412 | 16   | 0.0487        |
	| 1.0    | 17   | 0.0398        |


	### Framework Versions
	- Python: 3.11.12
	- Sentence Transformers: 3.4.1
	- Transformers: 4.51.3
	- PyTorch: 2.6.0+cu124
	- Accelerate: 1.5.2
	- Datasets: 3.5.0
	- Tokenizers: 0.21.1

	## Citation

	### BibTeX

	#### Sentence Transformers
	```bibtex
	@inproceedings{reimers-2019-sentence-bert,
	    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
	    author = "Reimers, Nils and Gurevych, Iryna",
	    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
	    month = "11",
	    year = "2019",
	    publisher = "Association for Computational Linguistics",
	    url = "https://arxiv.org/abs/1908.10084",
	}
	```

	#### ContrastiveLoss
	```bibtex
	@inproceedings{hadsell2006dimensionality,
	    author={Hadsell, R. and Chopra, S. and LeCun, Y.},
	    booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
	    title={Dimensionality Reduction by Learning an Invariant Mapping},
	    year={2006},
	    volume={2},
	    number={},
	    pages={1735-1742},
	    doi={10.1109/CVPR.2006.100}
	}
	```

	<!--
	## Glossary

	Clearly define terms in order to be accessible across audiences.
	-->

	<!--
	## Model Card Authors

	Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.
	-->

	<!--
	## Model Card Contact

	Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.
	-->