MMContext Model Information
This model uses a custom MMContextEncoder architecture for multimodal embedding generation, combining text and omics data representations.
⚠️ Important: Loading Instructions
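This model requires trust_remote_code=True to load properly, because the MMContextEncoder architecture is implemented in Python files shipped with this repository (see Files in this Repository) rather than in the sentence-transformers library itself.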
from sentence_transformers import SentenceTransformer
# ✅ CORRECT: Load with trust_remote_code=True
model = SentenceTransformer('jo-mengr/mmcontext-pubmedbert-gs10k-cxg', trust_remote_code=True)
# Generate embeddings
texts = ["Cell type annotation", "Another description"]
embeddings = model.encode(texts)
print(f"Embeddings shape: {embeddings.shape}")
Model Details
- Architecture: MMContextEncoder (custom multimodal architecture)
- Text Encoder: NeuML/pubmedbert-base-embeddings
- Omics Embedding Method: gs10k
- Output Dimension: 2048
- Pooling Strategy: mean
Omics Embedding Method: GS10K
Gene Set enrichment-based embeddings (10k genes)
Usage Tutorial
📓 Tutorial Notebook: usage_tutorial.ipynb - Detailed usage examples and best practices
Model Architecture
The MMContextEncoder combines:
- Text Branch: NeuML/pubmedbert-base-embeddings with optional adapter layers
- Omics Branch: Lookup-based encoder with precomputed gs10k embeddings
- Adapters: Feed-forward projection layers for dimensionality alignment
- Pooling: mean pooling for sentence-level embeddings
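To make the pooling step concrete, here is a minimal, dependency-free sketch of masked mean pooling. It only illustrates the idea of averaging non-padding token vectors; the function name `mean_pool` and the toy inputs are assumptions for illustration, and the actual implementation lives in the Pooling module of the model.

```python
from typing import List


def mean_pool(token_vectors: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1; padding tokens are skipped."""
    dim = len(token_vectors[0])
    totals = [0.0] * dim
    kept = 0
    for vector, keep in zip(token_vectors, attention_mask):
        if keep:
            kept += 1
            for i, value in enumerate(vector):
                totals[i] += value
    if kept == 0:
        raise ValueError("attention_mask selects no tokens")
    return [total / kept for total in totals]


# Toy example: two real tokens and one padding token (hypothetical 2-dimensional vectors).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```

The padding vector is ignored, so the result depends only on the tokens the mask marks as real.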
Files in this Repository
- mmcontextencoder.py: Main model implementation
- adapters.py: Adapter modules for dimensionality mapping
- omicsencoder.py: Omics data encoder
- onehot.py: One-hot text encoder
- file_utils.py: Utility functions
- usage_tutorial.ipynb: Tutorial notebook with usage examples
Training Details
- Text Encoder: NeuML/pubmedbert-base-embeddings
- Embedding Method: gs10k
- Output Dimension: 2048
- Training Datasets: 2 datasets
- Text-only Datasets: 0 (None)
- Numeric Datasets: 2 (cellxgene_pseudo_bulk_full, geo_half)
- Batch Size: 512
- Learning Rate: 2e-05
- Training Epochs: 16
This model was trained using the MMContext framework for multimodal single-cell analysis.
SentenceTransformer based on NeuML/pubmedbert-base-embeddings
This is a sentence-transformers model finetuned from NeuML/pubmedbert-base-embeddings on the cellxgene_pseudo_bulk_full_cell_sentence_1_caption and geo_half_cell_sentence_1_caption datasets. It maps sentences & paragraphs to a 2048-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: NeuML/pubmedbert-base-embeddings
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 2048 dimensions
- Similarity Function: Cosine Similarity
- Training Datasets: cellxgene_pseudo_bulk_full_cell_sentence_1_caption, geo_half_cell_sentence_1_caption
- Language: code
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): MMContextEncoder(
    (text_encoder): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(30522, 768, padding_idx=0)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSdpaSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
    (text_adapter): AdapterModule(
      (net): Sequential(
        (0): Linear(in_features=768, out_features=1024, bias=True)
        (1): ReLU(inplace=True)
        (2): Linear(in_features=1024, out_features=2048, bias=True)
        (3): BatchNorm1d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (pooling): Pooling({'word_embedding_dimension': 2048, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
    (omics_adapter): AdapterModule(
      (net): Sequential(
        (0): Linear(in_features=10000, out_features=1024, bias=True)
        (1): ReLU(inplace=True)
        (2): Linear(in_features=1024, out_features=2048, bias=True)
        (3): BatchNorm1d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (omics_encoder): MiniOmicsModel(
      (embeddings): Embedding(726794, 10000, padding_idx=0)
    )
  )
)
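The layer shapes in this printout are enough to work out how large the two adapters and the omics lookup table are. The sketch below does that arithmetic in plain Python. The helper names are hypothetical, and the 4 bytes per value used for the storage estimate is an assumption: the card does not state the dtype or how the omics table is stored.

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of a Linear layer with bias=True."""
    return in_features * out_features + out_features


def batchnorm_params(num_features: int) -> int:
    """affine=True gives one scale and one shift per feature (running stats are buffers)."""
    return 2 * num_features


def adapter_params(in_features: int, hidden: int = 1024, out_features: int = 2048) -> int:
    """Linear -> ReLU -> Linear -> BatchNorm1d, with the sizes shown in the printout."""
    return (
        linear_params(in_features, hidden)
        + linear_params(hidden, out_features)
        + batchnorm_params(out_features)
    )


text_adapter = adapter_params(768)       # 768 -> 1024 -> 2048
omics_adapter = adapter_params(10000)    # 10000 -> 1024 -> 2048
omics_table_entries = 726794 * 10000     # Embedding(726794, 10000)

ASSUMED_BYTES_PER_VALUE = 4  # assumption: 32-bit floats; not stated in the card

print(text_adapter)         # 2890752
print(omics_adapter)        # 12344320
print(omics_table_entries)  # 7267940000
print(omics_table_entries * ASSUMED_BYTES_PER_VALUE / 1e9)  # size in GB under the assumption
```

Under these numbers the lookup table dwarfs both adapters, so memory planning for the omics branch is dominated by the precomputed embeddings rather than by the trainable projection layers.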
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
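# Download from the 🤗 Hub (trust_remote_code=True is required for the custom MMContextEncoder)
model = SentenceTransformer("jo-mengr/mmcontext-pubmedbert-gs10k-cxg", trust_remote_code=True)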
# Run inference
sentences = [
'sample_idx:census_367b55f4-d543-49aa-90e8-4765fcb8c687_969',
"This measurement was conducted with 10x 3' v3. Neuron cell type from the thalamic complex, specifically the centromedian and parafasicular nuclei (CM and Pf), derived from a 42-year old male.",
"This measurement was conducted with 10x 3' v3. Neuron cell type from a 42-year-old male, specifically from the thalamic complex with thalamic excitatory supercluster term, corresponding to the Thalamus (THM) - intralaminar nuclear complex (ILN) - posterior group of intralaminar nuclei (PILN) - centromedian and parafasicular nuclei - CM and Pf dissection.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 2048]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5200, 0.4699],
# [0.5200, 1.0000, 0.8290],
# [0.4699, 0.8290, 1.0000]])
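The similarity scores printed above are cosine similarities, the similarity function listed under Model Description. As a rough illustration of what that matrix contains, here is a dependency-free sketch; the helper names and the toy 2-dimensional vectors are assumptions for illustration and are not model outputs.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of a and b divided by the product of their Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise cosine similarities; entry [i][j] compares vectors i and j."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy stand-ins for three embeddings (hypothetical values).
toy_embeddings = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
for row in similarity_matrix(toy_embeddings):
    print([round(value, 4) for value in row])
```

The matrix is symmetric with ones on the diagonal, which matches the shape of the output shown for the real embeddings.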
Evaluation
Metrics
Triplet
- Datasets: cellxgene_pseudo_bulk_full_cell_sentence_1_caption and geo_half_cell_sentence_1_caption
- Evaluated with TripletEvaluator
| Metric | cellxgene_pseudo_bulk_full_cell_sentence_1_caption | geo_half_cell_sentence_1_caption |
|---|---|---|
| cosine_accuracy | 0.9267 | 0.7779 |
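For readers unfamiliar with the metric, cosine accuracy on triplets is commonly computed as the fraction of triplets in which the anchor is more similar to the positive than to the negative. The sketch below shows that computation on toy vectors. It is a simplified illustration, not the TripletEvaluator code, and how the two negative columns (negative_1, negative_2) of these datasets are combined during evaluation is not stated in this card.

```python
import math
from typing import Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def cosine_accuracy(triplets: Sequence[Tuple[Vector, Vector, Vector]]) -> float:
    """Share of (anchor, positive, negative) triplets where the positive scores higher."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative)
    )
    return correct / len(triplets)


# Hypothetical embeddings: the first triplet is ranked correctly, the second is not.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),
    ([0.0, 1.0], [1.0, 0.0], [0.1, 0.9]),
]
print(cosine_accuracy(toy_triplets))  # 0.5
```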
Training Details
Training Datasets
cellxgene_pseudo_bulk_full_cell_sentence_1_caption
- Dataset: cellxgene_pseudo_bulk_full_cell_sentence_1_caption at 55717c1
- Size: 306,003 training samples
- Columns: anchor, positive, negative_1, and negative_2
- Approximate statistics based on the first 1000 samples:
| | anchor | positive | negative_1 | negative_2 |
|---|---|---|---|---|
| type | string | string | string | string |
| details | min: 56 characters, mean: 58.69 characters, max: 60 characters | min: 22 tokens, mean: 47.33 tokens, max: 165 tokens | min: 22 tokens, mean: 49.45 tokens, max: 120 tokens | min: 56 characters, mean: 58.63 characters, max: 59 characters |
- Samples:
| anchor | positive | negative_1 | negative_2 |
|---|---|---|---|
| sample_idx:census_9d5df009-eb76-43a3-b6cd-22017cc53700_231 | This measurement was conducted with 10x 3' v3. Gut endothelial cell derived from proximal colon of a male human fetus at 13th week post-fertilization stage. | This measurement was conducted with 10x 3' v3. Mesothelial cell derived from the proximal colon of a male human at 23rd week post-fertilization stage. | sample_idx:census_9d5df009-eb76-43a3-b6cd-22017cc53700_521 |
| sample_idx:census_367b55f4-d543-49aa-90e8-4765fcb8c687_132 | This measurement was conducted with 10x 3' v3. Sample is an oligodendrocyte cell from a 29-year-old male human, specifically from the thalamic complex, with European self-reported ethnicity. | This measurement was conducted with 10x 3' v3. Neuron cell type from the thalamic complex, specifically the centromedian and parafasicular nuclei (CM and Pf), derived from a 42-year old male human donor. | sample_idx:census_367b55f4-d543-49aa-90e8-4765fcb8c687_134 |
| sample_idx:census_1e6a6ef9-7ec9-4c90-bbfb-2ad3c3165fd1_9964 | This measurement was conducted with Smart-seq2. Neutrophil cell type derived from the lung tissue of a 37-year old male with advanced stage non-small cell lung cancer (NSCLC), stage IV, who has never smoked. The cells exhibit an ALK mutation, with no mutations detected in BRAF, EGFR, ERBB2, KRAS, ROS, or TP53. | This measurement was conducted with 10x 3' v2. Myeloid cell derived from the lung tissue of a 65-year old male, located in normal adjacent tissue, with advanced non-small cell lung cancer (NSCLC), stage III. | sample_idx:census_1e6a6ef9-7ec9-4c90-bbfb-2ad3c3165fd1_972 |
- Loss: mmcontext.utils.PerDatasetLossLogger
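In the sample rows, the anchor and negative_2 columns hold identifiers that start with `sample_idx:`, while positive and negative_1 hold free-text captions. A minimal sketch of separating the two kinds of input by that prefix is shown below. How MMContextEncoder actually routes inputs between its text and omics branches is not documented in this card, so the prefix check and the function name `split_by_modality` are assumptions for illustration only.

```python
from typing import List, Sequence, Tuple

OMICS_PREFIX = "sample_idx:"  # prefix seen in the anchor and negative_2 columns


def split_by_modality(inputs: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Return (omics sample IDs without the prefix, plain-text captions)."""
    omics_ids: List[str] = []
    captions: List[str] = []
    for item in inputs:
        if item.startswith(OMICS_PREFIX):
            omics_ids.append(item[len(OMICS_PREFIX):])
        else:
            captions.append(item)
    return omics_ids, captions


batch = [
    "sample_idx:census_9d5df009-eb76-43a3-b6cd-22017cc53700_231",
    "This measurement was conducted with 10x 3' v3. Gut endothelial cell derived from "
    "proximal colon of a male human fetus at 13th week post-fertilization stage.",
]
ids, texts = split_by_modality(batch)
print(ids)         # ['census_9d5df009-eb76-43a3-b6cd-22017cc53700_231']
print(len(texts))  # 1
```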
geo_half_cell_sentence_1_caption
- Dataset: geo_half_cell_sentence_1_caption at bc13ae5
- Size: 348,046 training samples
- Columns: anchor, positive, negative_1, and negative_2
- Approximate statistics based on the first 1000 samples:
| | anchor | positive | negative_1 | negative_2 |
|---|---|---|---|---|
| type | string | string | string | string |
| details | min: 20 characters, mean: 20.0 characters, max: 20 characters | min: 17 tokens, mean: 36.82 tokens, max: 130 tokens | min: 19 tokens, mean: 34.25 tokens, max: 88 tokens | min: 20 characters, mean: 20.0 characters, max: 20 characters |
- Samples:
| anchor | positive | negative_1 | negative_2 |
|---|---|---|---|
| sample_idx:SRX173216 | This measurement was conducted with Illumina HiSeq 2000. B-cells from individual GM12004, assayed using global run-on technique. These are primary cells, with no reported treatment. | This measurement was conducted with Illumina HiSeq 2000. 48 hour Activin treatment of H1 embryonic stem cells. | sample_idx:SRX189728 |
| sample_idx:SRX185041 | This measurement was conducted with Illumina HiSeq 2000. 1000 ng of fragmented total RNA from a cultured chronic myelogenous leukemia (CML) cell line, specifically the human CML cell line K-562. This cell line is derived from a female hematological system disease, specifically a lymphoid neoplasm (leukemia) known as C.M.L., which is a type of neoplasm affecting the bone marrow. The sample has not undergone any treatment. | This measurement was conducted with Illumina HiSeq 2000. 1000 ng of fragmented total RNA from a cultured female human Chronic Myelogenous Leukemia (CML) cell line, K-562, which was grown in tissue culture. The sample has not received any treatment. | sample_idx:SRX185051 |
| sample_idx:SRX185046 | This measurement was conducted with Illumina HiSeq 2000. 1000 ng of fragmented total RNA from a cultured female human Chronic Myelogenous Leukemia (CML) cell line, K-562, which was grown in tissue culture. The sample has not received any treatment. | This measurement was conducted with Illumina HiSeq 2000. 1000 ng of fragmented total RNA from a cultured chronic myelogenous leukemia (CML) cell line, specifically the human CML cell line K-562. This cell line is derived from a female hematological system disease, specifically a lymphoid neoplasm (leukemia) known as C.M.L., which is a type of neoplasm affecting the bone marrow. The sample has not undergone any treatment. | sample_idx:SRX185051 |
- Loss: mmcontext.utils.PerDatasetLossLogger
Evaluation Datasets
cellxgene_pseudo_bulk_full_cell_sentence_1_caption
- Dataset: cellxgene_pseudo_bulk_full_cell_sentence_1_caption at 55717c1
- Size: 33,937 evaluation samples
- Columns: anchor, positive, negative_1, and negative_2
- Approximate statistics based on the first 1000 samples:
| | anchor | positive | negative_1 | negative_2 |
|---|---|---|---|---|
| type | string | string | string | string |
| details | min: 56 characters, mean: 58.7 characters, max: 60 characters | min: 21 tokens, mean: 47.32 tokens, max: 147 tokens | min: 21 tokens, mean: 44.03 tokens, max: 88 tokens | min: 56 characters, mean: 58.76 characters, max: 60 characters |
- Samples:
| anchor | positive | negative_1 | negative_2 |
|---|---|---|---|
| sample_idx:census_7db0c178-b0a4-442f-ba54-e9e1633a84bb_763 | This measurement was conducted with 10x 3' v3. Oligodendrocyte cell sample taken from the cerebral cortex (Cx) of a 42-year-old male, specifically from the human A43 region. | This measurement was conducted with 10x 3' v3. Neuron cell type from a 50-year old male, specifically an MGE interneuron, located in the cerebral cortex, parietal operculum, gustatory cortex, A43 region. | sample_idx:census_7db0c178-b0a4-442f-ba54-e9e1633a84bb_541 |
| sample_idx:census_1e6a6ef9-7ec9-4c90-bbfb-2ad3c3165fd1_18907 | This measurement was conducted with 10x 3' v2. Endothelial cell, specifically a vein endothelial cell, derived from normal adjacent lung tissue of a 71-year-old female patient with early stage NSCLC (stage I) who has a history of smoking. | This measurement was conducted with 10x 3' v2. Endothelial cell derived from the lymphatic vessel of a 69-year-old male with early stage non-small cell lung cancer (NSCLC), stage II. The patient has a history of smoking and the cell was obtained from the primary tumor site. | sample_idx:census_1e6a6ef9-7ec9-4c90-bbfb-2ad3c3165fd1_16402 |
| sample_idx:census_fd072bc3-2dfb-46f8-b4e3-467cb3223182_3695 | This measurement was conducted with 10x 3' v2. Endothelial cells collected from the spleen of a male human fetus at 15 weeks post-fertilization. | This measurement was conducted with 10x 5' v1. A native cell from the skin of a female human fetus at 11 weeks post-fertilization, identified as a doublet of endothelial and erythrocyte lineage. | sample_idx:census_fd072bc3-2dfb-46f8-b4e3-467cb3223182_6661 |
- Loss: mmcontext.utils.PerDatasetLossLogger
geo_half_cell_sentence_1_caption
- Dataset: geo_half_cell_sentence_1_caption at bc13ae5
- Size: 38,807 evaluation samples
- Columns: anchor, positive, negative_1, and negative_2
- Approximate statistics based on the first 1000 samples:
| | anchor | positive | negative_1 | negative_2 |
|---|---|---|---|---|
| type | string | string | string | string |
| details | min: 20 characters, mean: 20.19 characters, max: 21 characters | min: 16 tokens, mean: 37.63 tokens, max: 119 tokens | min: 16 tokens, mean: 54.71 tokens, max: 111 tokens | min: 20 characters, mean: 20.04 characters, max: 21 characters |
- Samples:
| anchor | positive | negative_1 | negative_2 |
|---|---|---|---|
| sample_idx:SRX185061 | This measurement was conducted with Illumina HiSeq 2000. 1000 ng of fragmented total RNA from a cultured chronic myelogenous leukemia (CML) cell line (K-562) derived from a female hematological system disease (CML). The cells were grown in tissue culture and have undergone Ribo-Zero treatment. | This measurement was conducted with Illumina HiSeq 1000. The sample is a cell line (OCI-LY1) derived from a diffuse large B-cell lymphoma (DLBCL), a type of non-Hodgkin lymphoma that affects the lymphatic system. The cells have been treated with siNT (a non-coding siRNA) for 48 hours. | sample_idx:SRX188847 |
| sample_idx:SRX185895 | This measurement was conducted with Illumina HiSeq 1000. The sample is a cell line (OCI-LY1) derived from a diffuse large B-cell lymphoma (DLBCL), a type of non-Hodgkin lymphoma that affects the lymphatic system. The cells have been treated with siNT (a non-coding siRNA) for 48 hours. | This measurement was conducted with Illumina HiSeq 2000. 1000 ng of fragmented total RNA from a cultured chronic myelogenous leukemia (CML) cell line (K-562) derived from a female hematological system disease (CML). The cells were grown in tissue culture and have undergone Ribo-Zero treatment. | sample_idx:SRX188847 |
| sample_idx:SRX188847 | This measurement was conducted with Illumina HiSeq 2000. 48-hour Activin-treated H1 embryonic stem cells. | This measurement was conducted with Illumina HiSeq 2000. 1000 ng of fragmented total RNA from a cultured chronic myelogenous leukemia (CML) cell line (K-562) derived from a female hematological system disease (CML). The cells were grown in tissue culture and have undergone Ribo-Zero treatment. | sample_idx:SRX185895 |
- Loss: mmcontext.utils.PerDatasetLossLogger
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 512
- per_device_eval_batch_size: 512
- learning_rate: 2e-05
- num_train_epochs: 16
- warmup_ratio: 0.1
- bf16: True
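With learning_rate 2e-05, warmup_ratio 0.1, and the linear scheduler listed under All Hyperparameters, the learning rate ramps up linearly and then decays linearly to zero. The sketch below reproduces that shape in plain Python. The total of 10224 steps is an assumption inferred from the training logs in this card (step 10200 is logged at epoch 15.9624, which suggests about 639 steps per epoch over 16 epochs), and the function is an illustration of the schedule shape, not the Trainer's own code.

```python
import math

BASE_LR = 2e-05
WARMUP_RATIO = 0.1
TOTAL_STEPS = 10224  # assumption: 16 epochs x 639 steps, inferred from the training logs


def linear_schedule_lr(
    step: int,
    total_steps: int = TOTAL_STEPS,
    base_lr: float = BASE_LR,
    warmup_ratio: float = WARMUP_RATIO,
) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(math.ceil(TOTAL_STEPS * WARMUP_RATIO))  # 1023 warmup steps under the assumed total
for step in (0, 500, 1023, 5000, 10224):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the peak learning rate is reached at roughly epoch 1.6, which lines up with the region of the logs where the training loss is still falling quickly.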
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 512
- per_device_eval_batch_size: 512
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 16
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}
Training Logs
| Epoch | Step | Training Loss | cellxgene pseudo bulk full cell sentence 1 caption loss | geo half cell sentence 1 caption loss | cellxgene_pseudo_bulk_full_cell_sentence_1_caption_cosine_accuracy | geo_half_cell_sentence_1_caption_cosine_accuracy |
|---|---|---|---|---|---|---|
| 0.1565 | 100 | 12.5906 | 11.7384 | 11.5454 | 0.5396 | 0.5698 |
| 0.3130 | 200 | 10.2096 | 9.0530 | 9.5828 | 0.6243 | 0.6534 |
| 0.4695 | 300 | 7.6353 | 6.9997 | 10.4301 | 0.7196 | 0.6659 |
| 0.6260 | 400 | 5.8509 | 6.0588 | 11.4104 | 0.7559 | 0.6726 |
| 0.7825 | 500 | 5.0719 | 5.9171 | 9.5449 | 0.7819 | 0.6830 |
| 0.9390 | 600 | 4.5624 | 5.8857 | 8.4883 | 0.8028 | 0.6891 |
| 1.0955 | 700 | 4.0923 | 6.2152 | 8.6582 | 0.8196 | 0.6938 |
| 1.2520 | 800 | 3.8628 | 6.0706 | 8.5598 | 0.8284 | 0.6975 |
| 1.4085 | 900 | 3.628 | 5.3656 | 8.8950 | 0.8399 | 0.6946 |
| 1.5649 | 1000 | 3.4405 | 5.4802 | 8.5998 | 0.8481 | 0.7067 |
| 1.7214 | 1100 | 3.2963 | 5.1315 | 10.0197 | 0.8540 | 0.7054 |
| 1.8779 | 1200 | 3.1722 | 5.2284 | 8.7489 | 0.8592 | 0.7137 |
| 2.0344 | 1300 | 3.0163 | 5.0619 | 9.2792 | 0.8658 | 0.7159 |
| 2.1909 | 1400 | 2.8775 | 5.1384 | 9.0896 | 0.8646 | 0.7224 |
| 2.3474 | 1500 | 2.7711 | 5.1064 | 9.1905 | 0.8721 | 0.7253 |
| 2.5039 | 1600 | 2.7079 | 4.8112 | 10.1273 | 0.8813 | 0.7219 |
| 2.6604 | 1700 | 2.6186 | 4.7861 | 9.4985 | 0.8853 | 0.7281 |
| 2.8169 | 1800 | 2.5678 | 4.7777 | 10.2713 | 0.8852 | 0.7266 |
| 2.9734 | 1900 | 2.5017 | 4.7162 | 9.4962 | 0.8851 | 0.7316 |
| 3.1299 | 2000 | 2.3918 | 5.1905 | 9.3873 | 0.8901 | 0.7330 |
| 3.2864 | 2100 | 2.3317 | 4.6239 | 10.0186 | 0.8941 | 0.7358 |
| 3.4429 | 2200 | 2.2891 | 4.6311 | 10.1845 | 0.8981 | 0.7357 |
| 3.5994 | 2300 | 2.2377 | 4.6785 | 9.6080 | 0.8948 | 0.7391 |
| 3.7559 | 2400 | 2.195 | 4.5995 | 9.9979 | 0.8983 | 0.7379 |
| 3.9124 | 2500 | 2.1639 | 4.8047 | 9.5234 | 0.9012 | 0.7401 |
| 4.0689 | 2600 | 2.1021 | 4.5940 | 10.8647 | 0.9017 | 0.7383 |
| 4.2254 | 2700 | 2.0352 | 4.8231 | 11.8600 | 0.9032 | 0.7369 |
| 4.3818 | 2800 | 2.0181 | 4.5308 | 10.6125 | 0.9016 | 0.7451 |
| 4.5383 | 2900 | 1.9876 | 4.6364 | 10.0304 | 0.9030 | 0.7434 |
| 4.6948 | 3000 | 1.9577 | 4.4448 | 10.7185 | 0.9058 | 0.7436 |
| 4.8513 | 3100 | 1.9296 | 4.3481 | 9.8166 | 0.9094 | 0.7472 |
| 5.0078 | 3200 | 1.9033 | 4.3892 | 10.2061 | 0.9101 | 0.7497 |
| 5.1643 | 3300 | 1.8417 | 4.6142 | 11.5901 | 0.9115 | 0.7423 |
| 5.3208 | 3400 | 1.8285 | 4.4177 | 10.9317 | 0.9125 | 0.7457 |
| 5.4773 | 3500 | 1.8018 | 4.8212 | 10.3529 | 0.9120 | 0.7492 |
| 5.6338 | 3600 | 1.7885 | 4.3228 | 10.3339 | 0.9115 | 0.7505 |
| 5.7903 | 3700 | 1.7715 | 4.5077 | 11.2075 | 0.9127 | 0.7502 |
| 5.9468 | 3800 | 1.7487 | 4.5082 | 11.5899 | 0.9144 | 0.7488 |
| 6.1033 | 3900 | 1.7096 | 4.4220 | 11.2094 | 0.9138 | 0.7528 |
| 6.2598 | 4000 | 1.6839 | 4.4874 | 11.3842 | 0.9147 | 0.7531 |
| 6.4163 | 4100 | 1.6681 | 4.5076 | 10.1323 | 0.9151 | 0.7552 |
| 6.5728 | 4200 | 1.6587 | 4.3866 | 10.9080 | 0.9176 | 0.7562 |
| 6.7293 | 4300 | 1.6442 | 4.5102 | 10.4440 | 0.9164 | 0.7567 |
| 6.8858 | 4400 | 1.633 | 4.3827 | 10.3677 | 0.9165 | 0.7566 |
| 7.0423 | 4500 | 1.6106 | 4.3742 | 10.4875 | 0.9168 | 0.7580 |
| 7.1987 | 4600 | 1.5724 | 4.3776 | 11.0099 | 0.9189 | 0.7599 |
| 7.3552 | 4700 | 1.569 | 4.4713 | 11.4858 | 0.9180 | 0.7596 |
| 7.5117 | 4800 | 1.5568 | 5.0983 | 12.8797 | 0.9180 | 0.7552 |
| 7.6682 | 4900 | 1.5504 | 4.5575 | 11.9515 | 0.9207 | 0.7568 |
| 7.8247 | 5000 | 1.5457 | 4.4049 | 11.0920 | 0.9176 | 0.7613 |
| 7.9812 | 5100 | 1.5289 | 4.3517 | 10.3765 | 0.9213 | 0.7654 |
| 8.1377 | 5200 | 1.4916 | 4.7450 | 10.8892 | 0.9190 | 0.7654 |
| 8.2942 | 5300 | 1.4827 | 4.3461 | 10.9710 | 0.9216 | 0.7649 |
| 8.4507 | 5400 | 1.4764 | 4.4988 | 11.6965 | 0.9214 | 0.7625 |
| 8.6072 | 5500 | 1.4695 | 4.7644 | 11.0618 | 0.9209 | 0.7660 |
| 8.7637 | 5600 | 1.4677 | 4.2997 | 10.9774 | 0.9214 | 0.7679 |
| 8.9202 | 5700 | 1.4601 | 4.5204 | 11.7339 | 0.9215 | 0.7635 |
| 9.0767 | 5800 | 1.4438 | 4.3946 | 10.8926 | 0.9216 | 0.7673 |
| 9.2332 | 5900 | 1.4122 | 4.4707 | 10.6763 | 0.9210 | 0.7685 |
| 9.3897 | 6000 | 1.4144 | 4.3983 | 10.6737 | 0.9219 | 0.7697 |
| 9.5462 | 6100 | 1.4151 | 4.5337 | 11.6132 | 0.9237 | 0.7665 |
| 9.7027 | 6200 | 1.4105 | 4.2718 | 10.9075 | 0.9240 | 0.7672 |
| 9.8592 | 6300 | 1.3998 | 4.2386 | 10.5633 | 0.9222 | 0.7709 |
| 10.0156 | 6400 | 1.3962 | 4.4400 | 11.2977 | 0.9221 | 0.7697 |
| 10.1721 | 6500 | 1.364 | 4.4761 | 10.8477 | 0.9238 | 0.7724 |
| 10.3286 | 6600 | 1.3633 | 4.3151 | 10.7404 | 0.9234 | 0.7757 |
| 10.4851 | 6700 | 1.363 | 4.4643 | 11.0058 | 0.9241 | 0.7738 |
| 10.6416 | 6800 | 1.3501 | 4.3482 | 11.3079 | 0.9241 | 0.7723 |
| 10.7981 | 6900 | 1.3587 | 5.5119 | 13.5821 | 0.9244 | 0.7640 |
| 10.9546 | 7000 | 1.3479 | 4.8425 | 11.0586 | 0.9239 | 0.7721 |
| 11.1111 | 7100 | 1.3279 | 4.5528 | 11.7557 | 0.9247 | 0.7712 |
| 11.2676 | 7200 | 1.3199 | 4.5110 | 10.8992 | 0.9246 | 0.7747 |
| 11.4241 | 7300 | 1.3248 | 5.6350 | 13.8206 | 0.9253 | 0.7626 |
| 11.5806 | 7400 | 1.3126 | 4.6457 | 12.2081 | 0.9259 | 0.7708 |
| 11.7371 | 7500 | 1.3021 | 4.2967 | 10.8623 | 0.9239 | 0.7752 |
| 11.8936 | 7600 | 1.3171 | 4.6930 | 12.1421 | 0.9258 | 0.7699 |
| 12.0501 | 7700 | 1.2997 | 4.4855 | 11.8060 | 0.9253 | 0.7713 |
| 12.2066 | 7800 | 1.2898 | 4.8945 | 11.2709 | 0.9251 | 0.7724 |
| 12.3631 | 7900 | 1.2815 | 4.2959 | 10.8704 | 0.9251 | 0.7745 |
| 12.5196 | 8000 | 1.2806 | 4.7454 | 11.4730 | 0.9262 | 0.7752 |
| 12.6761 | 8100 | 1.2775 | 5.5256 | 13.7529 | 0.9258 | 0.7672 |
| 12.8326 | 8200 | 1.279 | 4.4437 | 11.5125 | 0.9260 | 0.7758 |
| 12.9890 | 8300 | 1.2757 | 4.6046 | 11.4459 | 0.9260 | 0.7752 |
| 13.1455 | 8400 | 1.2596 | 5.1542 | 13.1028 | 0.9262 | 0.7710 |
| 13.3020 | 8500 | 1.255 | 4.5680 | 11.7233 | 0.9261 | 0.7750 |
| 13.4585 | 8600 | 1.2577 | 4.5955 | 11.9271 | 0.9262 | 0.7742 |
| 13.6150 | 8700 | 1.2542 | 5.6417 | 13.8272 | 0.9264 | 0.7679 |
| 13.7715 | 8800 | 1.2475 | 4.5411 | 11.3280 | 0.9260 | 0.7774 |
| 13.9280 | 8900 | 1.2568 | 4.5738 | 11.4883 | 0.9261 | 0.7768 |
| 14.0845 | 9000 | 1.2506 | 4.3740 | 11.1937 | 0.9268 | 0.7770 |
| 14.2410 | 9100 | 1.2463 | 4.6670 | 12.0231 | 0.9265 | 0.7743 |
| 14.3975 | 9200 | 1.2444 | 4.3910 | 11.2980 | 0.9271 | 0.7761 |
| 14.5540 | 9300 | 1.2361 | 4.8996 | 12.6156 | 0.9272 | 0.7732 |
| 14.7105 | 9400 | 1.2377 | 4.5226 | 11.4314 | 0.9263 | 0.7767 |
| 14.8670 | 9500 | 1.2315 | 4.4239 | 11.0899 | 0.9266 | 0.7785 |
| 15.0235 | 9600 | 1.2244 | 4.5566 | 11.7982 | 0.9266 | 0.7746 |
| 15.1800 | 9700 | 1.2233 | 4.4733 | 11.7135 | 0.9274 | 0.7765 |
| 15.3365 | 9800 | 1.2229 | 4.5923 | 12.0170 | 0.9272 | 0.7748 |
| 15.4930 | 9900 | 1.2317 | 4.5229 | 11.5002 | 0.9270 | 0.7768 |
| 15.6495 | 10000 | 1.2208 | 4.4253 | 11.3559 | 0.9270 | 0.7772 |
| 15.8059 | 10100 | 1.2275 | 4.5132 | 11.3865 | 0.9268 | 0.7767 |
| 15.9624 | 10200 | 1.2302 | 4.4903 | 11.1328 | 0.9267 | 0.7779 |
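The Epoch and Step columns can be cross-checked against the dataset sizes. The sketch below derives the number of steps per epoch from the last log row and compares it with the number of batches that the two training sets would produce at different effective batch sizes. The per-dataset batching with a partial final batch is an assumption (it is consistent with dataloader_drop_last: False and the proportional multi-dataset sampler), and the conclusion about the effective batch size is an inference from the logs, not something the card states.

```python
import math

TRAIN_SIZES = {
    "cellxgene_pseudo_bulk_full_cell_sentence_1_caption": 306_003,
    "geo_half_cell_sentence_1_caption": 348_046,
}
PER_DEVICE_BATCH = 512


def steps_per_epoch_from_log(epoch: float, step: int) -> int:
    """Estimate steps per epoch from one (epoch, step) pair of the training log."""
    return round(step / epoch)


def batches_per_epoch(effective_batch: int) -> int:
    """Batches per epoch if each dataset is batched separately, keeping partial batches."""
    return sum(math.ceil(size / effective_batch) for size in TRAIN_SIZES.values())


observed = steps_per_epoch_from_log(15.9624, 10200)
print(observed)  # 639

# Hypothetical device counts; the card does not say how many devices were used.
for devices in (1, 2, 4):
    print(devices, batches_per_epoch(PER_DEVICE_BATCH * devices))
```

Of the device counts tried here, only an effective batch of 1024 (for instance two devices at 512 each) reproduces the 639 steps per epoch seen in the log, giving about 10224 optimizer steps over 16 epochs.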
Framework Versions
- Python: 3.12.8
- Sentence Transformers: 5.1.2
- Transformers: 4.57.2
- PyTorch: 2.9.1+cu128
- Accelerate: 1.12.0
- Datasets: 3.6.0
- Tokenizers: 0.22.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Evaluation results
- Cosine Accuracy on cellxgene pseudo bulk full cell sentence 1 caption (self-reported): 0.927
- Cosine Accuracy on geo half cell sentence 1 caption (self-reported): 0.778