MMContext Model Information

This model uses a custom MMContextEncoder architecture for multimodal embedding generation, combining text and omics data representations.

⚠️ Important: Loading Instructions

The MMContextEncoder architecture is implemented in Python files shipped with this repository, so the model only loads when trust_remote_code=True is passed.

from sentence_transformers import SentenceTransformer

# ✅ CORRECT: Load with trust_remote_code=True
model = SentenceTransformer('jo-mengr/mmcontext-pubmedbert-gs10k-cxg', trust_remote_code=True)

# Generate embeddings
texts = ["Cell type annotation", "Another description"]
embeddings = model.encode(texts)
print(f"Embeddings shape: {embeddings.shape}")

Model Details

  • Architecture: MMContextEncoder (custom multimodal architecture)
  • Text Encoder: NeuML/pubmedbert-base-embeddings
  • Omics Embedding Method: gs10k
  • Output Dimension: 2048
  • Pooling Strategy: mean

Omics Embedding Method: GS10K

Gene Set enrichment-based embeddings (10k genes)

Usage Tutorial

📓 Tutorial Notebook: usage_tutorial.ipynb - Detailed usage examples and best practices

Model Architecture

The MMContextEncoder combines:

  • Text Branch: NeuML/pubmedbert-base-embeddings with optional adapter layers
  • Omics Branch: Lookup-based encoder with precomputed gs10k embeddings
  • Adapters: Feed-forward projection layers for dimensionality alignment
  • Pooling: mean pooling for sentence-level embeddings
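To make the pooling step concrete, here is a minimal pure-Python sketch of mean pooling over token embeddings. The function name `mean_pool` and the toy vectors are illustrative assumptions, not code from this repository; the actual model delegates this step to the Sentence Transformers Pooling module.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding sequence
    return [s / count for s in summed]


# Toy example: two real tokens followed by one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

The padding vector does not influence the result because its mask entry is 0, which is the point of masking before averaging.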

Files in this Repository

  • mmcontextencoder.py: Main model implementation
  • adapters.py: Adapter modules for dimensionality mapping
  • omicsencoder.py: Omics data encoder
  • onehot.py: One-hot text encoder
  • file_utils.py: Utility functions
  • usage_tutorial.ipynb: Tutorial notebook with usage examples

Training Details

  • Text Encoder: NeuML/pubmedbert-base-embeddings
  • Embedding Method: gs10k
  • Output Dimension: 2048
  • Training Datasets: 2 datasets
  • Text-only Datasets: 0 (None)
  • Numeric Datasets: 2 (cellxgene_pseudo_bulk_full, geo_half)
  • Batch Size: 512
  • Learning Rate: 2e-05
  • Training Epochs: 16

This model was trained using the MMContext framework for multimodal single-cell analysis.


SentenceTransformer based on NeuML/pubmedbert-base-embeddings

This is a sentence-transformers model finetuned from NeuML/pubmedbert-base-embeddings on the cellxgene_pseudo_bulk_full_cell_sentence_1_caption and geo_half_cell_sentence_1_caption datasets. It maps sentences & paragraphs to a 2048-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Full Model Architecture

SentenceTransformer(
  (0): MMContextEncoder(
    (text_encoder): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(30522, 768, padding_idx=0)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSdpaSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
    (text_adapter): AdapterModule(
      (net): Sequential(
        (0): Linear(in_features=768, out_features=1024, bias=True)
        (1): ReLU(inplace=True)
        (2): Linear(in_features=1024, out_features=2048, bias=True)
        (3): BatchNorm1d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (pooling): Pooling({'word_embedding_dimension': 2048, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
    (omics_adapter): AdapterModule(
      (net): Sequential(
        (0): Linear(in_features=10000, out_features=1024, bias=True)
        (1): ReLU(inplace=True)
        (2): Linear(in_features=1024, out_features=2048, bias=True)
        (3): BatchNorm1d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (omics_encoder): MiniOmicsModel(
      (embeddings): Embedding(726794, 10000, padding_idx=0)
    )
  )
)
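As a rough check on the adapter sizes printed above, the following sketch counts the learnable parameters of each AdapterModule from the layer shapes alone. The helper names are hypothetical, and the count assumes standard Linear and BatchNorm1d semantics (weights plus biases; BatchNorm running statistics are buffers and are not counted).

```python
def linear_params(in_features, out_features, bias=True):
    """Weights plus optional bias of a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


def batchnorm_params(num_features, affine=True):
    """Learnable scale and shift; running mean/var are buffers, not parameters."""
    return 2 * num_features if affine else 0


def adapter_params(in_dim, hidden_dim=1024, out_dim=2048):
    """Linear -> ReLU -> Linear -> BatchNorm1d, as in the printed architecture."""
    return (
        linear_params(in_dim, hidden_dim)
        + linear_params(hidden_dim, out_dim)
        + batchnorm_params(out_dim)
    )


text_adapter = adapter_params(768)      # text branch: 768 -> 1024 -> 2048
omics_adapter = adapter_params(10000)   # omics branch: 10000 -> 1024 -> 2048
print(text_adapter, omics_adapter)  # 2890752 12344320
```

Both adapters end in the same 2048-dimensional space, which is what allows text and omics embeddings to be compared directly.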

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

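# Download from the 🤗 Hub (replace the placeholder with this repository's model ID)
# trust_remote_code=True is needed for the custom MMContextEncoder
model = SentenceTransformer("sentence_transformers_model_id", trust_remote_code=True)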
# Run inference
sentences = [
    'sample_idx:census_367b55f4-d543-49aa-90e8-4765fcb8c687_969',
    "This measurement was conducted with 10x 3' v3. Neuron cell type from the thalamic complex, specifically the centromedian and parafasicular nuclei (CM and Pf), derived from a 42-year old male.",
    "This measurement was conducted with 10x 3' v3. Neuron cell type from a 42-year-old male, specifically from the thalamic complex with thalamic excitatory supercluster term, corresponding to the Thalamus (THM) - intralaminar nuclear complex (ILN) - posterior group of intralaminar nuclei (PILN) - centromedian and parafasicular nuclei - CM and Pf dissection.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 2048)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5200, 0.4699],
#         [0.5200, 1.0000, 0.8290],
#         [0.4699, 0.8290, 1.0000]])
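The input list above mixes two kinds of strings: omics sample references that start with `sample_idx:` and free-text captions. When inspecting similarity results it can help to separate the two. The helper below is a hypothetical convenience for that purpose, assuming the prefix convention seen in the samples; it is not part of the model's API and says nothing about how the encoder routes inputs internally.

```python
SAMPLE_PREFIX = "sample_idx:"  # prefix convention observed in the example inputs


def split_inputs(inputs):
    """Separate sample references from plain-text captions."""
    sample_ids, captions = [], []
    for item in inputs:
        if item.startswith(SAMPLE_PREFIX):
            sample_ids.append(item[len(SAMPLE_PREFIX):])
        else:
            captions.append(item)
    return sample_ids, captions


inputs = [
    "sample_idx:SRX173216",
    "This measurement was conducted with Illumina HiSeq 2000. "
    "B-cells from individual GM12004, assayed using global run-on technique.",
]
sample_ids, captions = split_inputs(inputs)
print(sample_ids)     # ['SRX173216']
print(len(captions))  # 1
```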

Evaluation

Metrics

Triplet

  • Datasets: cellxgene_pseudo_bulk_full_cell_sentence_1_caption and geo_half_cell_sentence_1_caption
  • Evaluated with TripletEvaluator
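| Metric          | cellxgene_pseudo_bulk_full_cell_sentence_1_caption | geo_half_cell_sentence_1_caption |
|-----------------|----------------------------------------------------|----------------------------------|
| cosine_accuracy | 0.9267                                             | 0.7779                           |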
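For readers unfamiliar with the metric, cosine accuracy on triplets is the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative. The sketch below illustrates that definition on toy 2-D vectors; the function names and data are illustrative assumptions, and how the evaluator combined the negative_1 and negative_2 columns is not stated in this card.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def triplet_cosine_accuracy(triplets):
    """Share of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    hits = sum(
        1
        for anchor, positive, negative in triplets
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative)
    )
    return hits / len(triplets)


toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),    # positive is closer: hit
    ([1.0, 0.0], [0.0, 1.0], [1.0, 0.1]),    # negative is closer: miss
    ([0.0, 1.0], [0.1, 1.0], [1.0, 0.0]),    # hit
    ([1.0, 1.0], [1.0, 0.9], [-1.0, 0.0]),   # hit
]
accuracy = triplet_cosine_accuracy(toy_triplets)
print(accuracy)  # 0.75
```

Read this way, the reported 0.9267 means that roughly 93 of every 100 evaluation triplets from the cellxgene dataset were ranked correctly.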

Training Details

Training Datasets

cellxgene_pseudo_bulk_full_cell_sentence_1_caption

  • Dataset: cellxgene_pseudo_bulk_full_cell_sentence_1_caption at 55717c1
  • Size: 306,003 training samples
  • Columns: anchor, positive, negative_1, and negative_2
  • Approximate statistics based on the first 1000 samples:
    • anchor (string): min 56 characters, mean 58.69 characters, max 60 characters
    • positive (string): min 22 tokens, mean 47.33 tokens, max 165 tokens
    • negative_1 (string): min 22 tokens, mean 49.45 tokens, max 120 tokens
    • negative_2 (string): min 56 characters, mean 58.63 characters, max 59 characters
  • Samples:
    • Sample 1
      • anchor: sample_idx:census_9d5df009-eb76-43a3-b6cd-22017cc53700_231
      • positive: This measurement was conducted with 10x 3' v3. Gut endothelial cell derived from proximal colon of a male human fetus at 13th week post-fertilization stage.
      • negative_1: This measurement was conducted with 10x 3' v3. Mesothelial cell derived from the proximal colon of a male human at 23rd week post-fertilization stage.
      • negative_2: sample_idx:census_9d5df009-eb76-43a3-b6cd-22017cc53700_521
    • Sample 2
      • anchor: sample_idx:census_367b55f4-d543-49aa-90e8-4765fcb8c687_132
      • positive: This measurement was conducted with 10x 3' v3. Sample is an oligodendrocyte cell from a 29-year-old male human, specifically from the thalamic complex, with European self-reported ethnicity.
      • negative_1: This measurement was conducted with 10x 3' v3. Neuron cell type from the thalamic complex, specifically the centromedian and parafasicular nuclei (CM and Pf), derived from a 42-year old male human donor.
      • negative_2: sample_idx:census_367b55f4-d543-49aa-90e8-4765fcb8c687_134
    • Sample 3
      • anchor: sample_idx:census_1e6a6ef9-7ec9-4c90-bbfb-2ad3c3165fd1_9964
      • positive: This measurement was conducted with Smart-seq2. Neutrophil cell type derived from the lung tissue of a 37-year old male with advanced stage non-small cell lung cancer (NSCLC), stage IV, who has never smoked. The cells exhibit an ALK mutation, with no mutations detected in BRAF, EGFR, ERBB2, KRAS, ROS, or TP53.
      • negative_1: This measurement was conducted with 10x 3' v2. Myeloid cell derived from the lung tissue of a 65-year old male, located in normal adjacent tissue, with advanced non-small cell lung cancer (NSCLC), stage III.
      • negative_2: sample_idx:census_1e6a6ef9-7ec9-4c90-bbfb-2ad3c3165fd1_972
  • Loss: mmcontext.utils.PerDatasetLossLogger

geo_half_cell_sentence_1_caption

  • Dataset: geo_half_cell_sentence_1_caption at bc13ae5
  • Size: 348,046 training samples
  • Columns: anchor, positive, negative_1, and negative_2
  • Approximate statistics based on the first 1000 samples:
    • anchor (string): min 20 characters, mean 20.0 characters, max 20 characters
    • positive (string): min 17 tokens, mean 36.82 tokens, max 130 tokens
    • negative_1 (string): min 19 tokens, mean 34.25 tokens, max 88 tokens
    • negative_2 (string): min 20 characters, mean 20.0 characters, max 20 characters
  • Samples:
    • Sample 1
      • anchor: sample_idx:SRX173216
      • positive: This measurement was conducted with Illumina HiSeq 2000. B-cells from individual GM12004, assayed using global run-on technique. These are primary cells, with no reported treatment.
      • negative_1: This measurement was conducted with Illumina HiSeq 2000. 48 hour Activin treatment of H1 embryonic stem cells.
      • negative_2: sample_idx:SRX189728
    • Sample 2
      • anchor: sample_idx:SRX185041
      • positive: This measurement was conducted with Illumina HiSeq 2000. 1000 ng of fragmented total RNA from a cultured chronic myelogenous leukemia (CML) cell line, specifically the human CML cell line K-562. This cell line is derived from a female hematological system disease, specifically a lymphoid neoplasm (leukemia) known as C.M.L., which is a type of neoplasm affecting the bone marrow. The sample has not undergone any treatment.
      • negative_1: This measurement was conducted with Illumina HiSeq 2000. 1000 ng of fragmented total RNA from a cultured female human Chronic Myelogenous Leukemia (CML) cell line, K-562, which was grown in tissue culture. The sample has not received any treatment.
      • negative_2: sample_idx:SRX185051
    • Sample 3
      • anchor: sample_idx:SRX185046
      • positive: This measurement was conducted with Illumina HiSeq 2000. 1000 ng of fragmented total RNA from a cultured female human Chronic Myelogenous Leukemia (CML) cell line, K-562, which was grown in tissue culture. The sample has not received any treatment.
      • negative_1: This measurement was conducted with Illumina HiSeq 2000. 1000 ng of fragmented total RNA from a cultured chronic myelogenous leukemia (CML) cell line, specifically the human CML cell line K-562. This cell line is derived from a female hematological system disease, specifically a lymphoid neoplasm (leukemia) known as C.M.L., which is a type of neoplasm affecting the bone marrow. The sample has not undergone any treatment.
      • negative_2: sample_idx:SRX185051
  • Loss: mmcontext.utils.PerDatasetLossLogger

Evaluation Datasets

cellxgene_pseudo_bulk_full_cell_sentence_1_caption

  • Dataset: cellxgene_pseudo_bulk_full_cell_sentence_1_caption at 55717c1
  • Size: 33,937 evaluation samples
  • Columns: anchor, positive, negative_1, and negative_2
  • Approximate statistics based on the first 1000 samples:
    • anchor (string): min 56 characters, mean 58.7 characters, max 60 characters
    • positive (string): min 21 tokens, mean 47.32 tokens, max 147 tokens
    • negative_1 (string): min 21 tokens, mean 44.03 tokens, max 88 tokens
    • negative_2 (string): min 56 characters, mean 58.76 characters, max 60 characters
  • Samples:
    • Sample 1
      • anchor: sample_idx:census_7db0c178-b0a4-442f-ba54-e9e1633a84bb_763
      • positive: This measurement was conducted with 10x 3' v3. Oligodendrocyte cell sample taken from the cerebral cortex (Cx) of a 42-year-old male, specifically from the human A43 region.
      • negative_1: This measurement was conducted with 10x 3' v3. Neuron cell type from a 50-year old male, specifically an MGE interneuron, located in the cerebral cortex, parietal operculum, gustatory cortex, A43 region.
      • negative_2: sample_idx:census_7db0c178-b0a4-442f-ba54-e9e1633a84bb_541
    • Sample 2
      • anchor: sample_idx:census_1e6a6ef9-7ec9-4c90-bbfb-2ad3c3165fd1_18907
      • positive: This measurement was conducted with 10x 3' v2. Endothelial cell, specifically a vein endothelial cell, derived from normal adjacent lung tissue of a 71-year-old female patient with early stage NSCLC (stage I) who has a history of smoking.
      • negative_1: This measurement was conducted with 10x 3' v2. Endothelial cell derived from the lymphatic vessel of a 69-year-old male with early stage non-small cell lung cancer (NSCLC), stage II. The patient has a history of smoking and the cell was obtained from the primary tumor site.
      • negative_2: sample_idx:census_1e6a6ef9-7ec9-4c90-bbfb-2ad3c3165fd1_16402
    • Sample 3
      • anchor: sample_idx:census_fd072bc3-2dfb-46f8-b4e3-467cb3223182_3695
      • positive: This measurement was conducted with 10x 3' v2. Endothelial cells collected from the spleen of a male human fetus at 15 weeks post-fertilization.
      • negative_1: This measurement was conducted with 10x 5' v1. A native cell from the skin of a female human fetus at 11 weeks post-fertilization, identified as a doublet of endothelial and erythrocyte lineage.
      • negative_2: sample_idx:census_fd072bc3-2dfb-46f8-b4e3-467cb3223182_6661
  • Loss: mmcontext.utils.PerDatasetLossLogger

geo_half_cell_sentence_1_caption

  • Dataset: geo_half_cell_sentence_1_caption at bc13ae5
  • Size: 38,807 evaluation samples
  • Columns: anchor, positive, negative_1, and negative_2
  • Approximate statistics based on the first 1000 samples:
    • anchor (string): min 20 characters, mean 20.19 characters, max 21 characters
    • positive (string): min 16 tokens, mean 37.63 tokens, max 119 tokens
    • negative_1 (string): min 16 tokens, mean 54.71 tokens, max 111 tokens
    • negative_2 (string): min 20 characters, mean 20.04 characters, max 21 characters
  • Samples:
    • Sample 1
      • anchor: sample_idx:SRX185061
      • positive: This measurement was conducted with Illumina HiSeq 2000. 1000 ng of fragmented total RNA from a cultured chronic myelogenous leukemia (CML) cell line (K-562) derived from a female hematological system disease (CML). The cells were grown in tissue culture and have undergone Ribo-Zero treatment.
      • negative_1: This measurement was conducted with Illumina HiSeq 1000. The sample is a cell line (OCI-LY1) derived from a diffuse large B-cell lymphoma (DLBCL), a type of non-Hodgkin lymphoma that affects the lymphatic system. The cells have been treated with siNT (a non-coding siRNA) for 48 hours.
      • negative_2: sample_idx:SRX188847
    • Sample 2
      • anchor: sample_idx:SRX185895
      • positive: This measurement was conducted with Illumina HiSeq 1000. The sample is a cell line (OCI-LY1) derived from a diffuse large B-cell lymphoma (DLBCL), a type of non-Hodgkin lymphoma that affects the lymphatic system. The cells have been treated with siNT (a non-coding siRNA) for 48 hours.
      • negative_1: This measurement was conducted with Illumina HiSeq 2000. 1000 ng of fragmented total RNA from a cultured chronic myelogenous leukemia (CML) cell line (K-562) derived from a female hematological system disease (CML). The cells were grown in tissue culture and have undergone Ribo-Zero treatment.
      • negative_2: sample_idx:SRX188847
    • Sample 3
      • anchor: sample_idx:SRX188847
      • positive: This measurement was conducted with Illumina HiSeq 2000. 48-hour Activin-treated H1 embryonic stem cells.
      • negative_1: This measurement was conducted with Illumina HiSeq 2000. 1000 ng of fragmented total RNA from a cultured chronic myelogenous leukemia (CML) cell line (K-562) derived from a female hematological system disease (CML). The cells were grown in tissue culture and have undergone Ribo-Zero treatment.
      • negative_2: sample_idx:SRX185895
  • Loss: mmcontext.utils.PerDatasetLossLogger

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • learning_rate: 2e-05
  • num_train_epochs: 16
  • warmup_ratio: 0.1
  • bf16: True
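The learning rate, warmup ratio, and the linear scheduler listed under All Hyperparameters together define how the step size evolves during training. The sketch below reproduces the usual linear-warmup, linear-decay shape in plain Python. The total step count of 10,224 is an assumption inferred from the training log (step 100 at epoch 0.1565 suggests about 639 steps per epoch over 16 epochs), and the function name is hypothetical.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 10_224  # assumption: ~639 steps/epoch * 16 epochs, estimated from the log

for step in (0, 500, 1023, 5000, TOTAL_STEPS):
    print(step, f"{linear_schedule_lr(step, TOTAL_STEPS):.2e}")
```

Under these assumptions the peak rate of 2e-05 is reached after about 1,023 steps, a little over one and a half epochs into the run.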

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 16
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Columns, in order (rows are space-separated): Epoch | Step | Training Loss | cellxgene_pseudo_bulk_full_cell_sentence_1_caption loss | geo_half_cell_sentence_1_caption loss | cellxgene_pseudo_bulk_full_cell_sentence_1_caption_cosine_accuracy | geo_half_cell_sentence_1_caption_cosine_accuracy
0.1565 100 12.5906 11.7384 11.5454 0.5396 0.5698
0.3130 200 10.2096 9.0530 9.5828 0.6243 0.6534
0.4695 300 7.6353 6.9997 10.4301 0.7196 0.6659
0.6260 400 5.8509 6.0588 11.4104 0.7559 0.6726
0.7825 500 5.0719 5.9171 9.5449 0.7819 0.6830
0.9390 600 4.5624 5.8857 8.4883 0.8028 0.6891
1.0955 700 4.0923 6.2152 8.6582 0.8196 0.6938
1.2520 800 3.8628 6.0706 8.5598 0.8284 0.6975
1.4085 900 3.628 5.3656 8.8950 0.8399 0.6946
1.5649 1000 3.4405 5.4802 8.5998 0.8481 0.7067
1.7214 1100 3.2963 5.1315 10.0197 0.8540 0.7054
1.8779 1200 3.1722 5.2284 8.7489 0.8592 0.7137
2.0344 1300 3.0163 5.0619 9.2792 0.8658 0.7159
2.1909 1400 2.8775 5.1384 9.0896 0.8646 0.7224
2.3474 1500 2.7711 5.1064 9.1905 0.8721 0.7253
2.5039 1600 2.7079 4.8112 10.1273 0.8813 0.7219
2.6604 1700 2.6186 4.7861 9.4985 0.8853 0.7281
2.8169 1800 2.5678 4.7777 10.2713 0.8852 0.7266
2.9734 1900 2.5017 4.7162 9.4962 0.8851 0.7316
3.1299 2000 2.3918 5.1905 9.3873 0.8901 0.7330
3.2864 2100 2.3317 4.6239 10.0186 0.8941 0.7358
3.4429 2200 2.2891 4.6311 10.1845 0.8981 0.7357
3.5994 2300 2.2377 4.6785 9.6080 0.8948 0.7391
3.7559 2400 2.195 4.5995 9.9979 0.8983 0.7379
3.9124 2500 2.1639 4.8047 9.5234 0.9012 0.7401
4.0689 2600 2.1021 4.5940 10.8647 0.9017 0.7383
4.2254 2700 2.0352 4.8231 11.8600 0.9032 0.7369
4.3818 2800 2.0181 4.5308 10.6125 0.9016 0.7451
4.5383 2900 1.9876 4.6364 10.0304 0.9030 0.7434
4.6948 3000 1.9577 4.4448 10.7185 0.9058 0.7436
4.8513 3100 1.9296 4.3481 9.8166 0.9094 0.7472
5.0078 3200 1.9033 4.3892 10.2061 0.9101 0.7497
5.1643 3300 1.8417 4.6142 11.5901 0.9115 0.7423
5.3208 3400 1.8285 4.4177 10.9317 0.9125 0.7457
5.4773 3500 1.8018 4.8212 10.3529 0.9120 0.7492
5.6338 3600 1.7885 4.3228 10.3339 0.9115 0.7505
5.7903 3700 1.7715 4.5077 11.2075 0.9127 0.7502
5.9468 3800 1.7487 4.5082 11.5899 0.9144 0.7488
6.1033 3900 1.7096 4.4220 11.2094 0.9138 0.7528
6.2598 4000 1.6839 4.4874 11.3842 0.9147 0.7531
6.4163 4100 1.6681 4.5076 10.1323 0.9151 0.7552
6.5728 4200 1.6587 4.3866 10.9080 0.9176 0.7562
6.7293 4300 1.6442 4.5102 10.4440 0.9164 0.7567
6.8858 4400 1.633 4.3827 10.3677 0.9165 0.7566
7.0423 4500 1.6106 4.3742 10.4875 0.9168 0.7580
7.1987 4600 1.5724 4.3776 11.0099 0.9189 0.7599
7.3552 4700 1.569 4.4713 11.4858 0.9180 0.7596
7.5117 4800 1.5568 5.0983 12.8797 0.9180 0.7552
7.6682 4900 1.5504 4.5575 11.9515 0.9207 0.7568
7.8247 5000 1.5457 4.4049 11.0920 0.9176 0.7613
7.9812 5100 1.5289 4.3517 10.3765 0.9213 0.7654
8.1377 5200 1.4916 4.7450 10.8892 0.9190 0.7654
8.2942 5300 1.4827 4.3461 10.9710 0.9216 0.7649
8.4507 5400 1.4764 4.4988 11.6965 0.9214 0.7625
8.6072 5500 1.4695 4.7644 11.0618 0.9209 0.7660
8.7637 5600 1.4677 4.2997 10.9774 0.9214 0.7679
8.9202 5700 1.4601 4.5204 11.7339 0.9215 0.7635
9.0767 5800 1.4438 4.3946 10.8926 0.9216 0.7673
9.2332 5900 1.4122 4.4707 10.6763 0.9210 0.7685
9.3897 6000 1.4144 4.3983 10.6737 0.9219 0.7697
9.5462 6100 1.4151 4.5337 11.6132 0.9237 0.7665
9.7027 6200 1.4105 4.2718 10.9075 0.9240 0.7672
9.8592 6300 1.3998 4.2386 10.5633 0.9222 0.7709
10.0156 6400 1.3962 4.4400 11.2977 0.9221 0.7697
10.1721 6500 1.364 4.4761 10.8477 0.9238 0.7724
10.3286 6600 1.3633 4.3151 10.7404 0.9234 0.7757
10.4851 6700 1.363 4.4643 11.0058 0.9241 0.7738
10.6416 6800 1.3501 4.3482 11.3079 0.9241 0.7723
10.7981 6900 1.3587 5.5119 13.5821 0.9244 0.7640
10.9546 7000 1.3479 4.8425 11.0586 0.9239 0.7721
11.1111 7100 1.3279 4.5528 11.7557 0.9247 0.7712
11.2676 7200 1.3199 4.5110 10.8992 0.9246 0.7747
11.4241 7300 1.3248 5.6350 13.8206 0.9253 0.7626
11.5806 7400 1.3126 4.6457 12.2081 0.9259 0.7708
11.7371 7500 1.3021 4.2967 10.8623 0.9239 0.7752
11.8936 7600 1.3171 4.6930 12.1421 0.9258 0.7699
12.0501 7700 1.2997 4.4855 11.8060 0.9253 0.7713
12.2066 7800 1.2898 4.8945 11.2709 0.9251 0.7724
12.3631 7900 1.2815 4.2959 10.8704 0.9251 0.7745
12.5196 8000 1.2806 4.7454 11.4730 0.9262 0.7752
12.6761 8100 1.2775 5.5256 13.7529 0.9258 0.7672
12.8326 8200 1.279 4.4437 11.5125 0.9260 0.7758
12.9890 8300 1.2757 4.6046 11.4459 0.9260 0.7752
13.1455 8400 1.2596 5.1542 13.1028 0.9262 0.7710
13.3020 8500 1.255 4.5680 11.7233 0.9261 0.7750
13.4585 8600 1.2577 4.5955 11.9271 0.9262 0.7742
13.6150 8700 1.2542 5.6417 13.8272 0.9264 0.7679
13.7715 8800 1.2475 4.5411 11.3280 0.9260 0.7774
13.9280 8900 1.2568 4.5738 11.4883 0.9261 0.7768
14.0845 9000 1.2506 4.3740 11.1937 0.9268 0.7770
14.2410 9100 1.2463 4.6670 12.0231 0.9265 0.7743
14.3975 9200 1.2444 4.3910 11.2980 0.9271 0.7761
14.5540 9300 1.2361 4.8996 12.6156 0.9272 0.7732
14.7105 9400 1.2377 4.5226 11.4314 0.9263 0.7767
14.8670 9500 1.2315 4.4239 11.0899 0.9266 0.7785
15.0235 9600 1.2244 4.5566 11.7982 0.9266 0.7746
15.1800 9700 1.2233 4.4733 11.7135 0.9274 0.7765
15.3365 9800 1.2229 4.5923 12.0170 0.9272 0.7748
15.4930 9900 1.2317 4.5229 11.5002 0.9270 0.7768
15.6495 10000 1.2208 4.4253 11.3559 0.9270 0.7772
15.8059 10100 1.2275 4.5132 11.3865 0.9268 0.7767
15.9624 10200 1.2302 4.4903 11.1328 0.9267 0.7779

Framework Versions

  • Python: 3.12.8
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.2
  • PyTorch: 2.9.1+cu128
  • Accelerate: 1.12.0
  • Datasets: 3.6.0
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}