metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:80
  - loss:CoSENTLoss
base_model: abdeljalilELmajjodi/model
widget:
  - source_sentence: Two blond women are hugging one another.
    sentences:
      - Some women are hugging on vacation.
      - The women are sleeping.
      - >-
        A blond man wearing a brown shirt is reading a book on a bench in the
        park
  - source_sentence: >-
      A few people in a restaurant setting, one of them is drinking orange
      juice.
    sentences:
      - An actress and her favorite assistant talk a walk in the city.
      - The adults are both male and female.
      - The diners are at a restaurant.
  - source_sentence: >-
      Two adults, one female in white, with shades and one male, gray clothes,
      walking across a street, away from a eatery with a blurred image of a dark
      colored red shirted person in the foreground.
    sentences:
      - >-
        Two adults walk across the street to get away from a red shirted person
        who is chasing them.
      - >-
        The friends have just met for the first time in 20 years, and have had a
        great time catching up.
      - An elderly man sits in a small shop.
  - source_sentence: People waiting to get on a train or just getting off.
    sentences:
      - There are people just getting on a train
      - A man and a woman walk down a crowded city street.
      - Two people walk home after a tasty steak dinner.
  - source_sentence: >-
      The school is having a special event in order to show the american culture
      on how other cultures are dealt with in parties.
    sentences:
      - A married couple is walking next to each other.
      - A man and a soman are eating together at John's Pizza and Gyro.
      - A school is hosting an event.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - pearson_cosine
  - spearman_cosine
model-index:
  - name: SentenceTransformer based on abdeljalilELmajjodi/model
    results:
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: pair score evaluator dev
          type: pair-score-evaluator-dev
        metrics:
          - type: pearson_cosine
            value: -0.4586662223420623
            name: Pearson Cosine
          - type: spearman_cosine
            value: -0.5206512212946292
            name: Spearman Cosine

SentenceTransformer based on abdeljalilELmajjodi/model

This is a sentence-transformers model fine-tuned from abdeljalilELmajjodi/model on an 80-pair sample of the all-nli dataset. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for retrieval.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: abdeljalilELmajjodi/model
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Supported Modality: Text
  • Training Dataset:
    • all-nli

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'XLMRobertaModel'})
  (1): Pooling({'embedding_dimension': 1024, 'pooling_mode': 'mean', 'include_prompt': True})
)
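
The architecture above ends in a Pooling module with `pooling_mode` set to `mean`, which turns per-token vectors into a single sentence vector. The following is a minimal pure-Python sketch of that idea, not the library's implementation: the function name `mean_pool`, the 4-dimensional toy vectors, and the mask are all illustrative assumptions (the real model produces 1024-dimensional vectors).

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, skipping padding."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for k in range(dim):
                total[k] += vector[k]
    if count == 0:
        # No real tokens: return the zero vector instead of dividing by zero.
        return total
    return [value / count for value in total]


# Toy input (hypothetical): three token vectors, the last one is padding.
tokens = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)
```

Because the mask excludes the padding vector, only the first two tokens contribute to the average.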

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'The school is having a special event in order to show the american culture on how other cultures are dealt with in parties.',
    'A school is hosting an event.',
    'A married couple is walking next to each other.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9874, 0.9852],
#         [0.9874, 1.0000, 0.9922],
#         [0.9852, 0.9922, 1.0000]])
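
The similarity matrix above is built from cosine similarity, the similarity function listed for this model. As a rough illustration of what each entry means, here is a small pure-Python sketch; the helper names and the 2-dimensional toy vectors are assumptions made for the example and are unrelated to the model's actual embeddings.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors):
    """Pairwise cosine similarities, analogous in shape to the output above."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy vectors (hypothetical): the first and third are orthogonal.
toy = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
matrix = similarity_matrix(toy)
for row in matrix:
    print([round(value, 4) for value in row])
```

The diagonal is always 1 up to rounding because every vector is compared with itself, which matches the 1.0000 entries in the printed tensor.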

Evaluation

Metrics

Semantic Similarity

| Metric          | Value   |
|-----------------|---------|
| pearson_cosine  | -0.4587 |
| spearman_cosine | -0.5207 |
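
Both metrics correlate the cosine similarity predicted for each sentence pair with its gold score. A negative value means pairs with higher gold scores tended to receive lower cosine similarities. The sketch below shows how such correlations can be computed in plain Python. It is an illustration only, not the evaluator's code, and the `gold_scores` and `predicted` values are invented for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: the predictions rank the pairs in roughly reverse order.
gold_scores = [1.0, 0.5, 0.0, 0.5]
predicted = [0.2, 0.6, 0.9, 0.7]
print(pearson(gold_scores, predicted), spearman(gold_scores, predicted))
```

With these made-up inputs both correlations come out negative, the same sign as the values reported for this model.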

Training Details

Training Dataset

all-nli

  • Dataset: all-nli
  • Size: 80 training samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 80 samples:
    |         | sentence1 | sentence2 | score |
    |---------|-----------|-----------|-------|
    | type    | string    | string    | float |
    | details | min: 10 tokens, mean: 24.96 tokens, max: 52 tokens | min: 5 tokens, mean: 11.64 tokens, max: 29 tokens | min: 0.0, mean: 0.5, max: 1.0 |
  • Samples:
    | sentence1 | sentence2 | score |
    |-----------|-----------|-------|
    | A woman is walking across the street eating a banana, while a man is following with his briefcase. | a woman eating a banana crosses a street | 1.0 |
    | A Little League team tries to catch a runner sliding into a base in an afternoon game. | A team is playing baseball on Saturn. | 0.0 |
    | Woman in white in foreground and a man slightly behind walking with a sign for John's Pizza and Gyro in the background. | The woman is waiting for a friend. | 0.5 |
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
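
CoSENTLoss is a ranking loss over pairs of sentence pairs: whenever one pair has a higher gold score than another, the loss penalises the model if its cosine similarity is not also higher. The sketch below is a simplified pure-Python reading of that formulation with the `scale` of 20.0 listed above. It is not the library's implementation, and the function name and toy inputs are assumptions for illustration.

```python
import math


def cosent_loss(cosines, labels, scale=20.0):
    """log(1 + sum(exp(scale * (cos_j - cos_i)))) over all i, j with labels[i] > labels[j]."""
    terms = []
    for i in range(len(labels)):
        for j in range(len(labels)):
            if labels[i] > labels[j]:
                # Pair i should be more similar than pair j, so a large
                # cos_j - cos_i contributes a large penalty.
                terms.append(scale * (cosines[j] - cosines[i]))
    # Stable log-sum-exp; the leading 0.0 stands for the "1 +" inside the log.
    values = [0.0] + terms
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


# Hypothetical batch of three sentence pairs with gold scores 1.0, 0.5, 0.0.
labels = [1.0, 0.5, 0.0]
well_ordered = cosent_loss([0.9, 0.5, 0.1], labels)
inverted = cosent_loss([0.1, 0.5, 0.9], labels)
print(well_ordered, inverted)
```

When the cosine similarities follow the gold ordering the loss is close to zero, and when they are inverted it grows roughly in proportion to `scale` times the largest violation.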
    

Evaluation Dataset

all-nli

  • Dataset: all-nli
  • Size: 20 evaluation samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 20 samples:
    |         | sentence1 | sentence2 | score |
    |---------|-----------|-----------|-------|
    | type    | string    | string    | float |
    | details | min: 10 tokens, mean: 28.8 tokens, max: 52 tokens | min: 7 tokens, mean: 13.35 tokens, max: 25 tokens | min: 0.0, mean: 0.53, max: 1.0 |
  • Samples:
    | sentence1 | sentence2 | score |
    |-----------|-----------|-------|
    | A woman wearing all white and eating, walks next to a man holding a briefcase. | A married couple is walking next to each other. | 0.5 |
    | A woman in a green jacket and hood over her head looking towards a valley. | The woman is wearing green. | 1.0 |
    | Two adults, one female in white, with shades and one male, gray clothes, walking across a street, away from a eatery with a blurred image of a dark colored red shirted person in the foreground. | Two adults walking across a road near the convicted prisoner dressed in red | 0.5 |
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • num_train_epochs: 1
  • warmup_steps: 0.05
  • bf16: True
  • fp16_full_eval: True
  • load_best_model_at_end: True
  • push_to_hub: True
  • gradient_checkpointing: True

All Hyperparameters

  • do_predict: False
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_ratio: None
  • warmup_steps: 0.05
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • enable_jit_checkpoint: False
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • use_cpu: False
  • seed: 42
  • data_seed: None
  • bf16: True
  • fp16: False
  • bf16_full_eval: False
  • fp16_full_eval: True
  • tf32: None
  • local_rank: -1
  • ddp_backend: None
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: True
  • gradient_checkpointing_kwargs: None
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • auto_find_batch_size: False
  • full_determinism: False
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • use_cache: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

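| Epoch   | Step   | Training Loss | Validation Loss | pair-score-evaluator-dev_spearman_cosine |
|---------|--------|---------------|-----------------|------------------------------------------|
| 0.1     | 1      | 3.0301        | -               | -                                        |
| 0.5     | 5      | 3.0538        | -               | -                                        |
| **1.0** | **10** | **3.0176**    | **2.468**       | **-0.5207**                              |

  • The bold row denotes the saved checkpoint.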
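
The step and epoch columns follow from the dataset size and batch size reported earlier: 80 training samples at 8 per batch give 10 optimizer steps in the single epoch. The short calculation below reproduces that arithmetic. It assumes a single training device, which the card does not state, and uses the listed `gradient_accumulation_steps` of 1 and `dataloader_drop_last` of False.

```python
import math

# Values taken from the card; a single device is an assumption.
train_samples = 80
per_device_train_batch_size = 8
gradient_accumulation_steps = 1
num_train_epochs = 1

# dataloader_drop_last is False, so a partial final batch would still count.
batches_per_epoch = math.ceil(train_samples / per_device_train_batch_size)
steps_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
total_steps = steps_per_epoch * num_train_epochs

# Epoch fraction reached at each logged step.
logged = {step: step / steps_per_epoch for step in (1, 5, 10)}
print(total_steps, logged)
```

Under these assumptions the logged steps 1, 5, and 10 correspond to epochs 0.1, 0.5, and 1.0, consistent with the table.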

Training Time

  • Training: 3.0 minutes

Framework Versions

  • Python: 3.12.13
  • Sentence Transformers: 5.4.1
  • Transformers: 5.0.0
  • PyTorch: 2.10.0+cu128
  • Accelerate: 1.13.0
  • Datasets: 4.8.5
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CoSENTLoss

@article{10531646,
    author={Huang, Xiang and Peng, Hao and Zou, Dongcheng and Liu, Zhiwei and Li, Jianxin and Liu, Kay and Wu, Jia and Su, Jianlin and Yu, Philip S.},
    journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
    title={CoSENT: Consistent Sentence Embedding via Similarity Ranking},
    year={2024},
    doi={10.1109/TASLP.2024.3402087}
}