tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:849
  - loss:MultipleNegativesRankingLoss
base_model: sentence-transformers/all-MiniLM-L12-v2
widget:
  - source_sentence: >-
      Graphic designer who specializes in creating visual content for brands,
      including logos, marketing materials, and user interfaces. Focuses on
      aesthetics, user experience, and brand identity.
    sentences:
      - >-
        user_1: I'm looking to refresh my company's brand image but don't know
        where to start.

        user_2: You should consult a brand manager.
      - |-
        user_1: I need help designing a logo for my new business.
        user_2: Have you thought about hiring a graphic designer?
        user_1: Yes, I want something that really represents my brand.
      - |-
        user_1: My car's making a weird noise, and I don't know what to do.
        user_2: You should take it to a mechanic.
  - source_sentence: >-
      Nutritionist who specializes in dietary planning and nutritional
      counseling. Helps clients achieve their health goals through personalized
      meal plans and education.
    sentences:
      - |-
        user_1: I'm trying to lose weight but I don't know what to eat.
        user_2: Have you considered talking to a nutritionist?
      - |-
        user_1: Our database is running slow, and I don't know why.
        user_2: Have you checked the indexing?
      - |-
        user_1: I need help fixing my car's engine; it's making a weird noise.
        user_2: Have you checked the oil level?
  - source_sentence: 'user_2: Sure, what problem are you working on?'
    sentences:
      - >-
        Gardening expert specializing in vegetable gardening techniques and
        plant care.
      - Event planner focusing on corporate events and wedding coordination.
      - >-
        Math tutor specializing in teaching and clarifying mathematical concepts
        and problem-solving.
  - source_sentence: 'user_2: Have you thought about getting some storage bins?'
    sentences:
      - Web developer focused on software engineering and application design.
      - >-
        Professional organizer specializing in home organization and
        decluttering strategies.
      - >-
        Pet behavior specialist who provides advice on dog breeds and training
        for small living spaces.
  - source_sentence: 'user_1: Maybe the national parks, I want to see some nature.'
    sentences:
      - >-
        Mental health counselor specializing in stress management and coping
        strategies.
      - Data analyst focusing on market trends and business intelligence.
      - >-
        Travel consultant specializing in road trip planning and national park
        itineraries.
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on sentence-transformers/all-MiniLM-L12-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L12-v2 on the semantic_triplets_round1 and inverse_semantic_triplets datasets. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L12-v2
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Training Datasets:
    • semantic_triplets_round1
    • inverse_semantic_triplets

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
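
The Pooling and Normalize modules listed above can be illustrated with a small, dependency-free sketch. It assumes the token embeddings and attention mask are already available as plain Python lists; `mean_pool` and `l2_normalize` are hypothetical helper names used only for illustration, not functions of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    return [value / max(count, 1) for value in total]

def l2_normalize(vector):
    """Scale a vector to unit length, mirroring what a Normalize step does."""
    norm = math.sqrt(sum(value * value for value in vector))
    return [value / norm for value in vector] if norm > 0 else list(vector)

# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_vector = l2_normalize(mean_pool(tokens, mask))
print(sentence_vector)
```

In the real model the vectors have 384 dimensions and the same two steps run on tensors; the toy sizes here only keep the arithmetic easy to follow.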

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

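# Download from the 🤗 Hub
# ("sentence_transformers_model_id" is a placeholder: substitute this model's Hub ID or a local path)
model = SentenceTransformer("sentence_transformers_model_id")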
# Run inference
sentences = [
    'user_1: Maybe the national parks, I want to see some nature.',
    'Travel consultant specializing in road trip planning and national park itineraries.',
    'Data analyst focusing on market trends and business intelligence.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
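
Because the model ends with a Normalize step, the cosine similarity of two embeddings equals their dot product. The sketch below ranks candidate descriptions for one query on that basis. It uses hand-written unit vectors in place of real `model.encode` output, and `rank_candidates` is a hypothetical helper name.

```python
def dot(a, b):
    """Dot product; equals cosine similarity when both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))

def rank_candidates(query_vector, candidate_vectors):
    """Return (label, score) pairs sorted from most to least similar."""
    scored = [
        (label, dot(query_vector, vector))
        for label, vector in candidate_vectors.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy unit vectors standing in for real 384-dimensional embeddings.
query = [1.0, 0.0]
candidates = {
    "Data analyst": [0.0, 1.0],
    "Travel consultant": [0.8, 0.6],
}
ranking = rank_candidates(query, candidates)
best_label, best_score = ranking[0]
print(best_label, best_score)
```

With the actual model, the query and candidate vectors would come from `model.encode`, and `model.similarity` would produce the same scores in one call.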

Training Details

Training Datasets

semantic_triplets_round1

  • Dataset: semantic_triplets_round1
  • Size: 422 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 422 samples:
    |      | anchor       | positive     | negative     |
    |------|--------------|--------------|--------------|
    | type | string       | string       | string       |
    | min  | 10 tokens    | 11 tokens    | 9 tokens     |
    | mean | 17.44 tokens | 14.17 tokens | 12.49 tokens |
    | max  | 33 tokens    | 26 tokens    | 20 tokens    |
  • Samples:
    | anchor | positive | negative |
    |--------|----------|----------|
    | user_1: Can anyone recommend a good app for tracking my expenses? | Personal finance advisor specializing in budgeting tools and expense tracking applications. | Fitness instructor focusing on workout plans and nutrition. |
    | user_1: Can anyone recommend a good workout routine for beginners? | Fitness trainer who specializes in creating beginner workout plans and exercise coaching. | Financial advisor focused on investment strategies and retirement planning. |
    | user_2: What kind of vegetables are you thinking of planting? | Gardening expert who provides guidance on vegetable gardening techniques and plant care. | Investment advisor specializing in stock market strategies and financial planning. |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
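
As a rough illustration of what these parameters control, the sketch below computes a MultipleNegativesRankingLoss-style objective on toy vectors. For each anchor, every positive and negative in the batch is a candidate, cosine similarities are multiplied by `scale`, and cross-entropy is taken against the anchor's own positive. `mnrl_loss` is a hypothetical name for this pure-Python version; the library itself works on batched tensors.

```python
import math

def cos_sim(a, b):
    """Cosine similarity of two vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnrl_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy where anchor i should pick positive i among all candidates."""
    candidates = positives + negatives
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, candidate) for candidate in candidates]
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

anchors = [[1.0, 0.0], [0.0, 1.0]]
matching = [[1.0, 0.0], [0.0, 1.0]]
opposite = [[-1.0, 0.0], [0.0, -1.0]]

good_loss = mnrl_loss(anchors, matching, opposite)  # positives align with anchors
bad_loss = mnrl_loss(anchors, opposite, matching)   # positives point the wrong way
print(good_loss, bad_loss)
```

A larger `scale` sharpens the softmax, so small differences in cosine similarity translate into larger differences in the loss.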
    

inverse_semantic_triplets

  • Dataset: inverse_semantic_triplets
  • Size: 427 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 427 samples:
    |      | anchor       | positive     | negative     |
    |------|--------------|--------------|--------------|
    | type | string       | string       | string       |
    | min  | 18 tokens    | 19 tokens    | 13 tokens    |
    | mean | 28.42 tokens | 40.04 tokens | 27.66 tokens |
    | max  | 46 tokens    | 72 tokens    | 62 tokens    |
  • Samples:
    | anchor | positive | negative |
    |--------|----------|----------|
    | UX researcher specializing in user experience design and user testing. Conducts research to understand user needs and improve product usability. | user_1: I'm looking for ways to improve the usability of our app.<br>user_2: Have you considered conducting user interviews? | user_1: I need to plan a trip to Europe next summer.<br>user_2: What countries are you thinking about visiting? |
    | Software developer specializing in web applications, proficient in various programming languages and frameworks. I design, develop, and maintain software solutions, focusing on user experience and functionality. | user_1: I'm trying to build a web application, but I'm stuck on how to integrate the backend with the frontend.<br>user_2: What technologies are you using for both?<br>user_1: I’m using Node.js for the backend and React for the frontend. | user_1: I'm looking for a good recipe for chocolate chip cookies.<br>user_2: I can share my favorite one! |
    | Marketing strategist who focuses on developing comprehensive marketing plans to drive brand engagement and sales growth. Specializes in digital marketing and content strategy. | user_1: I'm launching a new product and need a marketing strategy.<br>user_2: Have you set any goals for your campaign? | user_1: I'm looking for a new pair of running shoes.<br>user_2: What brand do you prefer? |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Evaluation Datasets

semantic_triplets_round1

  • Dataset: semantic_triplets_round1
  • Size: 47 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 47 samples:
    |      | anchor       | positive     | negative     |
    |------|--------------|--------------|--------------|
    | type | string       | string       | string       |
    | min  | 12 tokens    | 8 tokens     | 10 tokens    |
    | mean | 17.87 tokens | 14.32 tokens | 12.49 tokens |
    | max  | 30 tokens    | 27 tokens    | 16 tokens    |
  • Samples:
    | anchor | positive | negative |
    |--------|----------|----------|
    | user_1: What's the best way to train my puppy to stop barking? | Dog training specialist focused on behavioral issues and obedience training. | Financial advisor who specializes in investment strategies and wealth management. |
    | user_2: What vegetables do you want to grow? | Gardening expert specializing in vegetable gardening and sustainable practices. | Real estate agent focusing on home buying and selling. |
    | user_1: Anyone have tips on how to improve my running time for a 5k? | Running coach specializing in training plans and performance improvement. | Financial advisor focusing on investment strategies and retirement planning. |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

inverse_semantic_triplets

  • Dataset: inverse_semantic_triplets
  • Size: 48 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 48 samples:
    |      | anchor       | positive     | negative     |
    |------|--------------|--------------|--------------|
    | type | string       | string       | string       |
    | min  | 20 tokens    | 23 tokens    | 14 tokens    |
    | mean | 28.42 tokens | 39.71 tokens | 28.4 tokens  |
    | max  | 38 tokens    | 65 tokens    | 52 tokens    |
  • Samples:
    | anchor | positive | negative |
    |--------|----------|----------|
    | Graphic designer who specializes in creating visual content for brands, including logos, marketing materials, and user interfaces. Focuses on aesthetics, user experience, and brand identity. | user_1: I need help designing a logo for my new business.<br>user_2: Have you thought about hiring a graphic designer?<br>user_1: Yes, I want something that really represents my brand. | user_1: My car's making a weird noise, and I don't know what to do.<br>user_2: You should take it to a mechanic. |
    | Physical therapist specializing in rehabilitation for sports injuries, pain management, and improving mobility through tailored exercise programs. | user_1: I twisted my ankle playing basketball, and it's really swollen.<br>user_2: Have you seen a doctor about it? | user_1: I'm thinking of redecorating my living room.<br>user_2: What style are you going for? |
    | An accountant who specializes in financial record-keeping, tax preparation, and business consulting. Provides services to help clients manage their finances effectively and ensure compliance with tax regulations. | user_1: I need help with my taxes this year.<br>user_2: Are you looking for someone to prepare them for you? | user_1: I'm thinking about getting a puppy. |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
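
This card reports no evaluation metric. One simple way to score held-out triplets like those above is triplet accuracy: the share of triplets whose anchor is closer to its positive than to its negative. The sketch below assumes unit-length embeddings (which the Normalize module would give), so a dot product stands in for cosine similarity. `triplet_accuracy` is a hypothetical helper, and the vectors are toy values rather than results from this model.

```python
def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def triplet_accuracy(triplets):
    """Fraction of (anchor, positive, negative) triplets ranked correctly."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if dot(anchor, positive) > dot(anchor, negative)
    )
    return correct / len(triplets)

# Toy unit vectors: the first triplet is ranked correctly, the second is not.
toy_triplets = [
    ([1.0, 0.0], [0.8, 0.6], [0.0, 1.0]),
    ([1.0, 0.0], [0.0, 1.0], [0.6, 0.8]),
]
accuracy = triplet_accuracy(toy_triplets)
print(accuracy)
```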
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • batch_sampler: no_duplicates
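
To make these settings concrete, the sketch below estimates the number of optimizer steps and the learning rate at a few points of a linear schedule with warmup. It rests on assumptions not stated in the card: each batch is drawn from a single dataset, the last partial batch is kept, the no_duplicates sampler does not change the batch count, and warmup steps are rounded up. `linear_schedule_lr` is a hypothetical helper name.

```python
import math

def linear_schedule_lr(step, total_steps, warmup_steps, base_lr):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * remaining

batch_size = 16        # per_device_train_batch_size
learning_rate = 2e-05  # learning_rate
warmup_ratio = 0.1     # warmup_ratio
dataset_sizes = {
    "semantic_triplets_round1": 422,
    "inverse_semantic_triplets": 427,
}

# One epoch: batches per dataset, summed (assumes a single device).
total_steps = sum(math.ceil(size / batch_size) for size in dataset_sizes.values())
warmup_steps = math.ceil(warmup_ratio * total_steps)

for step in (0, 3, warmup_steps, total_steps):
    print(step, linear_schedule_lr(step, total_steps, warmup_steps, learning_rate))
```

Under these assumptions one epoch is 54 steps with 6 warmup steps. The exact count in a real run can differ slightly with the sampler and the number of devices.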

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
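
The `no_duplicates` batch sampler matters for a loss that uses in-batch negatives: if the same text appears twice in a batch, a correct match can be treated as a negative. The sketch below is a simplified greedy version of that idea, not the library's implementation. `no_duplicate_batches` is a hypothetical helper, and the sample texts are placeholders.

```python
def no_duplicate_batches(samples, batch_size):
    """Greedily build batches in which no text occurs more than once."""
    remaining = list(samples)
    batches = []
    while remaining:
        batch, seen, deferred = [], set(), []
        for sample in remaining:
            texts = set(sample)
            if len(batch) < batch_size and not (texts & seen):
                batch.append(sample)
                seen |= texts
            else:
                # Batch is full or the sample shares a text: try it again later.
                deferred.append(sample)
        batches.append(batch)
        remaining = deferred
    return batches

# Toy (anchor, positive, negative) triplets; the first two share a negative.
samples = [
    ("a1", "p1", "n1"),
    ("a2", "p2", "n1"),
    ("a3", "p3", "n2"),
]
batches = no_duplicate_batches(samples, batch_size=2)
print(batches)
```

The first sample of each pass always fits into the empty batch, so the loop is guaranteed to terminate.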

Framework Versions

  • Python: 3.12.9
  • Sentence Transformers: 4.1.0
  • Transformers: 4.52.4
  • PyTorch: 2.7.1
  • Accelerate: 1.8.1
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1
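
When reproducing results, it can help to compare the local environment against the versions listed above. The sketch below uses only the standard library. The PyPI distribution names in `EXPECTED` (for example `torch` for PyTorch) are assumptions about how the packages were installed, and `compare_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions taken from the list above, keyed by assumed PyPI distribution name.
EXPECTED = {
    "sentence-transformers": "4.1.0",
    "transformers": "4.52.4",
    "torch": "2.7.1",
    "accelerate": "1.8.1",
    "datasets": "3.6.0",
    "tokenizers": "0.21.1",
}

def compare_versions(expected):
    """Map each package to (expected version, installed version or None)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        report[package] = (wanted, installed)
    return report

report = compare_versions(EXPECTED)
for package, (wanted, installed) in report.items():
    status = "ok" if installed == wanted else "differs"
    print(f"{package}: expected {wanted}, installed {installed} ({status})")
```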

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}