SentenceTransformer based on AI-Growth-Lab/PatentSBERTa

This is a sentence-transformers model fine-tuned from AI-Growth-Lab/PatentSBERTa. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: AI-Growth-Lab/PatentSBERTa
  • Maximum Sequence Length: 384 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False, 'architecture': 'MPNetModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
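
The Pooling module above has only pooling_mode_cls_token enabled, so the sentence embedding is the hidden state of the first token rather than an average over tokens. The following is a minimal sketch of that difference in plain Python. The token vectors are made-up 4-dimensional values (the real model uses 768 dimensions), the function names are hypothetical, and the mean-pooling helper ignores padding masks for brevity.

```python
# Illustrative token embeddings for one input: 3 tokens x 4 dims.
# These numbers are invented; the real model produces 768-dim vectors.
token_embeddings = [
    [0.1, 0.2, 0.3, 0.4],  # first token (the CLS position)
    [0.5, 0.5, 0.5, 0.5],
    [0.9, 0.1, 0.0, 0.2],
]


def cls_pooling(tokens):
    """Return a copy of the first token's embedding (CLS-style pooling)."""
    return list(tokens[0])


def mean_pooling(tokens):
    """Average each dimension over all tokens; shown only for contrast."""
    n = len(tokens)
    return [sum(column) / n for column in zip(*tokens)]


sentence_embedding = cls_pooling(token_embeddings)
print(sentence_embedding)
print(mean_pooling(token_embeddings))
```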

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("GiacomoSignorile/PatentBert-FineTuned")
# Run inference
sentences = [
    'High reduction and backdrivability ACTUATion SYSTEM with minimal lateral encumbrance. Title: High reduction and backdrivability ACTUATion SYSTEM with minimal lateral encumbrance\n\nAbstract: \n\nBusiness Description: Typical robotic actuation units have a motor and its transmission system aligned with the axis of the joint they actuate, resulting in large lateral encumbrance of the whole system.\n\nThis solution aims at solving this problem and allows to locate the last transmission stage at an arbitrary distance from the axis of the screw.\n\nTech Features: This invention consists in a class of actuation systems which is specifically designed to minimize the lateral encumbrance of the exoskeletal system to maximize its practical usability. Its core components are a motor coupled to a leadscrew or ballscrew system and a further (arbitrary) transmission system to connect the nut of the screw based transmission to the output wheel.\n\nFurthermore, this solution allows to displace the location of the screw with respect to the final transmission stage, allowing the adaptation of the location of the different stages of the system without influencing the overall behavior.The designed anti-blockage system guarantees proper functioning of each mechanical component and high back-drivability of the overall system even for high overall gearing.\n\nApplications: Robotics.\n\nAdvantages: Minimal lateral encumbrance; High gearing; Highly Back-drivable; Torque estimation through current measurement; Capability of dislocating the screw from the point of application of force.',
    'High reduction and backdrivability ACTUATion SYSTEM with minimal lateral encumbrance. Title: High reduction and backdrivability ACTUATion SYSTEM with minimal lateral encumbrance\n\nAbstract: \n\nBusiness Description: Typical robotic actuation units have a motor and its transmission system aligned with the axis of the joint they actuate, resulting in large lateral encumbrance of the whole system.\n\nThis solution aims at solving this problem and allows to locate the last transmission stage at an arbitrary distance from the axis of the screw.\n\nTech Features: This invention consists in a class of actuation systems which is specifically designed to minimize the lateral encumbrance of the exoskeletal system to maximize its practical usability. Its core components are a motor coupled to a leadscrew or ballscrew system and a further (arbitrary) transmission system to connect the nut of the screw based transmission to the output wheel.\n\nFurthermore, this solution allows to displace the location of the screw with respect to the final transmission stage, allowing the adaptation of the location of the different stages of the system without influencing the overall behavior.The designed anti-blockage system guarantees proper functioning of each mechanical component and high back-drivability of the overall system even for high overall gearing.\n\nApplications: Robotics.\n\nAdvantages: Minimal lateral encumbrance; High gearing; Highly Back-drivable; Torque estimation through current measurement; Capability of dislocating the screw from the point of application of force.',
    'Mechanical towing for automatic vehicle convoys. Title: Mechanical towing for automatic vehicle convoys\n\nAbstract: \n\nBusiness Description: The patent allows a mechanical coupling between vehicles in order to guarantee the safety and operation of a convoy of even 10 vehicles capable of circulating as if it were a single vehicle. This technology therefore allows a single driver in the head vehicle to automatically control the convoy of vehicles connected, revolutionizing the local transport systems with transport services that would otherwise not be feasible.\n\nTech Features: The patented mobile mechanical connection couples two vehicles and transforms them into a convoy that moves like a single unit. By means of an automatic guidance system, the coupled vehicles are able to synchronize automatically with the movements of the head vehicle, ensuring compliance with the trajectory of the head vehicle. The mechanical connection also acts as a safeguard in case of malfunctions and motion "harmonizer". This system allows single vehicles to form convoys driven by the driver only on the leading vehicle: free-flow car-sharing vehicles can be relocated from areas where they would be stationed for a long time to areas where there is demand, forming a convoy also of 10 vehicles and moving them with one driver; buses can be extended for the central sections of the transport lines with the highest demand without increasing the drivers; transport systems can be created in which small buses gather people on call in the suburbs and form a single convoy that crosses the center with a single driver.\nThe patented coupling mechanism allows passenger transport companies to re-organize their services by means of convoys of vehicles which combine the maintenance of the optimum passenger capacity and the social requirements imposed by pandemic of COVID-19 and the containment measures taken.\n\nApplications: Car-sharing services; Logistics and goods distribution companies; Public and private transport with variable capacity.\n\nAdvantages: Certified transport system; Responsible automation; Flexible and client-oriented transport system; Efficiency of transport and sharing systems; Management cost reduction, creating capillary and high capacity services.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 1.0000, 0.5066],
#         [1.0000, 1.0000, 0.5066],
#         [0.5066, 0.5066, 1.0000]])
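
The similarity function listed for this model is cosine similarity, which is why the two identical inputs above score 1.0000 against each other. As a rough illustration of what that matrix contains, here is a dependency-free sketch. The helper names and the 3-dimensional toy vectors are assumptions for illustration and stand in for the 768-dimensional embeddings; this is not the library's implementation.

```python
import math


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings):
    """Pairwise cosine similarities between every pair of rows."""
    return [[cosine_similarity(u, v) for v in embeddings] for u in embeddings]


# Toy vectors (hypothetical); the first two are identical, like the first
# two inputs in the usage example.
toy = [[1.0, 2.0, 0.0], [1.0, 2.0, 0.0], [0.0, 1.0, 1.0]]
sims = similarity_matrix(toy)
for row in sims:
    print([round(x, 4) for x in row])
```

Identical rows give a similarity of 1.0 regardless of their length, and the matrix is symmetric, which matches the shape of the printed tensor above.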

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.0394
cosine_accuracy@3 0.0878
cosine_accuracy@5 0.1412
cosine_accuracy@10 0.2036
cosine_precision@1 0.0394
cosine_precision@3 0.0293
cosine_precision@5 0.0282
cosine_precision@10 0.0204
cosine_recall@1 0.0394
cosine_recall@3 0.0878
cosine_recall@5 0.1412
cosine_recall@10 0.2036
cosine_ndcg@10 0.1085
cosine_mrr@10 0.0797
cosine_map@100 0.0898
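
In the table above, accuracy@k and recall@k coincide, and precision@k equals accuracy@k divided by k (for example 0.0878 / 3 ≈ 0.0293). That pattern is what one would expect if every query has exactly one relevant document. The card does not state this, so treat it as an assumption. Under that assumption, the per-query metrics can be sketched as follows; the function names and the toy rankings are hypothetical.

```python
import math


def ir_metrics(ranked_ids, relevant_id, k):
    """Top-k metrics for one query, assuming exactly one relevant document."""
    top_k = ranked_ids[:k]
    hit = relevant_id in top_k
    rank = top_k.index(relevant_id) + 1 if hit else None
    return {
        "accuracy": 1.0 if hit else 0.0,
        "precision": 1.0 / k if hit else 0.0,
        "recall": 1.0 if hit else 0.0,  # one relevant doc, so recall == accuracy
        "mrr": 1.0 / rank if hit else 0.0,
        # With a single relevant document the ideal DCG is 1, so NDCG == DCG.
        "ndcg": 1.0 / math.log2(rank + 1) if hit else 0.0,
    }


def average(per_query):
    """Mean of each metric over all queries."""
    keys = per_query[0].keys()
    return {key: sum(q[key] for q in per_query) / len(per_query) for key in keys}


# Hypothetical rankings: (retrieved ids in rank order, the relevant id).
queries = [
    (["d3", "d1", "d7"], "d1"),  # relevant document at rank 2
    (["d2", "d5", "d9"], "d2"),  # relevant document at rank 1
    (["d4", "d6", "d8"], "d0"),  # relevant document not retrieved
]
scores = average([ir_metrics(ranked, rel, k=3) for ranked, rel in queries])
print(scores)
```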

Training Details

Training Dataset

Unnamed Dataset

  • Size: 6,288 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
    • sentence_0: type string; min: 4 tokens, mean: 12.3 tokens, max: 35 tokens
    • sentence_1: type string; min: 8 tokens, mean: 12.08 tokens, max: 36 tokens
  • Samples:
    • sentence_0: Epigenetic regulation system for the control of target gene expression
      sentence_1: Applicant/Organization: FONDAZIONE ISTITUTO ITALIANO DI TECNOLOGIA
    • sentence_0: System integrating a membrane humidifier and an adsorption-based storage for polymer membrane hydrogen fuel cell applications.
      sentence_1: Technical Classification: H01M
    • sentence_0: Superconducting bipolar thermoelectric memory
      sentence_1: Technical Classification: G11C_11
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
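
To make the loss parameters above concrete, here is a minimal sketch of the idea behind MultipleNegativesRankingLoss: within a batch, each sentence_0 is scored against every sentence_1 by cosine similarity multiplied by the scale (20.0), and cross-entropy pushes the matching pair above the in-batch negatives. This is a plain-Python illustration under those assumptions, not the library's code. The function names and toy vectors are hypothetical.

```python
import math


def cos_sim(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy 2-dim batch (hypothetical values).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]  # each positive is close to its own anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]  # each positive is close to the other anchor

loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

The aligned batch yields a loss near zero while the swapped batch is heavily penalized. A larger scale sharpens that contrast.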
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 12
  • per_device_eval_batch_size: 12
  • num_train_epochs: 10
  • multi_dataset_batch_sampler: round_robin
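
As a quick sanity check on these settings, the number of optimizer steps per epoch can be derived from the dataset size and batch size. The sketch below assumes a single training device, which the card does not state, and uses gradient_accumulation_steps: 1 from the full hyperparameter list. The result agrees with the training log, where step 524 is reported as epoch 1.0.

```python
import math

train_samples = 6288  # "Size" in the Training Dataset section
batch_size = 12       # per_device_train_batch_size
grad_accum = 1        # gradient_accumulation_steps
num_devices = 1       # assumption: single device (not stated in the card)
num_epochs = 10       # num_train_epochs

steps_per_epoch = math.ceil(train_samples / (batch_size * num_devices * grad_accum))
total_steps = steps_per_epoch * num_epochs
print(steps_per_epoch, total_steps)  # 524 5240

# Fractional epoch at an evaluation step, as printed in the log's first column.
print(round(50 / steps_per_epoch, 4))  # 0.0954
```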

All Hyperparameters

  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 12
  • per_device_eval_batch_size: 12
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_ratio: None
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • enable_jit_checkpoint: False
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • use_cpu: False
  • seed: 42
  • data_seed: None
  • bf16: False
  • fp16: False
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: -1
  • ddp_backend: None
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • auto_find_batch_size: False
  • full_determinism: False
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • use_cache: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
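
With lr_scheduler_type: linear, warmup_steps: 0 and learning_rate: 5e-05, the learning rate decays in a straight line from its initial value to zero over the planned run. The sketch below illustrates that schedule. It assumes 5,240 total steps (524 steps per epoch, as implied by the training log, times 10 epochs), and the function name is hypothetical rather than a library call.

```python
def linear_lr(step, base_lr=5e-05, warmup_steps=0, total_steps=5240):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 524, 2620, 5240):
    print(step, linear_lr(step))
```

Under these assumptions the rate is halved by the midpoint of the planned run (step 2620) and reaches zero at step 5240.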

Training Logs

Epoch Step Training Loss mnrl-val_cosine_ndcg@10
1.6835 500 0.0001 -
0.0954 50 - 0.0077
0.1908 100 - 0.0105
0.2863 150 - 0.0147
0.3817 200 - 0.0205
0.4771 250 - 0.0240
0.5725 300 - 0.0324
0.6679 350 - 0.0348
0.7634 400 - 0.0332
0.8588 450 - 0.0445
0.9542 500 2.3022 0.0491
1.0 524 - 0.0473
1.0496 550 - 0.0479
1.1450 600 - 0.0491
1.2405 650 - 0.0466
1.3359 700 - 0.0593
1.4313 750 - 0.0547
1.5267 800 - 0.0516
1.6221 850 - 0.0596
1.7176 900 - 0.0596
1.8130 950 - 0.0681
1.9084 1000 1.9086 0.0667
2.0 1048 - 0.0666
2.0038 1050 - 0.0686
2.0992 1100 - 0.0732
2.1947 1150 - 0.0686
2.2901 1200 - 0.0772
2.3855 1250 - 0.0752
2.4809 1300 - 0.0803
2.5763 1350 - 0.0721
2.6718 1400 - 0.0779
2.7672 1450 - 0.0745
2.8626 1500 1.6854 0.0866
2.9580 1550 - 0.0837
3.0 1572 - 0.0788
3.0534 1600 - 0.0752
3.1489 1650 - 0.0788
3.2443 1700 - 0.0845
3.3397 1750 - 0.0898
3.4351 1800 - 0.0920
3.5305 1850 - 0.0877
3.6260 1900 - 0.0926
3.7214 1950 - 0.0858
3.8168 2000 1.5140 0.0889
3.9122 2050 - 0.0882
4.0 2096 - 0.0848
4.0076 2100 - 0.0828
4.1031 2150 - 0.0910
4.1985 2200 - 0.0928
4.2939 2250 - 0.0913
4.3893 2300 - 0.0923
4.4847 2350 - 0.0888
4.5802 2400 - 0.0882
4.6756 2450 - 0.0987
4.7710 2500 1.3415 0.0954
4.8664 2550 - 0.0911
4.9618 2600 - 0.0932
5.0 2620 - 0.0887
5.0573 2650 - 0.0952
5.1527 2700 - 0.0954
5.2481 2750 - 0.0972
5.3435 2800 - 0.0957
5.4389 2850 - 0.0999
5.5344 2900 - 0.0964
5.6298 2950 - 0.0980
5.7252 3000 1.2411 0.0959
5.8206 3050 - 0.0943
5.9160 3100 - 0.0963
6.0 3144 - 0.0914
6.0115 3150 - 0.0915
6.1069 3200 - 0.0974
6.2023 3250 - 0.1019
6.2977 3300 - 0.1014
6.3931 3350 - 0.1037
6.4885 3400 - 0.0987
6.5840 3450 - 0.1010
6.6794 3500 1.1304 0.1064
6.7748 3550 - 0.1085
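
The log rows follow the layout "epoch, step, training loss, validation NDCG@10", with "-" marking a value that was not recorded at that step. A small parser like the following sketch can pull out the best validation score. It uses only a subset of the rows above, copied verbatim, and the helper name is hypothetical.

```python
# A subset of the rows from the training log above, copied verbatim.
log_rows = """\
0.9542 500 2.3022 0.0491
1.9084 1000 1.9086 0.0667
2.8626 1500 1.6854 0.0866
3.8168 2000 1.5140 0.0889
4.7710 2500 1.3415 0.0954
5.7252 3000 1.2411 0.0959
6.6794 3500 1.1304 0.1064
6.7748 3550 - 0.1085
"""


def parse_row(line):
    """Split one log row into typed fields; '-' becomes None."""
    epoch, step, loss, ndcg = line.split()
    return {
        "epoch": float(epoch),
        "step": int(step),
        "loss": None if loss == "-" else float(loss),
        "ndcg": None if ndcg == "-" else float(ndcg),
    }


rows = [parse_row(line) for line in log_rows.splitlines() if line.strip()]
best = max((r for r in rows if r["ndcg"] is not None), key=lambda r: r["ndcg"])
print(best["step"], best["ndcg"])  # 3550 0.1085
```

The best score in this subset, 0.1085 at step 3550, is the same value reported as cosine_ndcg@10 in the Metrics section.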

Framework Versions

  • Python: 3.12.12
  • Sentence Transformers: 5.2.3
  • Transformers: 5.0.0
  • PyTorch: 2.10.0+cu128
  • Accelerate: 1.13.0
  • Datasets: 4.0.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}