zacbrld committed
Commit ec7ac6a · verified · 1 Parent(s): 68c1a3f

Add new SentenceTransformer model
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
README.md ADDED
@@ -0,0 +1,524 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:34235
+ - loss:MultipleNegativesRankingLoss
+ base_model: zacbrld/MNLP_M2_document_encoder
+ widget:
+ - source_sentence: What is This?
+   sentences:
+   - Bellcranks are also seen in automotive applications, such as in the linkage connecting
+     the throttle pedal to the carburetor or connecting the brake pedal to the master
+     cylinder In vehicle suspensions, bellcranks are used in pullrod and pushrod suspensions
+     in cars or in the Christie suspension in tanks More vertical suspension designs
+     such as MacPherson struts may not be feasible in some vehicle designs due to space,
+     aerodynamic, or other design constraints; bellcranks translate the vertical motion
+     of the wheel into horizontal motion, allowing the suspension to be mounted transversely
+     or longitudinally within the vehicle
+   - DynaMo was also used as the face of the BBC's parental assistance website This
+     was created for parents to assist children with homework There was also a section
+     called "DynaMo's Den" which included educational games for children The website
+     was activated on 2 October 1998
+   - "The diode equation above is an example of an element constitutive equation of\
+     \ the general form f(v, i) = 0. This can be thought of as a non-linear resistor.\
+     \ The corresponding constitutive equations for non-linear inductors and capacitors\
+     \ are respectively f(v, φ) = 0 and f(v, q) = 0, where f is any arbitrary function,\
+     \ φ is the stored magnetic flux and q is the stored charge"
+ - source_sentence: algorithm explanation
+   sentences:
+   - 'Descriptive statistics
+
+     Average
+
+     Mean
+
+     Median
+
+     Mode
+
+     Measures of scale
+
+     Variance
+
+     Standard deviation
+
+     Median absolute deviation
+
+     Correlation
+
+     Polychoric correlation
+
+     Outlier
+
+     Statistical graphics
+
+     Histogram
+
+     Frequency distribution
+
+     Quantile
+
+     Survival function
+
+     Failure rate
+
+     Scatter plot
+
+     Bar chart'
+   - 'The various fields and topics that projects engineers are involved with include:
+
+
+     Work breakdown structure: a deliverable-oriented breakdown of a project into smaller
+     components
+
+     Gantt chart: type of bar chart that illustrates a project schedule
+
+     Critical Path Analysis: an algorithm for scheduling a set of project activities
+
+     Program evaluation and review technique: a statistical tool which was designed
+     to analyze and represent the tasks involved in completing a given project
+
+     Graphical Evaluation and Review Technique: network analysis technique that allows
+     probabilistic treatment both network logic and estimation of activity duration
+
+     Petri Nets: one of several mathematical modeling languages for the description
+     of distributed systems'
+   - Jessiko was marketed as a luxury decoration for businesses such as hotels, restaurants,
+     and museums Tiraby expressed hope that one day it would be common to find his
+     invention in household ponds and swimming pools
+ - source_sentence: 'The firm was founded as SECOR Ltd in 1994 by John Leeson, Alan
+     Sheppard, and David Richards After establishing the company in Oxford, United
+     Kingdom, in 1994, David oversaw the growth of the business from a small UK operator
+     into an environmental consultancies in the UK, with international operations across
+     Africa, Australasia, Canada, Europe, and the US
+
+     In 2000, the senior management team completed a management buyout and the company''s
+     name was changed to SLR Consulting Limited In 2004 they secured funding from Livingbridge,
+     who invested £4 85 million as part of a £13 million investment including other
+     partners, and took a significant minority stake in the company In 2008, 3i invested
+     £32 5 million in the firm, and replaced Livingbridge with a significant minority
+     stake In March 2018, Charterhouse Capital Partners (CCP) acquired a majority shareholding
+     in the business In June 2022 Charterhouse Capital Partners agreed to a sale of
+     SLR Consulting to Ares Management private equity partners David Richards was Chief
+     Executive Officer from 1994–2013 In line with the Group''s succession plans, Neil
+     Penhall, formerly Managing Director of SLR Consulting and an Executive Director
+     of SLR Management, assumed the role of CEO'
+   sentences:
+   - 'Institute for Transuranium Elements (ITU)
+
+     Institute for the Protection and the Security of the Citizen (IPSC)
+
+     Institute for Environment and Sustainability (IES)
+
+     Institute for Health and Consumer Protection (IHCP)
+
+     Institute for Energy (IE)
+
+     Institute for Prospective Technological Studies (IPTS)'
+   - Project NExT was founded by James (Jim) Leitzel (Ohio State University) and Chris
+     Stevens (Saint Louis University) The first fellows were selected in 1994 Jim Leitzel
+     died in 1998, and Aparna Higgins (University of Dayton) and Joe Gallian (University
+     of Minnesota Duluth) became co-directors of Project NExT Chris Stevens stepped
+     down as director in 2010, and was succeeded by Aparna Higgins and Joe Gallian
+     Judith Covington (Louisiana State University, Shreveport) and Gavin LaRose (University
+     of Michigan) first served as Associate Co-Directors and later became Co-Directors
+     In 2007, the total number of fellows surpassed 1000 By 2017 the total number of
+     fellows reached 1700 In 2023 Christine Kelley became director
+   - Quantum secure communication is a method that is expected to be 'quantum safe'
+     in the advent of quantum computing systems that could break current cryptography
+     systems using methods such as Shor's algorithm These methods include quantum key
+     distribution (QKD), a method of transmitting information using entangled light
+     in a way that makes any interception of the transmission obvious to the user Another
+     method is the quantum random number generator, which is capable of producing truly
+     random numbers unlike non-quantum algorithms that merely imitate randomness
+ - source_sentence: chemical reaction
+   sentences:
+   - With suitably encoded scales (multitrack, vernier, digital code, or pseudo-random
+     code) an encoder can determine its position without movement or needing to find
+     a reference position Such absolute encoders also communicate using serial communication
+     protocols Many of these protocols are proprietary (e g , Fanuc, Mitsubishi, FeeDat
+     (Fagor Automation), Heidenhain EnDat, DriveCliq, Panasonic, Yaskawa) but open
+     standards such as BiSS are now appearing, which avoid tying users to a particular
+     supplier
+   - Bonneau, Pierre; Allens, Gaspard d' (2020) Cent mille ans Bure ou le scandale
+     enfoui des déchets nucléaires [One hundred thousand years Bure, or the buried
+     scandal of nuclear waste] Illustrated by Cécile Guillard La Revue dessinée - Seuil
+     ISBN 978-2-02-145982-1
+   - The reason why MACE is heavily researched is that it allows completely anisotropic
+     etching of silicon substrates which is not possible with other wet chemical etching
+     methods (see figure to the right) Usually the silicon substrate is covered with
+     a protective layer such as photoresist before it is immersed in an etching solution
+     The etching solution usually has no preferred direction of attacking the substrate,
+     therefore isotropic etching takes place In semiconductor engineering, however
+     it is often required that the sidewalls of the etched trenches are steep This
+     is usually realized with methods that operate in the gas-phase such as reactive
+     ion etching These methods require expensive equipment compared to simple wet etching
+     MACE, in principle allows the fabrication of steep trenches but is still cheap
+     compared to gas-phase etching methods
+ - source_sentence: synthesis method
+   sentences:
+   - STEMNET used to receive funding from the Department for Education and Skills Since
+     June 2007, it receives funding from the Department for Children, Schools and Families
+     and Department for Innovation, Universities and Skills, since STEMNET sits on
+     the chronological dividing point (age 16) of both of the new departments
+   - The Arab States of the Persian Gulf plan to start their own joint civilian nuclear
+     program An agreement in the final days of the Bush administration provided for
+     cooperation between the United Arab Emirates and the United States of America
+     in which the United States would sell the UAE nuclear reactors and nuclear fuel
+     The UAE would, in return, renounce their right to enrich uranium for their civilian
+     nuclear program At the time of signing, this agreement was touted as a way to
+     reduce risks of nuclear proliferation in the Persian Gulf However, Mustafa Alani
+     of the Dubai-based Gulf Research Center stated that, should the Nuclear Non-Proliferation
+     Treaty collapse, nuclear reactors such as those slated to be sold to the UAE under
+     this agreement could provide the UAE with a path toward a nuclear weapon, raising
+     the specter of further nuclear proliferation In March 2007, foreign ministers
+     of the six-member Gulf Cooperation Council met in Saudi Arabia to discuss progress
+     in plans agreed in December 2006, for a joint civilian nuclear program
+   - Timber framing dates back thousands of years, and has been used in many parts
+     of the world during various periods such as ancient Japan, Europe and medieval
+     England in localities where timber was in good supply and building stone and the
+     skills to work it were not The use of timber framing in buildings provides their
+     complete skeletal framing which offers some structural benefits as the timber
+     frame, if properly engineered, lends itself to better seismic survivability
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ ---
+
+ # SentenceTransformer based on zacbrld/MNLP_M2_document_encoder
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [zacbrld/MNLP_M2_document_encoder](https://huggingface.co/zacbrld/MNLP_M2_document_encoder). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [zacbrld/MNLP_M2_document_encoder](https://huggingface.co/zacbrld/MNLP_M2_document_encoder) <!-- at revision 6f1d702dcb1d5e9fd30b691c84fadd9a1704a148 -->
+ - **Maximum Sequence Length:** 256 tokens
+ - **Output Dimensionality:** 384 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
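+
+ For reference, this is what the three modules do under the hood. A minimal sketch of the equivalent plain `transformers` pipeline, assuming the checkpoint loads directly with `AutoModel` (the repository name is taken from the usage example below):
+
+ ```python
+ import torch
+ import torch.nn.functional as F
+ from transformers import AutoModel, AutoTokenizer
+
+ model_name = "zacbrld/MNLP_M3_document_encoder_V1"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModel.from_pretrained(model_name)
+
+ # (0) Transformer: tokenize with max_seq_length 256 and encode
+ inputs = tokenizer(["an example sentence"], padding=True, truncation=True,
+                    max_length=256, return_tensors="pt")
+ with torch.no_grad():
+     token_embeddings = model(**inputs).last_hidden_state  # [batch, seq_len, 384]
+
+ # (1) Pooling: mean over non-padding tokens (pooling_mode_mean_tokens: true)
+ mask = inputs["attention_mask"].unsqueeze(-1).float()
+ embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
+
+ # (2) Normalize: L2-normalize so a dot product equals cosine similarity
+ embeddings = F.normalize(embeddings, p=2, dim=1)
+ ```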
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("zacbrld/MNLP_M3_document_encoder_V1")
+ # Run inference
+ sentences = [
+     'synthesis method',
+     'STEMNET used to receive funding from the Department for Education and Skills Since June 2007, it receives funding from the Department for Children, Schools and Families and Department for Innovation, Universities and Skills, since STEMNET sits on the chronological dividing point (age 16) of both of the new departments',
+     'The Arab States of the Persian Gulf plan to start their own joint civilian nuclear program An agreement in the final days of the Bush administration provided for cooperation between the United Arab Emirates and the United States of America in which the United States would sell the UAE nuclear reactors and nuclear fuel The UAE would, in return, renounce their right to enrich uranium for their civilian nuclear program At the time of signing, this agreement was touted as a way to reduce risks of nuclear proliferation in the Persian Gulf However, Mustafa Alani of the Dubai-based Gulf Research Center stated that, should the Nuclear Non-Proliferation Treaty collapse, nuclear reactors such as those slated to be sold to the UAE under this agreement could provide the UAE with a path toward a nuclear weapon, raising the specter of further nuclear proliferation In March 2007, foreign ministers of the six-member Gulf Cooperation Council met in Saudi Arabia to discuss progress in plans agreed in December 2006, for a joint civilian nuclear program',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 384]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
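+
+ Because the embeddings are L2-normalized and the similarity function is cosine, ranking a corpus against a query reduces to a dot product. A small hypothetical semantic-search sketch (the query and corpus strings are illustrative only):
+
+ ```python
+ query_embedding = model.encode(["timber construction techniques"])
+ corpus_embeddings = model.encode([
+     "Timber framing dates back thousands of years",
+     "Quantum key distribution makes interception of a transmission obvious",
+ ])
+ scores = model.similarity(query_embedding, corpus_embeddings)  # shape [1, 2]
+ best_idx = int(scores.argmax(dim=1))  # index of the best-matching corpus entry
+ ```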
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 34,235 training samples
+ * Columns: <code>sentence_0</code> and <code>sentence_1</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | sentence_0 | sentence_1 |
+   |:--------|:-----------|:-----------|
+   | type    | string     | string     |
+   | details | <ul><li>min: 3 tokens</li><li>mean: 21.24 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 34 tokens</li><li>mean: 133.62 tokens</li><li>max: 256 tokens</li></ul> |
+ * Samples:
+   | sentence_0 | sentence_1 |
+   |:-----------|:-----------|
+   | <code>chemistry experiment</code> | <code>Since 1982, research has been conducted to develop technologies, commonly referred to as electronic noses, that could detect and recognize odors and flavors Application areas include food, medicine and the environment</code> |
+   | <code>quantum physics</code> | <code>Hydro electric - Hydro-electric turbomachinery uses potential energy stored in water to flow over an open impeller to turn a generator which creates electricity<br>Steam turbines - Steam turbines used in power generation come in many different variations The overall principle is high pressure steam is forced over blades attached to a shaft, which turns a generator As the steam travels through the turbine, it passes through smaller blades causing the shaft to spin faster, creating more electricity Gas turbines - Gas turbines work much like steam turbines Air is forced in through a series of blades that turn a shaft Then fuel is mixed with the air and causes a combustion reaction, increasing the power This then causes the shaft to spin faster, creating more electricity Windmills - Also known as a wind turbine, windmills are increasing in popularity for their ability to efficiently use the wind to generate electricity Although they come in many shapes and sizes, the most common one is the la...</code> |
+   | <code>physics law</code> | <code>Backlash in gear couplings allows for slight angular misalignment There can be significant backlash in unsynchronized transmissions because of the intentional gap between the dogs in dog clutches The gap is necessary to engage dogs when input shaft (engine) speed and output shaft (driveshaft) speed are imperfectly synchronized If there was a smaller clearance, it would be nearly impossible to engage the gears because the dogs would interfere with each other in most configurations In synchronized transmissions, synchromesh solves this problem However, backlash is undesirable in precision positioning applications such as machine tool tables It can be minimized by choosing ball screws or leadscrews with preloaded nuts, and mounting them in preloaded bearings A preloaded bearing uses a spring and/or a second bearing to provide a compressive axial force that maintains bearing surfaces in contact despite reversal of the load direction</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
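+
+ For orientation, a minimal training sketch consistent with this loss and the non-default hyperparameters listed below; the single placeholder pair stands in for the 34,235 real samples and is not the actual training data:
+
+ ```python
+ from datasets import Dataset
+ from sentence_transformers import (SentenceTransformer, SentenceTransformerTrainer,
+                                    SentenceTransformerTrainingArguments)
+ from sentence_transformers.losses import MultipleNegativesRankingLoss
+
+ # Placeholder pair; the real dataset has columns sentence_0 and sentence_1.
+ train_dataset = Dataset.from_dict({
+     "sentence_0": ["chemistry experiment"],
+     "sentence_1": ["Research has been conducted to develop electronic noses."],
+ })
+
+ model = SentenceTransformer("zacbrld/MNLP_M2_document_encoder")
+ loss = MultipleNegativesRankingLoss(model, scale=20.0)  # cos_sim is the default similarity_fct
+
+ args = SentenceTransformerTrainingArguments(
+     output_dir="outputs",
+     num_train_epochs=2,
+     per_device_train_batch_size=8,
+ )
+
+ SentenceTransformerTrainer(model=model, args=args,
+                            train_dataset=train_dataset, loss=loss).train()
+ ```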
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `num_train_epochs`: 2
+ - `multi_dataset_batch_sampler`: round_robin
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: no
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 8
+ - `per_device_eval_batch_size`: 8
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1
+ - `num_train_epochs`: 2
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`: 
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: round_robin
+
+ </details>
+
+ ### Training Logs
+ | Epoch | Step | Training Loss |
+ |:------:|:----:|:-------------:|
+ | 0.1168 | 500 | 1.465 |
+ | 0.2336 | 1000 | 1.189 |
+ | 0.3505 | 1500 | 1.1209 |
+ | 0.4673 | 2000 | 1.0333 |
+ | 0.5841 | 2500 | 0.993 |
+ | 0.7009 | 3000 | 0.9573 |
+ | 0.8178 | 3500 | 0.9275 |
+ | 0.9346 | 4000 | 0.9177 |
+ | 1.0514 | 4500 | 0.8241 |
+ | 1.1682 | 5000 | 0.7726 |
+ | 1.2850 | 5500 | 0.7685 |
+ | 1.4019 | 6000 | 0.7623 |
+ | 1.5187 | 6500 | 0.7668 |
+ | 1.6355 | 7000 | 0.7556 |
+ | 1.7523 | 7500 | 0.7002 |
+ | 1.8692 | 8000 | 0.7363 |
+ | 1.9860 | 8500 | 0.7396 |
+
+
+ ### Framework Versions
+ - Python: 3.12.8
+ - Sentence Transformers: 3.4.1
+ - Transformers: 4.52.2
+ - PyTorch: 2.7.0+cu126
+ - Accelerate: 1.3.0
+ - Datasets: 3.2.0
+ - Tokenizers: 0.21.0
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 6,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.52.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.4.1",
+     "transformers": "4.52.2",
+     "pytorch": "2.7.0+cu126"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:42ab139d799878bacbe9a0fb469dd44bdc78dcc0a5805d6b8154429956ccbe85
+ size 90864192
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 256,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,65 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "max_length": 128,
+   "model_max_length": 256,
+   "never_split": null,
+   "pad_to_multiple_of": null,
+   "pad_token": "[PAD]",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "[SEP]",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff