Upload folder using huggingface_hub


Files changed (10)

1_Pooling/config.json +10 -0
README.md +470 -3
config.json +24 -0
config_sentence_transformers.json +10 -0
modules.json +20 -0
sentence_bert_config.json +4 -0
special_tokens_map.json +37 -0
tokenizer.json +0 -0
tokenizer_config.json +56 -0
vocab.txt +0 -0

1_Pooling/config.json ADDED

	@@ -0,0 +1,10 @@

+{
+  "word_embedding_dimension": 1024,
+  "pooling_mode_cls_token": false,
+  "pooling_mode_mean_tokens": true,
+  "pooling_mode_max_tokens": false,
+  "pooling_mode_mean_sqrt_len_tokens": false,
+  "pooling_mode_weightedmean_tokens": false,
+  "pooling_mode_lasttoken": false,
+  "include_prompt": true
+}
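
The pooling config above enables only `pooling_mode_mean_tokens`, so each sentence embedding is the average of its token vectors. The sketch below illustrates that idea in plain Python, followed by the L2 normalization that the `Normalize` module in this commit applies. It is a minimal illustration, not the library's implementation: the function names `mean_pool` and `l2_normalize` are hypothetical, the vectors are 2-dimensional toy values rather than 1024-dimensional, and it assumes padding positions are excluded through an attention mask.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, skipping padding."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [t / count for t in total]

def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit length so dot product equals cosine similarity."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]

# Toy input: two real tokens and one padding token (mask 0).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)   # the padding vector is ignored
unit = l2_normalize(pooled)
print(pooled, unit)
```

Because the output is unit length, cosine similarity between two embeddings reduces to a dot product.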

README.md CHANGED

@@ -1,3 +1,470 @@
----
-license: mit
----

+---
+tags:
+- sentence-transformers
+- sentence-similarity
+- feature-extraction
+- generated_from_trainer
+- dataset_size:50000
+- loss:MultipleNegativesRankingLoss
+base_model: intfloat/e5-large-v2
+widget:
+- source_sentence: AVS Video Editor AVS Video Editor is a video editing software published
+    by Online Media Technologies Ltd. It is a part of AVS4YOU software suite which
+    includes video, audio, image editing and conversion, disc editing and burning,
+    document conversion and registry cleaner programs. It offers the opportunity to
+    create and edit videos with a vast variety of video and audio effects, text and
+    transitions; capture video from screen, web or DV cameras and VHS tape; record
+    voice; create menus for discs, as well as to save them to plenty of video file
+    formats, burn to discs or publish on Facebook, YouTube, Flickr, etc. 0
+  sentences:
+  - Adobe Premiere Rush is a video editing software that allows users to create and
+    edit videos for social media platforms such as YouTube, Instagram, and Facebook.
+    It is designed for quick and easy editing and can be accessed across multiple
+    devices. The software provides basic editing tools like trimming, cutting, transitions,
+    animations, and sound optimization. Its user-friendly interface makes it suitable
+    for beginners and requires minimal experience with video editing.
+  - H.245 is a protocol used for control signaling in videoconferencing systems. It
+    is responsible for negotiating the capabilities and settings of the system during
+    a call, such as video quality, audio codecs, and call setup. Understanding H.245
+    requires specialized knowledge of videoconferencing technologies and protocols.
+  - Coeliac disease is a long-term autoimmune disorder, primarily affecting the small
+    intestine, where individuals develop intolerance to gluten, present in foods such
+    as wheat, rye and barley. Classic symptoms include gastrointestinal problems such
+    as chronic diarrhoea, abdominal distention, malabsorption, loss of appetite, and
+    among children failure to grow normally. This often begins between six months
+    and two years of age. Non-classic symptoms are more common, especially in people
+    older than two years. There may be mild or absent gastrointestinal symptoms, a
+    wide number of symptoms involving any part of the body, or no obvious symptoms.
+    Coeliac disease was first described in childhood; however, it may develop at any
+    age. It is associated with other autoimmune diseases, such as Type 1 diabetes
+    mellitus and Hashimoto's thyroiditis, among others.
+- source_sentence: AWS App Mesh AWS App Mesh is a managed service that provides application-level
+    networking for microservices deployed on AWS or on-premises environments. It allows
+    for centrally managed traffic routing, service discovery, and observability across
+    multiple microservices. This software can help simplify the complexity of microservices
+    architectures and improve application resiliency, scalability, and performance.
+    0
+  sentences:
+  - A protocol analyzer is a tool used to capture and analyze signals and data traffic
+    over a communication channel. Such a channel varies from a local computer bus
+    to a satellite link, that provides a means of communication using a standard communication
+    protocol. Each type of communication protocol has a different tool to collect
+    and analyze signals and data.
+  - AWS App Mesh is a managed service that provides application-level networking for
+    microservices deployed on AWS or on-premises environments. It allows for centrally
+    managed traffic routing, service discovery, and observability across multiple
+    microservices. This software can help simplify the complexity of microservices
+    architectures and improve application resiliency, scalability, and performance.
+  - French is a Romance language of the Indo-European family. It descended from the
+    Vulgar Latin of the Roman Empire, as did all Romance languages. French evolved
+    from Gallo-Romance, the Latin spoken in Gaul, and more specifically in Northern
+    Gaul. Its closest relatives are the other langues d'oïl—languages historically
+    spoken in northern France and in southern Belgium, which French (Francien) largely
+    supplanted. French was also influenced by native Celtic languages of Northern
+    Roman Gaul like Gallia Belgica and by the (Germanic) Frankish language of the
+    post-Roman Frankish invaders. Today, owing to France's past overseas expansion,
+    there are numerous French-based creole languages, most notably Haitian Creole.
+    A French-speaking person or nation may be referred to as Francophone in both English
+    and French.
+- source_sentence: AVS Video Editor AVS Video Editor is a video editing software published
+    by Online Media Technologies Ltd. It is a part of AVS4YOU software suite which
+    includes video, audio, image editing and conversion, disc editing and burning,
+    document conversion and registry cleaner programs. It offers the opportunity to
+    create and edit videos with a vast variety of video and audio effects, text and
+    transitions; capture video from screen, web or DV cameras and VHS tape; record
+    voice; create menus for discs, as well as to save them to plenty of video file
+    formats, burn to discs or publish on Facebook, YouTube, Flickr, etc. 0
+  sentences:
+  - Neuropsychiatry or Organic Psychiatry is a branch of medicine that deals with
+    mental disorders attributable to diseases of the nervous system. It preceded the
+    current disciplines of psychiatry and neurology, which had common training, however,
+    psychiatry and neurology have subsequently split apart and are typically practiced
+    separately. Nevertheless, neuropsychiatry has become a growing subspecialty of
+    psychiatry and it is also closely related to the fields of neuropsychology and
+    behavioral neurology.
+  - 'Capital program management software (CPMS) refers to the systems that are currently
+    available that help building owner/operators, program managers, and construction
+    managers, control and manage the vast amount of information that capital construction
+    projects create. A collection, or portfolio of projects only makes this a bigger
+    challenge. These systems go by different names: capital project management software,
+    construction management software, project management information systems.'
+  - Video editing is the manipulation and arrangement of video shots. Video editing
+    is used to structure and present all video information, including films and television
+    shows, video advertisements and video essays. Video editing has been dramatically
+    democratized in recent years by editing software available for personal computers.
+    Editing video can be difficult and tedious, so several technologies have been
+    produced to aid people in this task. Pen based video editing software was developed
+    in order to give people a more intuitive and fast way to edit video.
+- source_sentence: AVEVA Plant SCADA Citect is now a group of industrial software
+    products sold by Aveva, but started as a software development company specialising
+    in the Automation and Control industry. The main software products developed by
+    Citect included CitectSCADA, CitectSCADA Reports, and Ampla. 0
+  sentences:
+  - A Bluetooth stack is software that refers to an implementation of the Bluetooth
+    protocol stack.
+  - Semikhah traditionally refers to the ordination of a rabbi within Judaism.
+  - Automation Studio is a circuit design, simulation and project documentation software
+    for fluid power systems and electrical projects conceived by Famic Technologies
+    Inc.. It is used for CAD, maintenance, and training purposes. Mainly used by engineers,
+    trainers, and service and maintenance personnel. Automation Studio can be applied
+    in the design, training and troubleshooting of hydraulics, pneumatics, HMI, and
+    electrical control systems.
+- source_sentence: AWS User Pools AWS User Pools is a fully managed user directory
+    service that allows application developers to easily add registration and login
+    functionality to their apps. It provides features such as multi-factor authentication,
+    password policies, social sign-in, and customizable email templates. AWS User
+    Pools allows developers to focus on building their applications while providing
+    secure and scalable user authentication and authorization. 0
+  sentences:
+  - AVS Video Editor is a video editing software published by Online Media Technologies
+    Ltd. It is a part of AVS4YOU software suite which includes video, audio, image
+    editing and conversion, disc editing and burning, document conversion and registry
+    cleaner programs. It offers the opportunity to create and edit videos with a vast
+    variety of video and audio effects, text and transitions; capture video from screen,
+    web or DV cameras and VHS tape; record voice; create menus for discs, as well
+    as to save them to plenty of video file formats, burn to discs or publish on Facebook,
+    YouTube, Flickr, etc.
+  - Oracle Waveset is an identity management system developed by Oracle Corporation.
+    It provides a centralized platform for managing user identities and access to
+    resources within an organization. It helps to streamline and automate the process
+    of user provisioning, de-provisioning, and managing access privileges. Oracle
+    Waveset also supports password management, authentication, and integration with
+    various authentication systems, such as LDAP and Active Directory. It is commonly
+    used in large-scale enterprises and organizations that require strict access control
+    and compliance with regulatory requirements.
+  - Gastritis is inflammation of the lining of the stomach. It may occur as a short
+    episode or may be of a long duration. There may be no symptoms but, when symptoms
+    are present, the most common is upper abdominal pain. Other possible symptoms
+    include nausea and vomiting, bloating, loss of appetite and heartburn. Complications
+    may include stomach bleeding, stomach ulcers, and stomach tumors. When due to
+    autoimmune problems, low red blood cells due to not enough vitamin B12 may occur,
+    a condition known as pernicious anemia.
+pipeline_tag: sentence-similarity
+library_name: sentence-transformers
+---
+# SentenceTransformer based on intfloat/e5-large-v2
+This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [intfloat/e5-large-v2](https://huggingface.co/intfloat/e5-large-v2). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+## Model Details
+### Model Description
+- **Model Type:** Sentence Transformer
+- **Base model:** [intfloat/e5-large-v2](https://huggingface.co/intfloat/e5-large-v2) <!-- at revision f169b11e22de13617baa190a028a32f3493550b6 -->
+- **Maximum Sequence Length:** 512 tokens
+- **Output Dimensionality:** 1024 dimensions
+- **Similarity Function:** Cosine Similarity
+<!-- - **Training Dataset:** Unknown -->
+<!-- - **Language:** Unknown -->
+<!-- - **License:** Unknown -->
+### Model Sources
+- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+### Full Model Architecture
+```
+SentenceTransformer(
+  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+  (2): Normalize()
+)
+```
+## Usage
+### Direct Usage (Sentence Transformers)
+First install the Sentence Transformers library:
+```bash
+pip install -U sentence-transformers
+```
+Then you can load this model and run inference.
+```python
+from sentence_transformers import SentenceTransformer
+# Download from the 🤗 Hub
+model = SentenceTransformer("sentence_transformers_model_id")
+# Run inference
+sentences = [
+    'AWS User Pools AWS User Pools is a fully managed user directory service that allows application developers to easily add registration and login functionality to their apps. It provides features such as multi-factor authentication, password policies, social sign-in, and customizable email templates. AWS User Pools allows developers to focus on building their applications while providing secure and scalable user authentication and authorization. 0',
+    'Oracle Waveset is an identity management system developed by Oracle Corporation. It provides a centralized platform for managing user identities and access to resources within an organization. It helps to streamline and automate the process of user provisioning, de-provisioning, and managing access privileges. Oracle Waveset also supports password management, authentication, and integration with various authentication systems, such as LDAP and Active Directory. It is commonly used in large-scale enterprises and organizations that require strict access control and compliance with regulatory requirements.',
+    'Gastritis is inflammation of the lining of the stomach. It may occur as a short episode or may be of a long duration. There may be no symptoms but, when symptoms are present, the most common is upper abdominal pain. Other possible symptoms include nausea and vomiting, bloating, loss of appetite and heartburn. Complications may include stomach bleeding, stomach ulcers, and stomach tumors. When due to autoimmune problems, low red blood cells due to not enough vitamin B12 may occur, a condition known as pernicious anemia.',
+]
+embeddings = model.encode(sentences)
+print(embeddings.shape)
+# (3, 1024)
+# Get the similarity scores for the embeddings
+similarities = model.similarity(embeddings, embeddings)
+print(similarities.shape)
+# torch.Size([3, 3])
+```
+<!--
+### Direct Usage (Transformers)
+<details><summary>Click to see the direct usage in Transformers</summary>
+</details>
+-->
+<!--
+### Downstream Usage (Sentence Transformers)
+You can finetune this model on your own dataset.
+<details><summary>Click to expand</summary>
+</details>
+-->
+<!--
+### Out-of-Scope Use
+*List how the model may foreseeably be misused and address what users ought not to do with the model.*
+-->
+<!--
+## Bias, Risks and Limitations
+*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+-->
+<!--
+### Recommendations
+*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+-->
+## Training Details
+### Training Dataset
+#### Unnamed Dataset
+* Size: 50,000 training samples
+* Columns: <code>sentence_0</code>, <code>sentence_1</code>, <code>sentence_2</code>, and <code>label</code>
+* Approximate statistics based on the first 1000 samples:
+  |         | sentence_0                                                                         | sentence_1                                                                         | sentence_2                                                                          | label                        |
+  |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-----------------------------|
+  | type    | string                                                                             | string                                                                             | string                                                                              | int                          |
+  | details | <ul><li>min: 36 tokens</li><li>mean: 90.1 tokens</li><li>max: 145 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 81.16 tokens</li><li>max: 202 tokens</li></ul> | <ul><li>min: 16 tokens</li><li>mean: 83.93 tokens</li><li>max: 214 tokens</li></ul> | <ul><li>1: 100.00%</li></ul> |
+* Samples:
+  | sentence_0                                                                                                                                                             | sentence_1                                                                                                                                             | sentence_2                                                                                                                                                                                                                                                                                                                                                                                                                                                               | label          |
+  |:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
+  | <code>Burroughs MCP The MCP is the proprietary operating system of the Burroughs small, medium and large systems, including the Unisys Clearpath/MCP systems. 0</code> | <code>The MCP is the proprietary operating system of the Burroughs small, medium and large systems, including the Unisys Clearpath/MCP systems.</code> | <code>Yellow fever is a viral disease of typically short duration. In most cases, symptoms include fever, chills, loss of appetite, nausea, muscle pains particularly in the back, and headaches. Symptoms typically improve within five days. In about 15% of people, within a day of improving the fever comes back, abdominal pain occurs, and liver damage begins causing yellow skin. If this occurs, the risk of bleeding and kidney problems is increased.</code> | <code>1</code> |
+  | <code>Burroughs MCP The MCP is the proprietary operating system of the Burroughs small, medium and large systems, including the Unisys Clearpath/MCP systems. 0</code> | <code>The MCP is the proprietary operating system of the Burroughs small, medium and large systems, including the Unisys Clearpath/MCP systems.</code> | <code>A wax sculpture is a depiction made using a waxy substance. Often these are effigies, usually of a notable individual, but there are also death masks and scenes with many figures, mostly in relief.</code>                                                                                                                                                                                                                                                       | <code>1</code> |
+  | <code>Burroughs MCP The MCP is the proprietary operating system of the Burroughs small, medium and large systems, including the Unisys Clearpath/MCP systems. 0</code> | <code>The MCP is the proprietary operating system of the Burroughs small, medium and large systems, including the Unisys Clearpath/MCP systems.</code> | <code>Hyperemesis gravidarum (HG) is a pregnancy complication that is characterized by severe nausea, vomiting, weight loss, and possibly dehydration. Feeling faint may also occur. It is considered more severe than morning sickness. Symptoms often get better after the 20th week of pregnancy but may last the entire pregnancy duration.</code>                                                                                                                   | <code>1</code> |
+* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+  ```json
+  {
+      "scale": 20.0,
+      "similarity_fct": "cos_sim"
+  }
+  ```
+### Training Hyperparameters
+#### Non-Default Hyperparameters
+- `num_train_epochs`: 1
+- `fp16`: True
+- `multi_dataset_batch_sampler`: round_robin
+#### All Hyperparameters
+<details><summary>Click to expand</summary>
+- `overwrite_output_dir`: False
+- `do_predict`: False
+- `eval_strategy`: no
+- `prediction_loss_only`: True
+- `per_device_train_batch_size`: 8
+- `per_device_eval_batch_size`: 8
+- `per_gpu_train_batch_size`: None
+- `per_gpu_eval_batch_size`: None
+- `gradient_accumulation_steps`: 1
+- `eval_accumulation_steps`: None
+- `torch_empty_cache_steps`: None
+- `learning_rate`: 5e-05
+- `weight_decay`: 0.0
+- `adam_beta1`: 0.9
+- `adam_beta2`: 0.999
+- `adam_epsilon`: 1e-08
+- `max_grad_norm`: 1
+- `num_train_epochs`: 1
+- `max_steps`: -1
+- `lr_scheduler_type`: linear
+- `lr_scheduler_kwargs`: {}
+- `warmup_ratio`: 0.0
+- `warmup_steps`: 0
+- `log_level`: passive
+- `log_level_replica`: warning
+- `log_on_each_node`: True
+- `logging_nan_inf_filter`: True
+- `save_safetensors`: True
+- `save_on_each_node`: False
+- `save_only_model`: False
+- `restore_callback_states_from_checkpoint`: False
+- `no_cuda`: False
+- `use_cpu`: False
+- `use_mps_device`: False
+- `seed`: 42
+- `data_seed`: None
+- `jit_mode_eval`: False
+- `use_ipex`: False
+- `bf16`: False
+- `fp16`: True
+- `fp16_opt_level`: O1
+- `half_precision_backend`: auto
+- `bf16_full_eval`: False
+- `fp16_full_eval`: False
+- `tf32`: None
+- `local_rank`: 0
+- `ddp_backend`: None
+- `tpu_num_cores`: None
+- `tpu_metrics_debug`: False
+- `debug`: []
+- `dataloader_drop_last`: False
+- `dataloader_num_workers`: 0
+- `dataloader_prefetch_factor`: None
+- `past_index`: -1
+- `disable_tqdm`: False
+- `remove_unused_columns`: True
+- `label_names`: None
+- `load_best_model_at_end`: False
+- `ignore_data_skip`: False
+- `fsdp`: []
+- `fsdp_min_num_params`: 0
+- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+- `fsdp_transformer_layer_cls_to_wrap`: None
+- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+- `deepspeed`: None
+- `label_smoothing_factor`: 0.0
+- `optim`: adamw_torch
+- `optim_args`: None
+- `adafactor`: False
+- `group_by_length`: False
+- `length_column_name`: length
+- `ddp_find_unused_parameters`: None
+- `ddp_bucket_cap_mb`: None
+- `ddp_broadcast_buffers`: False
+- `dataloader_pin_memory`: True
+- `dataloader_persistent_workers`: False
+- `skip_memory_metrics`: True
+- `use_legacy_prediction_loop`: False
+- `push_to_hub`: False
+- `resume_from_checkpoint`: None
+- `hub_model_id`: None
+- `hub_strategy`: every_save
+- `hub_private_repo`: None
+- `hub_always_push`: False
+- `gradient_checkpointing`: False
+- `gradient_checkpointing_kwargs`: None
+- `include_inputs_for_metrics`: False
+- `include_for_metrics`: []
+- `eval_do_concat_batches`: True
+- `fp16_backend`: auto
+- `push_to_hub_model_id`: None
+- `push_to_hub_organization`: None
+- `mp_parameters`:
+- `auto_find_batch_size`: False
+- `full_determinism`: False
+- `torchdynamo`: None
+- `ray_scope`: last
+- `ddp_timeout`: 1800
+- `torch_compile`: False
+- `torch_compile_backend`: None
+- `torch_compile_mode`: None
+- `include_tokens_per_second`: False
+- `include_num_input_tokens_seen`: False
+- `neftune_noise_alpha`: None
+- `optim_target_modules`: None
+- `batch_eval_metrics`: False
+- `eval_on_start`: False
+- `use_liger_kernel`: False
+- `eval_use_gather_object`: False
+- `average_tokens_across_devices`: False
+- `prompts`: None
+- `batch_sampler`: batch_sampler
+- `multi_dataset_batch_sampler`: round_robin
+</details>
+### Training Logs
+| Epoch | Step | Training Loss |
+|:-----:|:----:|:-------------:|
+| 0.08  | 500  | 0.3751        |
+| 0.16  | 1000 | 0.1414        |
+| 0.24  | 1500 | 0.1219        |
+| 0.32  | 2000 | 0.0979        |
+| 0.4   | 2500 | 0.083         |
+| 0.48  | 3000 | 0.067         |
+| 0.56  | 3500 | 0.0645        |
+| 0.64  | 4000 | 0.0578        |
+| 0.72  | 4500 | 0.0454        |
+| 0.8   | 5000 | 0.0404        |
+| 0.88  | 5500 | 0.0419        |
+| 0.96  | 6000 | 0.0402        |
+### Framework Versions
+- Python: 3.11.13
+- Sentence Transformers: 4.1.0
+- Transformers: 4.52.4
+- PyTorch: 2.6.0+cu124
+- Accelerate: 1.8.1
+- Datasets: 3.6.0
+- Tokenizers: 0.21.2
+## Citation
+### BibTeX
+#### Sentence Transformers
+```bibtex
+@inproceedings{reimers-2019-sentence-bert,
+    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+    author = "Reimers, Nils and Gurevych, Iryna",
+    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+    month = "11",
+    year = "2019",
+    publisher = "Association for Computational Linguistics",
+    url = "https://arxiv.org/abs/1908.10084",
+}
+```
+#### MultipleNegativesRankingLoss
+```bibtex
+@misc{henderson2017efficient,
+    title={Efficient Natural Language Response Suggestion for Smart Reply},
+    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+    year={2017},
+    eprint={1705.00652},
+    archivePrefix={arXiv},
+    primaryClass={cs.CL}
+}
+```
+<!--
+## Glossary
+*Clearly define terms in order to be accessible across audiences.*
+-->
+<!--
+## Model Card Authors
+*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+-->
+<!--
+## Model Card Contact
+*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+-->
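
The README above reports `MultipleNegativesRankingLoss` with `scale` 20.0 and `cos_sim` as the similarity function, trained on `sentence_0`, `sentence_1`, `sentence_2` columns. The following is a rough plain-Python sketch of how such a loss is commonly computed, assuming `sentence_0` acts as the anchor, `sentence_1` as its positive, and `sentence_2` as an extra negative; that column mapping is an assumption, and `mnr_loss` is a hypothetical helper, not the library's code. Each anchor is scored against every positive and negative in the batch, and cross-entropy rewards ranking its own positive first.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, negatives=(), scale=20.0):
    """Mean cross-entropy where anchor i should rank positive i first."""
    candidates = list(positives) + list(negatives)
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        top = max(logits)  # subtract the max for numerical stability
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy 2-dimensional embeddings (the real model uses 1024 dimensions).
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

loss_good = mnr_loss(anchors, positives, negatives)
loss_bad = mnr_loss(anchors, positives[::-1], negatives)  # positives mismatched
print(loss_good, loss_bad)
```

With correctly paired positives the loss is close to zero; swapping the positives makes it large, which is the signal the training run minimizes.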

config.json ADDED

	@@ -0,0 +1,24 @@

+{
+  "architectures": [
+    "BertModel"
+  ],
+  "attention_probs_dropout_prob": 0.1,
+  "classifier_dropout": null,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 1024,
+  "initializer_range": 0.02,
+  "intermediate_size": 4096,
+  "layer_norm_eps": 1e-12,
+  "max_position_embeddings": 512,
+  "model_type": "bert",
+  "num_attention_heads": 16,
+  "num_hidden_layers": 24,
+  "pad_token_id": 0,
+  "position_embedding_type": "absolute",
+  "torch_dtype": "float32",
+  "transformers_version": "4.52.4",
+  "type_vocab_size": 2,
+  "use_cache": true,
+  "vocab_size": 30522
+}
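
As a rough check on model size, the parameter count can be estimated from the values in this config. The sketch below assumes the standard `BertModel` layout (embeddings with LayerNorm, per-layer self-attention and feed-forward blocks with biases, and a pooler head); the weights themselves are not part of this diff, so the figure is an estimate rather than a measured value, and `bert_param_count` is a hypothetical helper.

```python
def bert_param_count(vocab_size, hidden, layers, intermediate,
                     max_positions, type_vocab):
    """Estimate parameters of a standard BERT encoder with pooler head."""
    layer_norm = 2 * hidden  # weight + bias
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value, and output projections, each with a bias.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Up-projection and down-projection, each with a bias.
    feed_forward = ((hidden * intermediate + intermediate)
                    + (intermediate * hidden + hidden)
                    + layer_norm)
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + feed_forward) + pooler

# Values taken from config.json above.
total = bert_param_count(
    vocab_size=30522, hidden=1024, layers=24,
    intermediate=4096, max_positions=512, type_vocab=2,
)
print(f"{total:,} parameters, about {total * 4 / 1e9:.2f} GB in float32")
```

Under these assumptions the estimate is roughly 335 million parameters, which at 4 bytes each (`torch_dtype` is `float32`) is about 1.3 GB.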

config_sentence_transformers.json ADDED

	@@ -0,0 +1,10 @@

+{
+  "__version__": {
+    "sentence_transformers": "4.1.0",
+    "transformers": "4.52.4",
+    "pytorch": "2.6.0+cu124"
+  },
+  "prompts": {},
+  "default_prompt_name": null,
+  "similarity_fn_name": "cosine"
+}

modules.json ADDED

	@@ -0,0 +1,20 @@

+[
+  {
+    "idx": 0,
+    "name": "0",
+    "path": "",
+    "type": "sentence_transformers.models.Transformer"
+  },
+  {
+    "idx": 1,
+    "name": "1",
+    "path": "1_Pooling",
+    "type": "sentence_transformers.models.Pooling"
+  },
+  {
+    "idx": 2,
+    "name": "2",
+    "path": "2_Normalize",
+    "type": "sentence_transformers.models.Normalize"
+  }
+]
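
`modules.json` lists the stages of the embedding pipeline in order: a Transformer, then Pooling, then Normalize. The sketch below parses the same JSON with the standard library and prints the stage order together with the folder each stage is read from. It is only an illustration of the file's structure; treating an empty `path` as the repository root is an assumption about how the loader interprets it.

```python
import json

# Contents of modules.json from this commit.
MODULES_JSON = """
[
  {"idx": 0, "name": "0", "path": "",
   "type": "sentence_transformers.models.Transformer"},
  {"idx": 1, "name": "1", "path": "1_Pooling",
   "type": "sentence_transformers.models.Pooling"},
  {"idx": 2, "name": "2", "path": "2_Normalize",
   "type": "sentence_transformers.models.Normalize"}
]
"""

modules = sorted(json.loads(MODULES_JSON), key=lambda m: m["idx"])

# Keep the short class name and the folder; "." stands for the repo root
# (assumed meaning of the empty path).
pipeline = [(m["type"].rsplit(".", 1)[-1], m["path"] or ".") for m in modules]

for step, (class_name, folder) in enumerate(pipeline):
    print(f"{step}: {class_name:<12} <- {folder}")
```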

sentence_bert_config.json ADDED

	@@ -0,0 +1,4 @@

+{
+  "max_seq_length": 512,
+  "do_lower_case": false
+}

special_tokens_map.json ADDED

	@@ -0,0 +1,37 @@

+{
+  "cls_token": {
+    "content": "[CLS]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "mask_token": {
+    "content": "[MASK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "[PAD]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "sep_token": {
+    "content": "[SEP]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "[UNK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer_config.json ADDED

	@@ -0,0 +1,56 @@

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "100": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "101": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "102": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "103": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "clean_up_tokenization_spaces": true,
+  "cls_token": "[CLS]",
+  "do_lower_case": true,
+  "extra_special_tokens": {},
+  "mask_token": "[MASK]",
+  "model_max_length": 512,
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "strip_accents": null,
+  "tokenize_chinese_chars": true,
+  "tokenizer_class": "BertTokenizer",
+  "unk_token": "[UNK]"
+}

vocab.txt ADDED

The diff for this file is too large to render. See raw diff