Sentence Similarity
sentence-transformers
Safetensors
bert
feature-extraction
Generated from Trainer
dataset_size:1022
loss:MultipleNegativesRankingLoss
text-embeddings-inference
Instructions to use G-UDS/disaster_ko-bert with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use G-UDS/disaster_ko-bert with sentence-transformers:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("G-UDS/disaster_ko-bert")

sentences = [
    "토목섬유튜브로 보강한 철도 교대 접속부 구조의 장기안정성 평가",
    "A Study on Mechanism of Fire Spread between Rooms",
    "Assessement of Long Term Stability of Railway Bridge Abutment Using Geosynthetics Tube",
    "Analysis on Reliability for the Storm Sewer considering Sedimentation"
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
- Notebooks
- Google Colab
- Kaggle
Upload 11 files
- 1_Pooling/config.json +10 -0
- README.md +357 -3
- config.json +26 -0
- config_sentence_transformers.json +10 -0
- model.safetensors +3 -0
- modules.json +20 -0
- sentence_bert_config.json +4 -0
- special_tokens_map.json +37 -0
- tokenizer.json +0 -0
- tokenizer_config.json +65 -0
- vocab.txt +0 -0
1_Pooling/config.json
ADDED
@@ -0,0 +1,10 @@
+{
+  "word_embedding_dimension": 384,
+  "pooling_mode_cls_token": false,
+  "pooling_mode_mean_tokens": true,
+  "pooling_mode_max_tokens": false,
+  "pooling_mode_mean_sqrt_len_tokens": false,
+  "pooling_mode_weightedmean_tokens": false,
+  "pooling_mode_lasttoken": false,
+  "include_prompt": true
+}
README.md
CHANGED
@@ -1,3 +1,357 @@
----
-
---
+---
+tags:
+- sentence-transformers
+- sentence-similarity
+- feature-extraction
+- generated_from_trainer
+- dataset_size:1022
+- loss:MultipleNegativesRankingLoss
+base_model: sentence-transformers/all-MiniLM-L6-v2
+widget:
+- source_sentence: 토목섬유튜브로 보강한 철도 교대 접속부 구조의 장기안정성 평가
+  sentences:
+  - A Study on Mechanism of Fire Spread between Rooms
+  - Assessement of Long Term Stability of Railway Bridge Abutment Using Geosynthetics
+    Tube
+  - Analysis on Reliability for the Storm Sewer considering Sedimentation
+- source_sentence: 진동측정에 따른 한옥 건축물의 고유주기
+  sentences:
+  - R&D Capability Analysis of Domestic Fire-fighting Safety and Rescue Research Program
+  - Arrangements of Rail Accident Command Structure, Roles and Responsibilities for
+    Infrastructure Manager and Train Undertakings
+  - Fundamental Period Formulas for The Korean-style House Using Ambient Vibration
+- source_sentence: 산악트램 객실 쾌적성 향상을 위한 저진동 랙앤피니언 추진장치 개발
+  sentences:
+  - Development of a Low Vibration Rack&Pinion Traction System for More Comfortable
+    Cabin on Mountain Tram
+  - Risk Assessment of Heavy Snowfall Using PROMETHEE - The Case of Gangwon Province
+    -
+  - Comparison of Selection Methods for Proxy Variables on Flood Vulnerability Analysis
+    in South Korea and Thailand
+- source_sentence: 옥상녹화의 수문학적 성능평가에 따른 최적 토양층 깊이 산정 연구
+  sentences:
+  - Field Measurements for Subgrade Compaction Using MEMS Accelerometers
+  - Study for Estimation of Optimal Soil Layer Depth according to the Evaluation of
+    Green Roof Hydrological Performance
+  - Toxicity Factor Analysis through the Exposure Experiment of the Combustion Products
+    on Wood-Based Materials
+- source_sentence: 합성 나무류의 연소특성에 관한 연구
+  sentences:
+  - Study on Combustion Characteristics of Composite Wood Flow
+  - Induction Waterway Review by Debris Flow's Characteristics
+  - Slope Stability Analysis on Unsaturated Soil by Probable Rainfall Intensity
+pipeline_tag: sentence-similarity
+library_name: sentence-transformers
+---
+
+# SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
+
+This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+## Model Details
+
+### Model Description
+- **Model Type:** Sentence Transformer
+- **Base model:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) <!-- at revision fa97f6e7cb1a59073dff9e6b13e2715cf7475ac9 -->
+- **Maximum Sequence Length:** 256 tokens
+- **Output Dimensionality:** 384 dimensions
+- **Similarity Function:** Cosine Similarity
+<!-- - **Training Dataset:** Unknown -->
+<!-- - **Language:** Unknown -->
+<!-- - **License:** Unknown -->
+
+### Model Sources
+
+- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+### Full Model Architecture
+
+```
+SentenceTransformer(
+  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
+  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+  (2): Normalize()
+)
+```
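The architecture listing describes three stages: a Transformer that emits one vector per token, a Pooling module with only `pooling_mode_mean_tokens` enabled, and a final `Normalize()`. The sketch below illustrates the last two stages on toy data in plain Python. The helper names `mean_pool` and `l2_normalize` are hypothetical, and the 2-dimensional vectors stand in for the real 384-dimensional token outputs; this is not the library's implementation.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1, so padding is ignored."""
    dim = len(token_vectors[0])
    kept = [vec for vec, m in zip(token_vectors, attention_mask) if m == 1]
    if not kept:
        return [0.0] * dim
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit length, mirroring the role of Normalize()."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
```

Because the output has unit length, the cosine similarity between two such embeddings reduces to their dot product.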
+
+## Usage
+
+### Direct Usage (Sentence Transformers)
+
+First install the Sentence Transformers library:
+
+```bash
+pip install -U sentence-transformers
+```
+
+Then you can load this model and run inference.
+```python
+from sentence_transformers import SentenceTransformer
+
+# Download from the 🤗 Hub
+model = SentenceTransformer("G-UDS/disaster_ko-bert")
+# Run inference
+sentences = [
+    '합성 나무류의 연소특성에 관한 연구',
+    'Study on Combustion Characteristics of Composite Wood Flow',
+    'Slope Stability Analysis on Unsaturated Soil by Probable Rainfall Intensity',
+]
+embeddings = model.encode(sentences)
+print(embeddings.shape)
+# [3, 384]
+
+# Get the similarity scores for the embeddings
+similarities = model.similarity(embeddings, embeddings)
+print(similarities.shape)
+# [3, 3]
+```
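The usage snippet stops at the shape of the similarity matrix. A common next step is to rank candidate titles against one query by cosine similarity. The sketch below does this on toy 3-dimensional vectors in plain Python; in practice the vectors would come from `model.encode`. The helper names `cosine` and `rank_candidates` are hypothetical and the numbers are made up for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_candidates(query_vec, candidate_vecs):
    """Return (candidate index, score) pairs, best match first."""
    scored = [(idx, cosine(query_vec, vec)) for idx, vec in enumerate(candidate_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy stand-ins for one query embedding and three candidate embeddings.
query = [0.9, 0.1, 0.0]
candidates = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.5, 0.5, 0.0]]

ranking = rank_candidates(query, candidates)
for idx, score in ranking:
    print(idx, round(score, 3))
```

With real embeddings, the query might be a Korean title and the candidates English titles, matching the pairs shown in the widget examples.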
+
+<!--
+### Direct Usage (Transformers)
+
+<details><summary>Click to see the direct usage in Transformers</summary>
+
+</details>
+-->
+
+<!--
+### Downstream Usage (Sentence Transformers)
+
+You can finetune this model on your own dataset.
+
+<details><summary>Click to expand</summary>
+
+</details>
+-->
+
+<!--
+### Out-of-Scope Use
+
+*List how the model may foreseeably be misused and address what users ought not to do with the model.*
+-->
+
+<!--
+## Bias, Risks and Limitations
+
+*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+-->
+
+<!--
+### Recommendations
+
+*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+-->
+
+## Training Details
+
+### Training Dataset
+
+#### Unnamed Dataset
+
+
+* Size: 1,022 training samples
+* Columns: <code>sentence_0</code> and <code>sentence_1</code>
+* Approximate statistics based on the first 1000 samples:
+  | | sentence_0 | sentence_1 |
+  |:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
+  | type | string | string |
+  | details | <ul><li>min: 3 tokens</li><li>mean: 54.35 tokens</li><li>max: 156 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 18.11 tokens</li><li>max: 58 tokens</li></ul> |
+* Samples:
+  | sentence_0 | sentence_1 |
+  |:---------------------------------------------|:-----------------------------------------------------------------------------------------------------|
+  | <code>재해지도 활용성 증대를 위한 빅데이터 구축 및 적용 방안</code> | <code>Building and Applying Scheme of Big Data for Enhancement of Hazard Map Utilization</code> |
+  | <code>강우의 간헐성이 크리깅에 미치는 영향 평가</code> | <code>Evaluation of Rainfall Intermittency on the Simple Kriging</code> |
+  | <code>토석류 발생지역의 지형적 특성을 고려한 위험도 분석</code> | <code>Risk Analysis Considering the Topography Characteristics of Debris Flow Occurrence Area</code> |
+* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+  ```json
+  {
+      "scale": 20.0,
+      "similarity_fct": "cos_sim"
+  }
+  ```
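The loss parameters above (`scale` of 20.0 and `cos_sim`) can be read as follows: within a batch of (`sentence_0`, `sentence_1`) pairs, every other pair's `sentence_1` acts as a negative for a given `sentence_0`, and the scaled cosine similarities are fed to a cross-entropy whose correct class is the matching pair. The sketch below is a plain-Python illustration of that idea on toy 2-dimensional vectors. The function name `mnr_loss` is hypothetical and this is not the library's code.

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]
    and every other positive in the batch serves as an in-batch negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, pos) for pos in positives]
        peak = max(logits)  # subtract the maximum for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each anchor matches its own positive
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each anchor matches the other positive

loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

The aligned batch yields a loss near zero and the swapped batch a loss near the scale value, which shows how the scale sharpens the penalty for ranking a wrong pair first.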
+
+### Training Hyperparameters
+#### Non-Default Hyperparameters
+
+- `per_device_train_batch_size`: 16
+- `per_device_eval_batch_size`: 16
+- `multi_dataset_batch_sampler`: round_robin
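The card reports 1,022 training samples, a per-device batch size of 16, 3 epochs, a learning rate of 5e-05, and a linear scheduler with no warmup. The sketch below derives the implied number of optimizer steps and the learning rate at a given step. It assumes a single device, no gradient accumulation, and that the final partial batch is kept; the variable and function names are hypothetical.

```python
import math

# Values taken from the model card.
dataset_size = 1022
batch_size = 16
num_epochs = 3
base_lr = 5e-05

# Assumption: one device, gradient accumulation of 1, last partial batch kept.
steps_per_epoch = math.ceil(dataset_size / batch_size)
total_steps = steps_per_epoch * num_epochs

def linear_lr(step):
    """Linear decay from base_lr at step 0 to zero at total_steps, without warmup."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)

print(steps_per_epoch, total_steps, linear_lr(total_steps // 2))
```

Under these assumptions training runs for 64 steps per epoch and 192 steps in total, so each in-batch ranking problem has at most 15 negatives per anchor.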
+
+#### All Hyperparameters
+<details><summary>Click to expand</summary>
+
+- `overwrite_output_dir`: False
+- `do_predict`: False
+- `eval_strategy`: no
+- `prediction_loss_only`: True
+- `per_device_train_batch_size`: 16
+- `per_device_eval_batch_size`: 16
+- `per_gpu_train_batch_size`: None
+- `per_gpu_eval_batch_size`: None
+- `gradient_accumulation_steps`: 1
+- `eval_accumulation_steps`: None
+- `torch_empty_cache_steps`: None
+- `learning_rate`: 5e-05
+- `weight_decay`: 0.0
+- `adam_beta1`: 0.9
+- `adam_beta2`: 0.999
+- `adam_epsilon`: 1e-08
+- `max_grad_norm`: 1
+- `num_train_epochs`: 3
+- `max_steps`: -1
+- `lr_scheduler_type`: linear
+- `lr_scheduler_kwargs`: {}
+- `warmup_ratio`: 0.0
+- `warmup_steps`: 0
+- `log_level`: passive
+- `log_level_replica`: warning
+- `log_on_each_node`: True
+- `logging_nan_inf_filter`: True
+- `save_safetensors`: True
+- `save_on_each_node`: False
+- `save_only_model`: False
+- `restore_callback_states_from_checkpoint`: False
+- `no_cuda`: False
+- `use_cpu`: False
+- `use_mps_device`: False
+- `seed`: 42
+- `data_seed`: None
+- `jit_mode_eval`: False
+- `use_ipex`: False
+- `bf16`: False
+- `fp16`: False
+- `fp16_opt_level`: O1
+- `half_precision_backend`: auto
+- `bf16_full_eval`: False
+- `fp16_full_eval`: False
+- `tf32`: None
+- `local_rank`: 0
+- `ddp_backend`: None
+- `tpu_num_cores`: None
+- `tpu_metrics_debug`: False
+- `debug`: []
+- `dataloader_drop_last`: False
+- `dataloader_num_workers`: 0
+- `dataloader_prefetch_factor`: None
+- `past_index`: -1
+- `disable_tqdm`: False
+- `remove_unused_columns`: True
+- `label_names`: None
+- `load_best_model_at_end`: False
+- `ignore_data_skip`: False
+- `fsdp`: []
+- `fsdp_min_num_params`: 0
+- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+- `fsdp_transformer_layer_cls_to_wrap`: None
+- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+- `deepspeed`: None
+- `label_smoothing_factor`: 0.0
+- `optim`: adamw_torch
+- `optim_args`: None
+- `adafactor`: False
+- `group_by_length`: False
+- `length_column_name`: length
+- `ddp_find_unused_parameters`: None
+- `ddp_bucket_cap_mb`: None
+- `ddp_broadcast_buffers`: False
+- `dataloader_pin_memory`: True
+- `dataloader_persistent_workers`: False
+- `skip_memory_metrics`: True
+- `use_legacy_prediction_loop`: False
+- `push_to_hub`: False
+- `resume_from_checkpoint`: None
+- `hub_model_id`: None
+- `hub_strategy`: every_save
+- `hub_private_repo`: None
+- `hub_always_push`: False
+- `gradient_checkpointing`: False
+- `gradient_checkpointing_kwargs`: None
+- `include_inputs_for_metrics`: False
+- `include_for_metrics`: []
+- `eval_do_concat_batches`: True
+- `fp16_backend`: auto
+- `push_to_hub_model_id`: None
+- `push_to_hub_organization`: None
+- `mp_parameters`:
+- `auto_find_batch_size`: False
+- `full_determinism`: False
+- `torchdynamo`: None
+- `ray_scope`: last
+- `ddp_timeout`: 1800
+- `torch_compile`: False
+- `torch_compile_backend`: None
+- `torch_compile_mode`: None
+- `dispatch_batches`: None
+- `split_batches`: None
+- `include_tokens_per_second`: False
+- `include_num_input_tokens_seen`: False
+- `neftune_noise_alpha`: None
+- `optim_target_modules`: None
+- `batch_eval_metrics`: False
+- `eval_on_start`: False
+- `use_liger_kernel`: False
+- `eval_use_gather_object`: False
+- `average_tokens_across_devices`: False
+- `prompts`: None
+- `batch_sampler`: batch_sampler
+- `multi_dataset_batch_sampler`: round_robin
+
+</details>
+
+### Framework Versions
+- Python: 3.9.13
+- Sentence Transformers: 3.3.1
+- Transformers: 4.47.1
+- PyTorch: 2.5.1
+- Accelerate: 1.2.1
+- Datasets: 3.2.0
+- Tokenizers: 0.21.0
+
+## Citation
+
+### BibTeX
+
+#### Sentence Transformers
+```bibtex
+@inproceedings{reimers-2019-sentence-bert,
+    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+    author = "Reimers, Nils and Gurevych, Iryna",
+    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+    month = "11",
+    year = "2019",
+    publisher = "Association for Computational Linguistics",
+    url = "https://arxiv.org/abs/1908.10084",
+}
+```
+
+#### MultipleNegativesRankingLoss
+```bibtex
+@misc{henderson2017efficient,
+    title={Efficient Natural Language Response Suggestion for Smart Reply},
+    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+    year={2017},
+    eprint={1705.00652},
+    archivePrefix={arXiv},
+    primaryClass={cs.CL}
+}
+```
+
+<!--
+## Glossary
+
+*Clearly define terms in order to be accessible across audiences.*
+-->
+
+<!--
+## Model Card Authors
+
+*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+-->
+
+<!--
+## Model Card Contact
+
+*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+-->
config.json
ADDED
@@ -0,0 +1,26 @@
+{
+  "_name_or_path": "sentence-transformers/all-MiniLM-L6-v2",
+  "architectures": [
+    "BertModel"
+  ],
+  "attention_probs_dropout_prob": 0.1,
+  "classifier_dropout": null,
+  "gradient_checkpointing": false,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 384,
+  "initializer_range": 0.02,
+  "intermediate_size": 1536,
+  "layer_norm_eps": 1e-12,
+  "max_position_embeddings": 512,
+  "model_type": "bert",
+  "num_attention_heads": 12,
+  "num_hidden_layers": 6,
+  "pad_token_id": 0,
+  "position_embedding_type": "absolute",
+  "torch_dtype": "float32",
+  "transformers_version": "4.47.1",
+  "type_vocab_size": 2,
+  "use_cache": true,
+  "vocab_size": 30522
+}
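The sizes in `config.json` are enough to estimate the parameter count and compare it with the `model.safetensors` size listed in this commit. The sketch below assumes a standard BERT encoder layout: word, position and token-type embeddings with a LayerNorm, six layers each holding four attention projections and a two-layer feed-forward block with biases and two LayerNorms, plus a pooler. The helper names are hypothetical, and the result is an estimate, not a value read from the checkpoint.

```python
# Values copied from config.json in this commit.
config = {
    "hidden_size": 384,
    "intermediate_size": 1536,
    "num_hidden_layers": 6,
    "max_position_embeddings": 512,
    "type_vocab_size": 2,
    "vocab_size": 30522,
}

def linear_params(n_in, n_out):
    """Weights plus bias of a dense layer."""
    return n_in * n_out + n_out

def layer_norm_params(size):
    """Scale and shift vectors of a LayerNorm."""
    return 2 * size

def count_bert_parameters(cfg):
    """Estimate the parameter count under the assumed standard BERT layout."""
    h = cfg["hidden_size"]
    ffn = cfg["intermediate_size"]
    embeddings = (
        cfg["vocab_size"] * h
        + cfg["max_position_embeddings"] * h
        + cfg["type_vocab_size"] * h
        + layer_norm_params(h)
    )
    per_layer = (
        4 * linear_params(h, h)    # query, key, value, attention output
        + layer_norm_params(h)     # LayerNorm after attention
        + linear_params(h, ffn)    # feed-forward expansion
        + linear_params(ffn, h)    # feed-forward projection
        + layer_norm_params(h)     # LayerNorm after feed-forward
    )
    pooler = linear_params(h, h)   # assumption: the pooler is stored too
    return embeddings + cfg["num_hidden_layers"] * per_layer + pooler

total_params = count_bert_parameters(config)
float32_bytes = total_params * 4
print(total_params, float32_bytes)
```

Under these assumptions the estimate is 22,713,216 parameters, or 90,852,864 bytes in float32. That is slightly below the 90,864,192 bytes recorded for `model.safetensors`, and the small remainder would be consistent with file metadata.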
config_sentence_transformers.json
ADDED
@@ -0,0 +1,10 @@
+{
+  "__version__": {
+    "sentence_transformers": "3.3.1",
+    "transformers": "4.47.1",
+    "pytorch": "2.5.1"
+  },
+  "prompts": {},
+  "default_prompt_name": null,
+  "similarity_fn_name": "cosine"
+}
model.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aaba0377a21ba2e77c3c9321916aa2b4cf071cb5d116f1814a2724319a0b9287
+size 90864192
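The three lines added for `model.safetensors` are a Git LFS pointer, not the weights themselves: each line is a key followed by a space and a value. The sketch below parses such a pointer into a dictionary. The function name `parse_lfs_pointer` and the output field names are hypothetical.

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }

# Pointer contents copied from this commit.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:aaba0377a21ba2e77c3c9321916aa2b4cf071cb5d116f1814a2724319a0b9287
size 90864192
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algorithm"], pointer["size_bytes"])
```

The digest and size can be used to check a downloaded copy of the weights against what the commit recorded.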
modules.json
ADDED
@@ -0,0 +1,20 @@
+[
+  {
+    "idx": 0,
+    "name": "0",
+    "path": "",
+    "type": "sentence_transformers.models.Transformer"
+  },
+  {
+    "idx": 1,
+    "name": "1",
+    "path": "1_Pooling",
+    "type": "sentence_transformers.models.Pooling"
+  },
+  {
+    "idx": 2,
+    "name": "2",
+    "path": "2_Normalize",
+    "type": "sentence_transformers.models.Normalize"
+  }
+]
sentence_bert_config.json
ADDED
@@ -0,0 +1,4 @@
+{
+  "max_seq_length": 256,
+  "do_lower_case": false
+}
special_tokens_map.json
ADDED
@@ -0,0 +1,37 @@
+{
+  "cls_token": {
+    "content": "[CLS]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "mask_token": {
+    "content": "[MASK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "[PAD]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "sep_token": {
+    "content": "[SEP]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "[UNK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}
tokenizer.json
ADDED
The diff for this file is too large to render.
tokenizer_config.json
ADDED
@@ -0,0 +1,65 @@
+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "100": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "101": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "102": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "103": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "clean_up_tokenization_spaces": false,
+  "cls_token": "[CLS]",
+  "do_basic_tokenize": true,
+  "do_lower_case": true,
+  "extra_special_tokens": {},
+  "mask_token": "[MASK]",
+  "max_length": 128,
+  "model_max_length": 256,
+  "never_split": null,
+  "pad_to_multiple_of": null,
+  "pad_token": "[PAD]",
+  "pad_token_type_id": 0,
+  "padding_side": "right",
+  "sep_token": "[SEP]",
+  "stride": 0,
+  "strip_accents": null,
+  "tokenize_chinese_chars": true,
+  "tokenizer_class": "BertTokenizer",
+  "truncation_side": "right",
+  "truncation_strategy": "longest_first",
+  "unk_token": "[UNK]"
+}
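The tokenizer settings above give the special token IDs (`[PAD]` 0, `[CLS]` 101, `[SEP]` 102), a `model_max_length` of 256, and right-side truncation and padding. The sketch below shows how a single sequence of token IDs would be framed under those settings. The function name `build_inputs` and the placeholder token IDs are hypothetical, and this is an illustration, not the tokenizer's own code.

```python
# IDs and limits taken from tokenizer_config.json in this commit.
PAD_ID, CLS_ID, SEP_ID = 0, 101, 102
MODEL_MAX_LENGTH = 256

def build_inputs(token_ids, max_length=MODEL_MAX_LENGTH, pad_to=None):
    """Wrap token IDs in [CLS] ... [SEP], truncating and padding on the right."""
    body = token_ids[: max_length - 2]  # leave room for the two special tokens
    input_ids = [CLS_ID] + body + [SEP_ID]
    attention_mask = [1] * len(input_ids)
    if pad_to is not None and len(input_ids) < pad_to:
        padding = pad_to - len(input_ids)
        input_ids += [PAD_ID] * padding
        attention_mask += [0] * padding
    return input_ids, attention_mask

# A long sequence is cut to 256 positions; a short one is padded to the batch length.
long_ids, long_mask = build_inputs(list(range(1000, 1300)))
short_ids, short_mask = build_inputs([2000, 2001, 2002], pad_to=8)
print(len(long_ids), short_ids, short_mask)
```

The attention mask produced here is what lets mean pooling skip padded positions when sentences of different lengths share a batch.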
vocab.txt
ADDED
The diff for this file is too large to render.