pilllll committed on
Commit 9a3f42e · verified · 1 Parent(s): 39b5299

Upload 8 files

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,413 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - dense
+ - generated_from_trainer
+ - dataset_size:132830
+ - loss:MultipleNegativesRankingLoss
+ base_model: intfloat/multilingual-e5-large
+ widget:
+ - source_sentence: 'query: 伤科方'
+   sentences:
+   - 'passage: title: 骨伤科效方集 author: Gengmin Tang category: Orthopedics, Medicine formulae,
+     receipts, prescriptions, 伤科方 description: '
+   - 'passage: title: พูดด้วยภาพ 2 : เทคนิคทำสไลด์เป็นภาพง่าย ๆ ใน 2 ขั้นตอน author:
+     สุธาพร ล้ำเลิศกุล. category: Microsoft PowerPoint (Computer file), Presentation
+     graphics software, Business presentations, การออกแบบกราฟิก description: จบปัญหา
+     "ไม่มีเวลา" และ "ไม่มีเทคนิค" ในการทำสไลด์ หนังสือ "พูดด้วยภาพ 2 : ทำสไลด์เป็นภาพง่าย
+     ๆ ใน 2 ขั้นตอน" เล่มนี้ จะสอนให้คุณคิดและทำสไลด์อย่างมีระบบใน 2 ขั้นตอน โดยคุณสามารถเลือกเรียนรู้เฉพาะบท
+     และลงมือทำได้แบบไม่จำเป็นต้องอ่านตั้งแต่ต้นจนจบ ย่อยข้อมูล "ยาก" ให้เป็น "ภาพ"
+     ที่เข้าใจง่าย พร้อม Link Youtube Video สอนในเล่ม ลด ขั้นตอน เพิ่ม ความแตกต่าง
+     ทำสไลด์ให้ สนุก สวยงาม และสื่อสารให้เกิดประโยชน์สูงสุดแก่ผู้ฟัง ตามแบบฉบับของ
+     "BetterPitch" สถาบันสอนการทำสไลด์ในองค์กรชั้นนำทั่วประเทศ!'
+   - 'passage: title: 福慧之道 author: Yinai Sun category: Happiness, Well-being, Conduct
+     of life, Human comfort, Bonheur, Bien-être, Morale pratique, ethics (philosophical
+     concept), comfort (sensation), Fo jiao Ren sheng zhe xue Tong su du wu description:
+     Ben shu shi dui zheng ge zhong hua wen hua de zong jie, jiang shu ji fu ji hui
+     de fang fa. nei rong bao gua : fu mai yu hui mai : ren sheng de xing fu er mai
+     ; ru he jie fu hui er mai ; cai fu fu tian ; zhi hui fu tian ; fu tian fa ze ;
+     ri xing yi shan ; fu hui ren sheng'
+ - source_sentence: 'query: แนะนำหนังสือการจัดการธุรกิจ'
+   sentences:
+   - 'passage: title: กุญแจ 5 ดอก ขจัดข้อขัดแย้งในองค์งาน author: ปรีชา ทิวะหุต. category:
+     การจัดการธุรกิจ, การจัดองค์การ description: '
+   - 'passage: title: มานุษยวิทยากายภาพ : วิวัฒนาการทางกายภาพและวัฒนธรรม author: งามพิศ
+     สัตย์สงวน category: มานุษยวิทยา, มนุษย์กับวัฒนธรรม, มนุษยวิทยากายภาพ description: '
+   - 'passage: title: พีระพงศ์อนุสรณ์ author: N/A category: Birabhongse Kasemsri, M.L.,
+     1935-2000, Diplomats Thailand Biography, ชีวประวัติ, หนังสืออนุสรณ์งานศพ description:
+     ในวาระครบ 5 ปีแห่งการถึงแก่อนิจกรรมของหม่อมหลวงพีระพงศ์ เกษมศรี ครอบครัวของหม่อมหลวงพีระพงศ์ฯ
+     ได้จัดทำหนังสือ "พีระพงศ์อนุสรณ์" เป็นเครื่องสำนึกถึงชีวิตและงานของหม่อมหลวงพีระพงศ์ฯ
+     จุดมุ่งประสงค์เหนือสิ่งอื่นใดของหนังสือนี้ ก็เพื่อเป็นอนุสรณ์ถึงความจงรักภักดีอุทิศตนถวายของหม่อมหลวงพีระพงศ์ฯ
+     ต่อสถาบันพระมหากษัตริย์ ต่อพระบรมราชจักรีวงศ์ และต่อองคืพระบาทสมเด็จพระเจ้าอยู่หัวและสมเด็จพระนางเจ้าฯพระบรมราชินีนาถ
+     ตลอดจนพระบรมวงศานุวงศ์ทุกพระองค์'
+ - source_sentence: 'query: เริ่มต้นManipulation, Orthopedicควรอ่านอะไร'
+   sentences:
+   - 'passage: title: 實用筋膜操作指引 = A practical guide to fascial manipulation author:
+     盧奧馬拉 (Luomala, Tuulia), 文字作者 category: Manipulation (Therapeutics), Fasciae (Anatomy),
+     Manipulation, Orthopedic, Fascia, Manipulation (Thérapeutique), Ji jin mo fang
+     song shu description: '
+   - 'passage: title: Opioid sensitivity of chronic noncancer pain author: Eija Kalso
+     category: Opioids Therapeutic use Congresses, Chronic pain Chemotherapy Congresses,
+     Opioids Receptors Congresses, Pain drug therapy, Analgesics, Opioid therapeutic
+     use, Chronic Disease drug therapy, Receptors, Opioid physiology, Douleur chronique
+     Chimiothérapie Congrès, Opioïdes Emploi en thérapeutique Congrès, Opioïdes
+     Récepteurs Congrès, Opioids Receptors, Opioids Therapeutic use, Analgésiques
+     morphiniques usage thérapeutique, Maladie chronique traitement médicamenteux,
+     Récepteur endorphine, Chronischer Schmerz, Opioide, Kongress, Opiatrezeptor,
+     Analgesie, Opiate, Congress, Conference papers and proceedings, Actes de congrès
+     description: Contains papers from the first international research symposium of
+     the International Association for the Study of Pain, held in Helsinki, Finland,
+     Fall 1998. Focus is on opioid responsiveness to neuropathic pain. Papers are arranged
+     in sections on function and dysfunction of opioid receptors, clinical pharmacology
+     of opioids, understanding and improving opioid sensitivity, and opioid sensitivity
+     of different chronic pain states. Specific topics include targeting of opioid
+     receptors to presynaptic sites, route of opioid administration, phenotypic changes
+     induced in dorsal root ganglion neurons by nerve injury, and opioids in headache.
+     Kalso is currently affiliated with the Karolinska Institute in Sweden. IASP member
+     price, $44.85. Annotation copyrighted by Book News, Inc., Portland, OR'
+   - 'passage: title: พลิกคัมภีร์ตีแตกเศรษฐกิจไทย = Thailand''s economic outlook 2009
+     author: วีระศักดิ์ พงศ์อักษร. category: ปัญหาเศรษฐกิจ ไทย, ไทย ภาวะเศรษฐกิจ, ไทย
+     ภาวะสังคม description: '
+ - source_sentence: 'query: เริ่มต้นทางรถไฟ ไทย กาญจนบุรีควรอ่านอะไร'
+   sentences:
+   - 'passage: title: คู่มือคำศัพท์ช่วยเหลือนักท่องเที่ยวเบื้องต้น (ภาษาจีน) พร้อมภาพประกอบ
+     author: ชัยพันธุ์ สิทธิสุวรรณกุล category: คำศัพท์, ภาษาจีน คู่มือ, นักท่องเที่ยว
+     description: '
+   - 'passage: title: ทางรถไฟสายมรณะ author: N/A category: ทางรถไฟ ไทย กาญจนบุรี description: '
+   - 'passage: title: ยุทธศาสตร์ชาติว่าด้วยการป้องกันและปราบปรามการทุจริต ระยะที่ 3
+     (พ.ศ. 2560-2564) author: คณะกรรมการป้องกันและปราบปรามการทุจริตแห่งชาติ category:
+     การทุจริตและประพฤติมิชอบ ไทย, การทุจริตและประพฤติมิชอบในวงราชการ ไทย ยุทธศาสตร์,
+     ยุทธศาสตร์, การฉ้อราษฎร์บังหลวง ไทย การป้องกัน description: '
+ - source_sentence: 'query: หนังสือนิทาน'
+   sentences:
+   - 'passage: title: เด็กหญิงข้าวเปลือก author: หยาดฝน ธัญโชติกานต์. category: นิทาน
+     description: '
+   - 'passage: title: Current drug discovery technologies author: N/A category: Drugs
+     Design Periodicals, Pharmaceutical technology Periodicals, Drug Design, Technology,
+     Pharmaceutical, Drugs Design, Pharmaceutical technology, Periodicals description: '
+   - 'passage: title: 汉语词汇・句法・语音的相互关联 : 第二届肯特岗国际汉语语言学圆桌会议论文集 = Interface in Chinese
+     : morphology, syntax and phonetics author: Kent Ridge International Roundtable
+     Conference on Chinese linguistics category: Chinese language Grammar Congresses,
+     Chinese language Congresses, Chinois (Langue) Grammaire Congrès, Chinois (Langue)
+     Congrès, Han yu yu yan xue guo ji xue shu hui yi hui yi lu, Chinese language,
+     Chinese language Grammar, Conference papers and proceedings, Conversation and
+     phrase books description: '
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ ---
+
+ # SentenceTransformer based on intfloat/multilingual-e5-large
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [intfloat/multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [intfloat/multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large) <!-- at revision 3d7cfbdacd47fdda877c5cd8a79fbcc4f2a574f3 -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 1024 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'XLMRobertaModel'})
+   (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
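
The architecture listing above reads as a three-stage pipeline: token embeddings from the XLM-RoBERTa encoder, a mean over non-padding tokens (`pooling_mode_mean_tokens: True`), then L2 normalization. The following is a minimal pure-Python sketch of the last two stages only. It assumes toy 3-dimensional token vectors in place of the real 1024-dimensional ones, and the names `mean_pool` and `l2_normalize` are illustrative, not library functions.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions whose mask value is 0 (padding)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    return [value / max(count, 1) for value in summed]


def l2_normalize(vector):
    """Scale a vector to unit length so that dot products equal cosine similarity."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


# Toy input: two real tokens followed by one padding token (hypothetical values).
tokens = [[1.0, 2.0, 2.0], [3.0, 0.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
embedding = l2_normalize(pooled)
```

Because the final module normalizes every embedding, the cosine similarity named in the model description reduces to a plain dot product between two output vectors.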
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("sentence_transformers_model_id")
+ # Run inference
+ sentences = [
+     'query: หนังสือนิทาน',
+     'passage: title: เด็กหญิงข้าวเปลือก author: หยาดฝน ธัญโชติกานต์. category: นิทาน description: ',
+     'passage: title: Current drug discovery technologies author: N/A category: Drugs Design Periodicals, Pharmaceutical technology Periodicals, Drug Design, Technology, Pharmaceutical, Drugs Design, Pharmaceutical technology, Periodicals description: ',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # (3, 1024)
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities)
+ # tensor([[ 1.0000,  0.7672, -0.0610],
+ #         [ 0.7672,  1.0000,  0.0661],
+ #         [-0.0610,  0.0661,  1.0000]])
+ ```
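
Every example in this card prefixes queries with `query: ` and catalogue records with `passage: title: ... author: ... category: ... description: ...`, while `config_sentence_transformers.json` stores empty prompts, so the caller presumably has to build these strings before encoding. The sketch below shows one way to do that. The helper names `format_query` and `format_passage` are hypothetical, and the field order, the `N/A` author fallback, and the comma-joined categories are inferred from the examples in this card rather than from any documented schema.

```python
def format_query(text):
    """Prefix a search query the way the training examples do."""
    return f"query: {text}"


def format_passage(title, author=None, categories=(), description=""):
    """Flatten one catalogue record into the passage layout seen in this card."""
    author = author or "N/A"            # records without an author show "N/A"
    category = ", ".join(categories)    # subject headings are comma-separated
    return (
        f"passage: title: {title} author: {author} "
        f"category: {category} description: {description}"
    )


query = format_query("drug design journals")
passage = format_passage(
    "Current drug discovery technologies",
    categories=["Drugs Design Periodicals", "Pharmaceutical technology Periodicals"],
)
```

The resulting strings can be passed to `model.encode` in place of the hand-written `sentences` list. An empty description leaves a trailing space after `description:`, which matches the samples shown in the card.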
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 132,830 training samples
+ * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>sentence_2</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | sentence_0 | sentence_1 | sentence_2 |
+   |:--------|:-----------|:-----------|:-----------|
+   | type    | string | string | string |
+   | details | <ul><li>min: 6 tokens</li><li>mean: 14.35 tokens</li><li>max: 38 tokens</li></ul> | <ul><li>min: 17 tokens</li><li>mean: 90.98 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 19 tokens</li><li>mean: 87.53 tokens</li><li>max: 512 tokens</li></ul> |
+ * Samples:
+   | sentence_0 | sentence_1 | sentence_2 |
+   |:-----------|:-----------|:-----------|
+   | <code>query: ไสยศาสตร์ สำหรับมือใหม่</code> | <code>passage: title: สถานการณ์พระพุทธศาสนา : กระแสไสยศาสตร์ author: พระธรรมปิฎก (ป.อ. ปยุตฺโต) category: ไสยศาสตร์, พุทธศาสนากับไสยศาสตร์ description: </code> | <code>passage: title: Hospitality marketing management author: Robert D. Reid category: Hospitality industry Marketing, Food service Marketing, Restaurants Marketing, Accueil (Tourisme) Marketing, Services alimentaires Marketing, Marketing, Tiếp thị, Hospitality industry, Khách sạn, Dịch vụ ăn uống, Restaurants, Quán ăn description: </code> |
+   | <code>query: 伤科方</code> | <code>passage: title: 骨伤科效方集 author: Gengmin Tang category: Orthopedics, Medicine formulae, receipts, prescriptions, 伤科方 description: </code> | <code>passage: title: 福慧之道 author: Yinai Sun category: Happiness, Well-being, Conduct of life, Human comfort, Bonheur, Bien-être, Morale pratique, ethics (philosophical concept), comfort (sensation), Fo jiao Ren sheng zhe xue Tong su du wu description: Ben shu shi dui zheng ge zhong hua wen hua de zong jie, jiang shu ji fu ji hui de fang fa. nei rong bao gua : fu mai yu hui mai : ren sheng de xing fu er mai ; ru he jie fu hui er mai ; cai fu fu tian ; zhi hui fu tian ; fu tian fa ze ; ri xing yi shan ; fu hui ren sheng</code> |
+   | <code>query: basic Acid-Base Imbalance problems book</code> | <code>passage: title: Acid-base, fluids, and electrolytes made ridiculously simple author: Richard A. Preston category: Acid-Base Imbalance problems, Body Fluids problems, Water-Electrolyte Imbalance problems, Water-electrolyte imbalance description: </code> | <code>passage: title: Fetal and neonatal neurology and neurosurgery author: Malcolm I. Levene category: Brain Diseases, Newborn infants, Nervous system Surgery, Nervous system Diseases, Brain embryology, Fetal Diseases therapy, Infant, Newborn, Neurosurgery, Prenatal Diagnosis methods, Ultrasonography methods, Neurosurgical Procedures, Cerveau Maladies, Nouveau-nés, Neurochirurgie, Système nerveux Maladies description: </code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "gather_across_devices": false,
+       "directions": [
+           "query_to_doc"
+       ],
+       "partition_mode": "joint",
+       "hardness_mode": null,
+       "hardness_strength": 0.0
+   }
+   ```
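
To make the loss settings above concrete, here is a minimal pure-Python sketch of a multiple-negatives ranking objective over (`sentence_0`, `sentence_1`, `sentence_2`) triplets with `scale` 20.0 and cosine similarity. It assumes the usual in-batch formulation: each query is scored against every positive and every hard negative in the batch, and the loss is cross-entropy against its own positive. Only the `query_to_doc` direction is covered, the other listed parameters are ignored, and the function names and toy 2-dimensional vectors are illustrative rather than the library's implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy where anchor i must rank positive i above all other candidates."""
    candidates = positives + negatives  # in-batch positives first, then hard negatives
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, candidate) for candidate in candidates]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


# Toy batch of two triplets (hypothetical embeddings).
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

loss = mnr_loss(anchors, positives, negatives)
mismatched = mnr_loss(anchors, positives[::-1], negatives)
```

With well-separated toy vectors the loss is close to zero, and pairing each query with the wrong positive makes it much larger. The scale of 20 sharpens the softmax, so small cosine gaps translate into large logit gaps.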
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `per_device_train_batch_size`: 64
+ - `per_device_eval_batch_size`: 64
+ - `num_train_epochs`: 1
+ - `fp16`: True
+ - `multi_dataset_batch_sampler`: round_robin
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `do_predict`: False
+ - `eval_strategy`: no
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 64
+ - `per_device_eval_batch_size`: 64
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1
+ - `num_train_epochs`: 1
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: None
+ - `warmup_ratio`: None
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `enable_jit_checkpoint`: False
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `use_cpu`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `bf16`: False
+ - `fp16`: True
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: -1
+ - `ddp_backend`: None
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `parallelism_config`: None
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch_fused
+ - `optim_args`: None
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `project`: huggingface
+ - `trackio_space_id`: trackio
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `hub_revision`: None
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `include_num_input_tokens_seen`: no
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `liger_kernel_config`: None
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: True
+ - `use_cache`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: round_robin
+ - `router_mapping`: {}
+ - `learning_rate_mapping`: {}
+
+ </details>
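
The combination `learning_rate: 5e-05`, `lr_scheduler_type: linear`, and `warmup_steps: 0` describes a learning rate that starts at its peak and decays linearly to zero over the run. A minimal sketch of that schedule follows. The function name `linear_lr` is illustrative, and the total of 2,076 optimizer steps is an assumption derived from 132,830 samples at batch size 64 on a single device, not a figure stated in the card.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Learning rate at a given optimizer step under linear warmup and linear decay."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup (unused here: warmup_steps is 0).
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 2076  # assumed: ceil(132830 / 64) steps for one epoch on one device

lr_start = linear_lr(0, TOTAL_STEPS)
lr_halfway = linear_lr(TOTAL_STEPS // 2, TOTAL_STEPS)
lr_end = linear_lr(TOTAL_STEPS, TOTAL_STEPS)
```

Under these assumptions the rate is 5e-05 at step 0, half of that midway through the epoch, and zero at the final step.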
+
+ ### Training Logs
+ | Epoch | Step | Training Loss |
+ |:------:|:----:|:-------------:|
+ | 0.2408 | 500 | 0.4763 |
+ | 0.4817 | 1000 | 0.1799 |
+ | 0.7225 | 1500 | 0.1731 |
+ | 0.9634 | 2000 | 0.1628 |
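
The Epoch column in the log table can be reproduced from figures given elsewhere in the card: 132,830 training samples, `per_device_train_batch_size` 64, and `dataloader_drop_last` False. The sketch below assumes a single training device and no gradient accumulation (`gradient_accumulation_steps` is 1), which is consistent with the logged values but not stated outright.

```python
import math

dataset_size = 132_830
batch_size = 64  # per_device_train_batch_size; a single device is assumed

# The last partial batch is kept (dataloader_drop_last is False), so round up.
steps_per_epoch = math.ceil(dataset_size / batch_size)


def epoch_at(step):
    """Fraction of the epoch completed after a given number of steps."""
    return round(step / steps_per_epoch, 4)


logged = {step: epoch_at(step) for step in (500, 1000, 1500, 2000)}
```

The computed fractions match the table's Epoch column for all four logged steps, which suggests the run covered roughly 2,076 steps in total.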
+
+
+ ### Framework Versions
+ - Python: 3.12.13
+ - Sentence Transformers: 5.3.0
+ - Transformers: 5.0.0
+ - PyTorch: 2.10.0+cu128
+ - Accelerate: 1.13.0
+ - Datasets: 4.0.0
+ - Tokenizers: 0.22.2
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{oord2019representationlearningcontrastivepredictive,
+       title={Representation Learning with Contrastive Predictive Coding},
+       author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
+       year={2019},
+       eprint={1807.03748},
+       archivePrefix={arXiv},
+       primaryClass={cs.LG},
+       url={https://arxiv.org/abs/1807.03748},
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,30 @@
+ {
+   "add_cross_attention": false,
+   "architectures": [
+     "XLMRobertaModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "classifier_dropout": null,
+   "dtype": "float32",
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1024,
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "is_decoder": false,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "xlm-roberta",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "output_past": true,
+   "pad_token_id": 1,
+   "position_embedding_type": "absolute",
+   "tie_word_embeddings": true,
+   "transformers_version": "5.0.0",
+   "type_vocab_size": 1,
+   "use_cache": true,
+   "vocab_size": 250002
+ }
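
The config above lists `max_position_embeddings` as 514, while `sentence_bert_config.json` and `tokenizer_config.json` in the same commit cap inputs at 512 tokens. The gap is expected for RoBERTa-family models, which, as far as the Transformers implementation goes, number real tokens starting at `pad_token_id + 1` and so reserve the first two rows of the position table. A small sketch of that arithmetic, with the hypothetical helper name `usable_sequence_length`:

```python
def usable_sequence_length(max_position_embeddings, pad_token_id):
    """Positions left for real tokens when numbering starts at pad_token_id + 1."""
    reserved = pad_token_id + 1  # indices 0..pad_token_id are never assigned to tokens
    return max_position_embeddings - reserved


# Values taken from config.json in this commit.
max_tokens = usable_sequence_length(max_position_embeddings=514, pad_token_id=1)
```

Here `max_tokens` comes out to 512, matching `max_seq_length` and `model_max_length` in the other configuration files.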
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "model_type": "SentenceTransformer",
+   "__version__": {
+     "sentence_transformers": "5.3.0",
+     "transformers": "5.0.0",
+     "pytorch": "2.10.0+cu128"
+   },
+   "prompts": {
+     "query": "",
+     "document": ""
+   },
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
model-002.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7ebd2ec3ab0d42110d44ce5907bb602811983ee0a939551b8877552b5fd8dfe7
+ size 2239607120
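
The three lines added for `model-002.safetensors` are a Git LFS pointer, not the weights themselves: the repository stores only the spec version, a SHA-256 object id, and the byte size, and the real file lives in LFS storage. The following sketch parses such a pointer. The function name `parse_lfs_pointer` and the returned dictionary layout are illustrative choices, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its version, hash, and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")  # each line is "<key> <value>"
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:7ebd2ec3ab0d42110d44ce5907bb602811983ee0a939551b8877552b5fd8dfe7
size 2239607120
"""

info = parse_lfs_pointer(pointer)
size_gib = info["size_bytes"] / 1024**3
```

The size field puts the weights at a little over 2 GiB, which is in line with a float32 checkpoint of an XLM-RoBERTa-large sized encoder.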
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3fe715a86a37cd2b20e5eaeee8b22815bce65de676d1e0cd856114b59dab67fc
+ size 16766387
tokenizer_config.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "add_prefix_space": true,
+   "backend": "tokenizers",
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "is_local": false,
+   "mask_token": "<mask>",
+   "model_max_length": 512,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "tokenizer_class": "XLMRobertaTokenizer",
+   "unk_token": "<unk>"
+ }