Rsr2425 committed
Commit 37aabf4 · verified · 1 parent: 80e0746

Add new SentenceTransformer model
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 1024,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
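The pooling config above enables only CLS-token pooling. As an editorial aside (this sketch is not part of the commit), what that mode amounts to is simply taking the hidden state of the first (`[CLS]`) token as the sentence embedding; the toy hidden size below stands in for the real `word_embedding_dimension` of 1024:

```python
import numpy as np

# Illustrative sketch of CLS-token pooling as configured above
# (pooling_mode_cls_token = true, all other modes false): the
# sentence embedding is the hidden state of the first ([CLS]) token.
def cls_pool(token_embeddings: np.ndarray) -> np.ndarray:
    # token_embeddings: (seq_len, hidden_dim)
    return token_embeddings[0]

hidden = np.array([
    [1.0, 0.0, 0.0, 0.0],  # [CLS]
    [0.0, 1.0, 0.0, 0.0],  # token 1
    [0.0, 0.0, 1.0, 0.0],  # token 2
])
print(cls_pool(hidden))  # the [CLS] row
```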
README.md ADDED
@@ -0,0 +1,621 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:64
+ - loss:MatryoshkaLoss
+ - loss:MultipleNegativesRankingLoss
+ base_model: Snowflake/snowflake-arctic-embed-l
+ widget:
+ - source_sentence: '1. What is the first step to take when implementing architecture
+     as code according to the provided context?
+ 
+     2. How should the content of each file be formatted when outputting code?'
+   sentences:
+   - architecture is, in the end, implemented as code.\\n\\nThink step by step and
+     reason yourself to the right decisions to make sure we get it right.\\nYou will
+     first lay out the names of the core classes, functions, methods that will be necessary,
+     as well as a quick comment on their purpose.\\n\\nThen you will output the content
+     of each file including ALL code.\\nEach file must strictly follow a markdown code
+     block format, where the following tokens must be replaced such that\\nFILENAME
+     is the lowercase file name including the file extension,\\nLANG is the markup
+     code block language for the code\'s language, and CODE is the code:\\n\\nFILENAME\\n\`\`\`LANG\\nCODE\\n\`\`\`\\n\\nYou
+     will start with the \\"entrypoint\\" file, then go to the
+   - 'Stream tokens:
+ 
+     for message, metadata in graph.stream( {"question": "What is Task Decomposition?"},
+     stream_mode="messages"): print(message.content, end="|")
+ 
+     |Task| decomposition| is| the| process| of| breaking| down| complex| tasks| into|
+     smaller|,| more| manageable| steps|.| It| can| be| achieved| through| techniques|
+     like| Chain| of| Thought| (|Co|T|)| prompting|,| which| encourages| the| model|
+     to| think| step| by| step|,| or| through| more| structured| methods| like| the|
+     Tree| of| Thoughts|.| This| approach| not| only| simplifies| task| execution|
+     but| also| provides| insights| into| the| model|''s| reasoning| process|.||
+ 
+     tipFor async invocations, use:result = await graph.ainvoke(...)andasync for step
+     in graph.astream(...):'
+   - 'return {"answer": response.content}graph_builder = StateGraph(State).add_sequence([analyze_query,
+     retrieve, generate])graph_builder.add_edge(START, "analyze_query")graph = graph_builder.compile()'
+ - source_sentence: "1. What is the purpose of the DocumentTransformer object in the\
+     \ context provided? \n2. Where can one find detailed documentation on how to\
+     \ use DocumentTransformers?"
+   sentences:
+   - 'Learn more about splitting text using different methods by reading the how-to
+     docs
+ 
+     Code (py or js)
+ 
+     Scientific papers
+ 
+     Interface: API reference for the base interface.
+ 
+ 
+     DocumentTransformer: Object that performs a transformation on a list
+ 
+     of Document objects.
+ 
+ 
+     Docs: Detailed documentation on how to use DocumentTransformers
+ 
+     Integrations
+ 
+     Interface: API reference for the base interface.'
+   - '{''retrieve'': {''context'': [Document(id=''a42dc78b-8f76-472a-9e25-180508af74f3'',
+     metadata={''source'': ''https://lilianweng.github.io/posts/2023-06-23-agent/'',
+     ''start_index'': 1585}, page_content=''Fig. 1. Overview of a LLM-powered autonomous
+     agent system.\nComponent One: Planning#\nA complicated task usually involves many
+     steps. An agent needs to know what they are and plan ahead.\nTask Decomposition#\nChain
+     of thought (CoT; Wei et al. 2022) has become a standard prompting technique for
+     enhancing model performance on complex tasks. The model is instructed to “think
+     step by step” to utilize more test-time computation to decompose hard tasks into
+     smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks
+     and shed lights into'
+   - 'Do I need to use LangGraph?LangGraph is not required to build a RAG application.
+     Indeed, we can implement the same application logic through invocations of the
+     individual components:question = "..."retrieved_docs = vector_store.similarity_search(question)docs_content
+     = "\n\n".join(doc.page_content for doc in retrieved_docs)prompt = prompt.invoke({"question":
+     question, "context": docs_content})answer = llm.invoke(prompt)The benefits of
+     LangGraph include:
+ 
+     Support for multiple invocation modes: this logic would need to be rewritten if
+     we wanted to stream output tokens, or stream the results of individual steps;
+ 
+     Automatic support for tracing via LangSmith and deployments via LangGraph Platform;'
+ - source_sentence: '1. What mode did the agent move into after the clarifications
+     were made?
+ 
+     2. What instructions were given to the agent regarding the code writing process?'
+   sentences:
+   - '= RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)all_splits
+     = text_splitter.split_documents(docs)# Update metadata (illustration purposes)total_documents
+     = len(all_splits)third = total_documents // 3for i, document in enumerate(all_splits): if
+     i < third: document.metadata["section"] = "beginning" elif i < 2 * third: document.metadata["section"]
+     = "middle" else: document.metadata["section"] = "end"# Index chunksvector_store
+     = InMemoryVectorStore(embeddings)_ = vector_store.add_documents(all_splits)# Define
+     schema for searchclass Search(TypedDict): """Search query.""" query: Annotated[str,
+     ..., "Search query to run."] section: Annotated[ Literal["beginning",
+     "middle", "end"],'
+   - 'limitations:''), Document(id=''ca7f06e4-2c2e-4788-9a81-2418d82213d9'', metadata={''source'':
+     ''https://lilianweng.github.io/posts/2023-06-23-agent/'', ''start_index'': 32942,
+     ''section'': ''end''}, page_content=''}\n]\nThen after these clarification, the
+     agent moved into the code writing mode with a different system message.\nSystem
+     message:''), Document(id=''1fcc2736-30f4-4ef6-90f2-c64af92118cb'', metadata={''source'':
+     ''https://lilianweng.github.io/posts/2023-06-23-agent/'', ''start_index'': 35127,
+     ''section'': ''end''}, page_content=''"content": "You will get instructions for
+     code to write.\\nYou will write a very long answer. Make sure that every detail
+     of the architecture is, in the end, implemented as code.\\nMake sure that every
+     detail of the architecture is,'
+   - 'Build a Retrieval Augmented Generation (RAG) App: Part 1 | 🦜️🔗 LangChain'
+ - source_sentence: '1. What is the purpose of the `getpass` module in the provided
+     context?
+ 
+     2. How is the chat model initialized in the given code snippet?'
+   sentences:
+   - 'Select chat model:Groq▾GroqOpenAIAnthropicAzureGoogle VertexAWSCohereNVIDIAFireworks
+     AIMistral AITogether AIIBM watsonxDatabrickspip install -qU "langchain[groq]"import
+     getpassimport osif not os.environ.get("GROQ_API_KEY"): os.environ["GROQ_API_KEY"]
+     = getpass.getpass("Enter API key for Groq: ")from langchain.chat_models import
+     init_chat_modelllm = init_chat_model("llama3-8b-8192", model_provider="groq")'
+   - 'One of the most powerful applications enabled by LLMs is sophisticated question-answering
+     (Q&A) chatbots. These are applications that can answer questions about specific
+     source information. These applications use a technique known as Retrieval Augmented
+     Generation, or RAG.
+ 
+     This is a multi-part tutorial:'
+   - 'user''s request in a straightforward manner. Then describe the task process and
+     show your analysis and model inference results to the user in the first person.
+     If inference results contain a file path, must tell the user the complete file
+     path.")]}}----------------{''generate'': {''answer'': ''Task decomposition is
+     the process of breaking down a complex task into smaller, more manageable steps.
+     This technique, often enhanced by methods like Chain of Thought (CoT) or Tree
+     of Thoughts, allows models to reason through tasks systematically and improves
+     performance by clarifying the thought process. It can be achieved through simple
+     prompts, task-specific instructions, or human inputs.''}}----------------'
+ - source_sentence: "1. How do chat models utilize the state of the graph to recover\
+     \ sources for generated answers? \n2. What is the significance of the \"context\"\
+     \ field in the state when returning sources?"
+   sentences:
+   - 'Docs: Detailed documentation on how to use embeddings.
+ 
+     Integrations: 30+ integrations to choose from.
+ 
+     Interface: API reference for the base interface.
+ 
+ 
+     VectorStore: Wrapper around a vector database, used for storing and
+ 
+     querying embeddings.
+ 
+ 
+     Docs: Detailed documentation on how to use vector stores.
+ 
+     Integrations: 40+ integrations to choose from.
+ 
+     Interface: API reference for the base interface.'
+   - 'Returning sources​
+ 
+     Note that by storing the retrieved context in the state of the graph, we recover
+     sources for the model''s generated answer in the "context" field of the state.
+     See this guide on returning sources for more detail.
+ 
+     Go deeper​
+ 
+     Chat models take in a sequence of messages and return a message.'
+   - display(Image(graph.get_graph().draw_mermaid_png()))
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ metrics:
+ - cosine_accuracy@1
+ - cosine_accuracy@3
+ - cosine_accuracy@5
+ - cosine_accuracy@10
+ - cosine_precision@1
+ - cosine_precision@3
+ - cosine_precision@5
+ - cosine_precision@10
+ - cosine_recall@1
+ - cosine_recall@3
+ - cosine_recall@5
+ - cosine_recall@10
+ - cosine_ndcg@10
+ - cosine_mrr@10
+ - cosine_map@100
+ model-index:
+ - name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-l
+   results:
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: Unknown
+       type: unknown
+     metrics:
+     - type: cosine_accuracy@1
+       value: 1.0
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 1.0
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 1.0
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 1.0
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 1.0
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.3333333333333333
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.2
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.1
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 1.0
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 1.0
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 1.0
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 1.0
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 1.0
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 1.0
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 1.0
+       name: Cosine Map@100
+ ---
+ 
+ # SentenceTransformer based on Snowflake/snowflake-arctic-embed-l
+ 
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [Snowflake/snowflake-arctic-embed-l](https://huggingface.co/Snowflake/snowflake-arctic-embed-l). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+ 
+ ## Model Details
+ 
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [Snowflake/snowflake-arctic-embed-l](https://huggingface.co/Snowflake/snowflake-arctic-embed-l) <!-- at revision d8fb21ca8d905d2832ee8b96c894d3298964346b -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 1024 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+ 
+ ### Model Sources
+ 
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+ 
+ ### Full Model Architecture
+ 
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
+ 
+ ## Usage
+ 
+ ### Direct Usage (Sentence Transformers)
+ 
+ First install the Sentence Transformers library:
+ 
+ ```bash
+ pip install -U sentence-transformers
+ ```
+ 
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+ 
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("Rsr2425/simplify-ft-arctic-embed-l")
+ # Run inference
+ sentences = [
+     '1. How do chat models utilize the state of the graph to recover sources for generated answers? \n2. What is the significance of the "context" field in the state when returning sources?',
+     'Returning sources\u200b\nNote that by storing the retrieved context in the state of the graph, we recover sources for the model\'s generated answer in the "context" field of the state. See this guide on returning sources for more detail.\nGo deeper\u200b\nChat models take in a sequence of messages and return a message.',
+     'Docs: Detailed documentation on how to use embeddings.\nIntegrations: 30+ integrations to choose from.\nInterface: API reference for the base interface.\n\nVectorStore: Wrapper around a vector database, used for storing and\nquerying embeddings.\n\nDocs: Detailed documentation on how to use vector stores.\nIntegrations: 40+ integrations to choose from.\nInterface: API reference for the base interface.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 1024]
+ 
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
+ 
+ <!--
+ ### Direct Usage (Transformers)
+ 
+ <details><summary>Click to see the direct usage in Transformers</summary>
+ 
+ </details>
+ -->
+ 
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+ 
+ You can finetune this model on your own dataset.
+ 
+ <details><summary>Click to expand</summary>
+ 
+ </details>
+ -->
+ 
+ <!--
+ ### Out-of-Scope Use
+ 
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+ 
+ ## Evaluation
+ 
+ ### Metrics
+ 
+ #### Information Retrieval
+ 
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+ 
+ | Metric              | Value   |
+ |:--------------------|:--------|
+ | cosine_accuracy@1   | 1.0     |
+ | cosine_accuracy@3   | 1.0     |
+ | cosine_accuracy@5   | 1.0     |
+ | cosine_accuracy@10  | 1.0     |
+ | cosine_precision@1  | 1.0     |
+ | cosine_precision@3  | 0.3333  |
+ | cosine_precision@5  | 0.2     |
+ | cosine_precision@10 | 0.1     |
+ | cosine_recall@1     | 1.0     |
+ | cosine_recall@3     | 1.0     |
+ | cosine_recall@5     | 1.0     |
+ | cosine_recall@10    | 1.0     |
+ | **cosine_ndcg@10**  | **1.0** |
+ | cosine_mrr@10       | 1.0     |
+ | cosine_map@100      | 1.0     |
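As an editorial note on reading this table: each evaluation query here has exactly one relevant passage, so when retrieval is perfect, recall@k stays 1.0 while precision@k is capped at 1/k, which is why precision@3 is 0.3333 and precision@10 is 0.1. A small sketch (document ids are hypothetical):

```python
# Why cosine_precision@3 = 1/3 alongside cosine_recall@3 = 1.0:
# with a single gold passage per query, a perfect top-k list
# contains 1 relevant document out of k retrieved.
def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    return sum(1 for doc in retrieved[:k] if doc in relevant) / k

def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / len(relevant)

retrieved = ["d1", "d7", "d3"]  # hypothetical top-3 for one query
relevant = {"d1"}               # the single gold passage

print(precision_at_k(retrieved, relevant, 3))  # 0.3333...
print(recall_at_k(retrieved, relevant, 3))     # 1.0
```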
+ 
+ <!--
+ ## Bias, Risks and Limitations
+ 
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+ 
+ <!--
+ ### Recommendations
+ 
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+ 
+ ## Training Details
+ 
+ ### Training Dataset
+ 
+ #### Unnamed Dataset
+ 
+ * Size: 64 training samples
+ * Columns: <code>sentence_0</code> and <code>sentence_1</code>
+ * Approximate statistics based on the first 64 samples:
+   |         | sentence_0                                                                         | sentence_1                                                                           |
+   |:--------|:-----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|
+   | type    | string                                                                              | string                                                                                 |
+   | details | <ul><li>min: 23 tokens</li><li>mean: 37.42 tokens</li><li>max: 49 tokens</li></ul>  | <ul><li>min: 19 tokens</li><li>mean: 153.86 tokens</li><li>max: 286 tokens</li></ul>   |
+ * Samples:
+   | sentence_0 | sentence_1 |
+   |:-----------|:-----------|
+   | <code>1. How do chat models utilize the state of the graph to recover sources for generated answers? <br>2. What is the significance of the "context" field in the state when returning sources?</code> | <code>Returning sources​<br>Note that by storing the retrieved context in the state of the graph, we recover sources for the model's generated answer in the "context" field of the state. See this guide on returning sources for more detail.<br>Go deeper​<br>Chat models take in a sequence of messages and return a message.</code> |
+   | <code>1. What is the purpose of the indexing process in the data pipeline?<br>2. How does the retrieval and generation phase utilize the indexed data to respond to user queries?</code> | <code>Indexing: a pipeline for ingesting data from a source and indexing it. This usually happens offline.<br>Retrieval and generation: the actual RAG chain, which takes the user query at run time and retrieves the relevant data from the index, then passes that to the model.<br>Note: the indexing portion of this tutorial will largely follow the semantic search tutorial.<br>The most common full sequence from raw data to answer looks like:<br>Indexing​</code> |
+   | <code>1. What is task decomposition and how does it help in problem-solving?<br>2. Can you explain the methods used in task decomposition, such as chain of thought prompting and the tree of thoughts approach?</code> | <code>user's request in a straightforward manner. Then describe the task process and show your analysis and model inference results to the user in the first person. If inference results contain a file path, must tell the user the complete file path.")]Answer: Task decomposition is a technique used to break down complex tasks into smaller, manageable steps, allowing for more efficient problem-solving. This can be achieved through methods like chain of thought prompting or the tree of thoughts approach, which explores multiple reasoning possibilities at each step. It can be initiated through simple prompts, task-specific instructions, or human inputs.</code> |
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
+   ```json
+   {
+       "loss": "MultipleNegativesRankingLoss",
+       "matryoshka_dims": [
+           768,
+           512,
+           256,
+           128,
+           64
+       ],
+       "matryoshka_weights": [
+           1,
+           1,
+           1,
+           1,
+           1
+       ],
+       "n_dims_per_step": -1
+   }
+   ```
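As an editorial aside (not part of the generated card): MatryoshkaLoss trains the embedding so that its prefixes at the listed dimensions (768, 512, 256, 128, 64) remain usable on their own. At inference, that means an embedding can be truncated to its first k components and re-normalized, as in this minimal numpy sketch:

```python
import numpy as np

# Sketch of using a Matryoshka-trained embedding at a smaller
# dimension: keep the first k components, then re-normalize so
# cosine similarity still behaves as expected. k = 256 out of the
# model's full 1024 dimensions, matching one of matryoshka_dims.
def truncate_embedding(emb: np.ndarray, k: int) -> np.ndarray:
    head = emb[:k]
    return head / np.linalg.norm(head)

rng = np.random.default_rng(0)
full = rng.normal(size=1024)
full /= np.linalg.norm(full)  # stand-in for a unit-norm model output

small = truncate_embedding(full, 256)
print(small.shape)             # (256,)
print(np.linalg.norm(small))   # unit length again (up to float rounding)
```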
+ 
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+ 
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `num_train_epochs`: 10
+ - `multi_dataset_batch_sampler`: round_robin
+ 
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+ 
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1
+ - `num_train_epochs`: 10
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`: 
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: round_robin
+ 
+ </details>
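An editorial worked example of what these scheduler settings imply: with `learning_rate` 5e-05, `lr_scheduler_type` linear, and no warmup, the learning rate decays linearly from 5e-05 to 0 over the run (64 samples / batch size 16 = 4 steps per epoch, so 10 epochs = 40 steps, matching the training logs):

```python
# Hedged sketch (not the trainer's internal code) of a linear
# learning-rate schedule with zero warmup: lr starts at base_lr
# and falls linearly to 0 at total_steps.
def linear_lr(step: int, base_lr: float = 5e-05, total_steps: int = 40) -> float:
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))   # 5e-05 at the start
print(linear_lr(20))  # 2.5e-05 halfway through
print(linear_lr(40))  # 0.0 at the final step
```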
+ 
+ ### Training Logs
+ | Epoch | Step | cosine_ndcg@10 |
+ |:-----:|:----:|:--------------:|
+ | 1.0   | 4    | 1.0            |
+ | 2.0   | 8    | 1.0            |
+ | 3.0   | 12   | 1.0            |
+ | 4.0   | 16   | 1.0            |
+ | 5.0   | 20   | 1.0            |
+ | 6.0   | 24   | 1.0            |
+ | 7.0   | 28   | 1.0            |
+ | 8.0   | 32   | 1.0            |
+ | 9.0   | 36   | 1.0            |
+ | 10.0  | 40   | 1.0            |
+ 
+ 
+ ### Framework Versions
+ - Python: 3.11.11
+ - Sentence Transformers: 3.4.1
+ - Transformers: 4.48.3
+ - PyTorch: 2.5.1+cu124
+ - Accelerate: 1.3.0
+ - Datasets: 3.2.0
+ - Tokenizers: 0.21.0
+ 
+ ## Citation
+ 
+ ### BibTeX
+ 
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+ 
+ #### MatryoshkaLoss
+ ```bibtex
+ @misc{kusupati2024matryoshka,
+     title={Matryoshka Representation Learning},
+     author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
+     year={2024},
+     eprint={2205.13147},
+     archivePrefix={arXiv},
+     primaryClass={cs.LG}
+ }
+ ```
+ 
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+ 
+ <!--
+ ## Glossary
+ 
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+ 
+ <!--
+ ## Model Card Authors
+ 
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+ 
+ <!--
+ ## Model Card Contact
+ 
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "_name_or_path": "Snowflake/snowflake-arctic-embed-l",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1024,
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.48.3",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,12 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.4.1",
+     "transformers": "4.48.3",
+     "pytorch": "2.5.1+cu124"
+   },
+   "prompts": {
+     "query": "Represent this sentence for searching relevant passages: "
+   },
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9f2736c228561fb945b8bf8dc40a726d7c423acd4feb8b8c2953fe85b783cbce
+ size 1336413848
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,63 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_lower_case": true,
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "max_length": 512,
+   "model_max_length": 512,
+   "pad_to_multiple_of": null,
+   "pad_token": "[PAD]",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "[SEP]",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff