RosaMelo committed · verified
Commit f391a3d · 1 Parent(s): 2097e27

Add new SentenceTransformer model

1_Pooling/config.json ADDED
```json
{
    "word_embedding_dimension": 768,
    "pooling_mode_cls_token": false,
    "pooling_mode_mean_tokens": true,
    "pooling_mode_max_tokens": false,
    "pooling_mode_mean_sqrt_len_tokens": false,
    "pooling_mode_weightedmean_tokens": false,
    "pooling_mode_lasttoken": false,
    "include_prompt": true
}
```
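The config above selects mean pooling (`pooling_mode_mean_tokens: true`): token embeddings are averaged over the real (unmasked) tokens to produce one sentence vector. A minimal numpy sketch with hypothetical tensors, not this model's actual weights:

```python
import numpy as np

# Hypothetical token embeddings: batch of 2 sequences, 4 tokens, dim 3.
token_embeddings = np.arange(24, dtype=float).reshape(2, 4, 3)
# Attention mask: the second sequence has only 2 real tokens.
attention_mask = np.array([[1, 1, 1, 1],
                           [1, 1, 0, 0]])

# Masked mean pooling, as selected by pooling_mode_mean_tokens=true:
mask = attention_mask[:, :, None]                 # (2, 4, 1)
summed = (token_embeddings * mask).sum(axis=1)    # sum over real tokens only
counts = mask.sum(axis=1)                         # real-token count per sequence
sentence_embeddings = summed / counts             # (2, 3)
print(sentence_embeddings.shape)  # (2, 3)
```

Padding tokens are excluded from both the sum and the divisor, so sentence length does not bias the embedding.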
README.md ADDED
---
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:79876
- loss:MultipleNegativesRankingLoss
base_model: Master-thesis-NAP/ModernBert-DAPT-math
widget:
- source_sentence: What is the error estimate for the difference between the exact
    solution and the local oscillation decomposition (LOD) solution in terms of the
    $L_0$ norm?
  sentences:
  - '\label{thm1}

    Suppose $\kappa$ and $\bar a$ are as above. Then $|\Pcut(\bar a)| \leq 2^\kappa$.
    Indeed if

    $2^\kappa=\aleph_\alpha,$ then $|\Pcut(\bar a)| \leq |\alpha+1|^2$.'
  - "\\cite{kyushu}\n For every discrete group $\\G$ and every 2-dimensional representation\
    \ $\\varrho$ of $\\G$, $\\varrho-$equivariant functions for $\\G$ always exist."
  - "\\label{Corollary}\n Let Assumptions~\\ref{assum_1} and~\\ref{assump2} be\
    \ satisfied. Let $u$ be the solution of~\\eqref{WeakForm} and let $u_{H,k}$ be\
    \ the LOD solution of~\\eqref{local_probelm }. Then we have \n \\begin{equation}\\
    label{L2Estimate}\n \\|u-I_Hu_{H,k}\\|_0\\lesssim \\|u-I_Hu\\|_0+\\|u-u_{H,k}\\
    |_0 +H|u-u_{H,k}|_1.\n \\end{equation}\n %\\[\\|u-I_Hu_{H,k}\\|_0\\lesssim\
    \ H |u|_1 +|u-u_{H,k}|_1.\\]"
- source_sentence: Does the theorem imply that the rate of convergence of the sequence
    $T_{m,j}(E)$ to $T_{m+k_n,j+k_n}(E)$ is exponential in the distance between $m$
    and $j$, and that this rate is bounded by a constant $C$ times an exponential
    decay factor involving the parameter $\gamma$?
  sentences:
  - "\\label{thm:weibull}\nSuppose random variable $X$ follows Weibull distribution,\
    \ and $E(X^i)$ denotes the $i$-th moment of $X$. Then the random variable $X$\
    \ satisfy the following inequality: \n\\begin{equation}\\label{eq:moments}\n \
    \ E(X^n)^{\\frac{1}{n}} \\geq E(X^m)^{\\frac{1}{m}},\n\\end{equation}\nwhere\
    \ $n > m$."
  - "\\label{lem1}\n\t\tFor all $m,j\\in\\Z$,  we have\n\t\t\\begin{equation*}\n\t\
    \t|| T_{m,j} (E)-T_{m+k_n,j+k_n}(E)||\\leq C e^{-\\gamma k_n} e^{(\\mathcal\
    \ L(E)+\\varepsilon) |m-j|}. \n\t\t\\end{equation*}"
  - If the problem \eqref{eq:Model-based_Program} is convex, then under the primal-dual
    dynamics \eqref{eq:PDD}-\eqref{eq:AlgebraicConstruction}, the system \eqref{eq:Input-OutputMap}
    asymptotically converges to a steady state that is the optimal solution of \eqref{eq:Model-based_Program}.
- source_sentence: What is the rate of convergence for the total error in the given
    problem, assuming the conditions in Theorem~\ref{convergence-rates} are met?
  sentences:
  - "\\label{convergence-rates}\nUnder the assumptions of Theorem~\\ref{well-posedness}.\
    \ Given $(\\bu,{p},\\bzeta,\\varphi)\\in (\\bH^{s_1+1}(\\Omega)\\cap \\bV_1)\\
    times (\\text{H}^{s_1}(\\Omega)\\cap Q_{b_1}) \\times (\\bH^{s_2}\\cap \\bV_2)\
    \ \\times (\\text{H}^{s_2}\\cap Q_{b_2})$, $(\\bu_h,{p}_h,\\bzeta_h,\\varphi_h)\\
    in \\bV_1^{h,k_1}\\times Q_1^{h,k_1}\\times \\bV_2^{h,k_2}\\times Q_2^{h,k_2}$\
    \ be the respective solutions of the continuous and discrete problems, with the\
    \ data satisfying $\\fb\\in \\bH^{s_1-1}\\cap \\bQ_{b_1}$ and $g\\in H^{s_2}(\\
    Omega)\\cap Q_{b_2}$. If $\\overline{C}_1 \\sqrt{M} L_\\ell + \\overline{C}_2^2\
    \ \\sqrt{M^3} L_\\bbM\\sqrt{2\\mu} (\\norm{\\varphi_D}_{1/2,\\Gamma_D} + \\
    norm{g}_{0,\\Omega}) < 1/2.$ Then, the total error $\\overline{\\textnormal{e}}_h:=\\
    norm{(\\bu-\\bu_h,{p}-{p}_h, \\bzeta-\\bzeta_h,\\varphi-\\varphi_h)}_{\\bV_1\\
    times Q_{1} \\times \\bV_2\\times Q_2}$ decays with the following rate for $s:=\
    \ \\min \\left\\{s_1,s_2\\right\\}$\n \\begin{align*}\\label{convergence-rate}\n\
    \ \\overline{\\textnormal{e}}_h &\\lesssim h^{ s} (|\\fb|_{s_1-1,\\bQ_{b_1}}\
    \ + |\\bu|_{s_1+1,\\bV_1} + |{p}|_{s_1,Q_{b_1}} + |g|_{s_2,Q_{b_2}} + |\\bzeta|_{s_2,\\
    bV_2}+|\\varphi|_{s_2,Q_{b_2}}).\n \\end{align*}"
  - "\\label{thm}\nFor vector linear secure aggregation defined above, the optimal\
    \ total key rate is \n\\begin{eqnarray}\n R_{Z_{\\Sigma}}^* %= \\left\\{R_{Z_{\\
    Sigma}}: R_{Z_{\\Sigma}} \\geq \n = \\mbox{rank} \\left( \\left[ \\mathbf{F}\
    \ ; \\mathbf{G} \\right] \\right)\n - \\mbox{rank} \\left( \\mathbf{F} \\
    right) = \\mbox{rank}({\\bf G} | {\\bf F}).\n %\\right\\}.\n% \\\
    mbox{rank}\n\\end{eqnarray}"
  - "The process $Y(t)$, $t\\geq 0,$ is called Markov branching process with\r\nnon-homogeneous\
    \ Poisson immigration (MBPNPI)."
- source_sentence: Is the local time of the horizontal component of the Peano curve
    ever greater than 1?
  sentences:
  - "[Divergence Theorem or Gauss-Green Theorem for Surfaces in $\\R^3$]\n\t\\label{thm:surface_int}\n\
    \t Let $\\Sigma \\subset \\Omega\\subseteq\\R^3$ be a bounded smooth surface.\n\
    \t Further, $\\bb a:\\Sigma\\to\\R^3$ is a continuously differentiable\
    \ vector field that is either defined on the\n\t\t\t\t\tboundary $\\partial\\
    Sigma$ or has a bounded continuous extension to this boundary.\n\t Like\
    \ in \\eqref{eq:decomp} it may be decomposed into tangential and normal components\n\
    \t\t\t\t\tas follows $\\bb a = \\bb a^\\shortparallel + a_\\nu\\bs\\nu_\\Sigma$.\
    \ By $\\dd l$ we denote the line element on \n\t\t\t\t\tthe curve $\\partial \\
    Sigma$. We assume that the curve is continuous and consists of finitely many\n\
    \t\t\t\t\tsmooth pieces.\n\t Then the following divergence formula for\
    \ surface integrals holds\n\t %\n\t \\begin{align}\n\t \
    \ %\n\t \\int\\limits_\\Sigma \\left[\\nabla_\\Sigma\\cdot\\bb a^\\
    shortparallel\\right](\\x)\\;\\dd S\n\t\t\t\t\t\t\t= \\int\\limits_{\\partial\\
    Sigma} \\left[\\bb a\\cdot\\bs\\nu_{\\partial\\Sigma}\\right](\\x)\\,\\dd l .\n\
    \t \\label{eq:surface_div}\n\t %\n\t \\end{align}\n\
    \t\t\t\t\t%\n\t\t\t\t\tFrom this we obtain the formula\n\t\t\t\t\t%\n\t \
    \ \\begin{align}\n\t %\n\t \\int\\limits_\\Sigma \\left[\\
    nabla_\\Sigma\\cdot\\bb a\\right](\\x)\\;\\dd S\n\t\t\t\t\t\t\t= \\int\\limits_{\\
    partial\\Sigma} \\left[\\bb a\\cdot\\bs\\nu_{\\partial\\Sigma}\\right](\\x)\\
    ,\\dd l \n\t\t\t\t\t\t\t-\\int\\limits_\\Sigma\\left[ 2\\kappa_Ma_\\nu\\right](\\
    x)\\;\\dd S.\n\t \\label{eq:surface_div_2}\n\t %\n\t \
    \ \\end{align}\n\t %"
  - There exists local time of the horizontal component $x$ of the Peano curve. Moreover,
    this local time attains values no greater than $1$.
  - "[Werner-Young's inequality]\\label{Young op-op}\nSuppose $S\\in \\cS^p$ and $T\\
    in \\cS^q$ with $1+r^{-1}=p^{-1}+q^{-1}$.\nThen $S\\star T\\in L^r(\\R^{2d})$\
    \ and\n\\begin{align*}\n \\|S\\star T\\|_{L^{r}}\\leq \\|S\\|_{\\cS^p}\\|T\\
    |_{\\cS^q}.\n\\end{align*}"
- source_sentence: What is the meaning of the identity containment $1_x:x\to x$ in
    the context of the bond system?
  sentences:
  - "\\label{lem:opt_lin}\nConsider the optimization problem\n\\begin{equation}\\
    label{eq:max_tr_lem}\n\\begin{aligned}\n \\max_{\\bs{U}}&\\;\\; \\Re\\{\\mrm{tr}(\\
    bs{U}^\\mrm{H}\\bs{B}) \\}\\\\\n \\mrm{s.t. \\;\\;}& \\bs{U}\\in \\mathcal{U}(N),\n\
    \\end{aligned}\n\\end{equation}\nwhere $\\bs{B}$ may be an arbitrary $N\\times\
    \ N$ matrix with singular value decomposition (SVD) $\\bs{B}=\\bs{U}_{\\bs{B}}\\
    bs{S}_{\\bs{B}}\\bs{V}_{\\bs{B}}^\\mrm{H}$. The solution to \\eqref{eq:max_tr_lem}\
    \ is given by\n\\begin{equation}\\label{eq:sol_max}\n \\bs{U}_\\mrm{opt} =\
    \ \\bs{U}_{\\bs{B}}^\\mrm{H}\\bs{V}_{\\bs{B}}.\n\\end{equation}\n\\begin{skproof}\n\
    \ A formal proof, which may be included in the extended version, can be obtained\
    \ by defining the Riemannian gradient over the unitary group and finding the stationary\
    \ point where it vanishes. However, an intuitive argument is that the solution\
    \ to \\eqref{eq:max_tr_lem} is obtained by positively combining the singular values\
    \ of $\\bs{B}$, leading to \\eqref{eq:sol_max}.\n\\end{skproof}"
  - '\label{AM_BA_lem1}

    Let $$\Omega =\left\{a={{\left(k_1x_1+k_2,\dots,k_1x_n+k_2\right)}}\mid k_1, k_2\in
    \mathbb{R}\right\} .$$ Then ${\displaystyle\underset{a\in \Omega}{\operatorname{argmin}}
    {J_{\alpha }}(a)=\overline{a}\ },$ where $\overline{a}=\left(\overline{a}_1,\dots,\overline{a}_n\right)$,
    $$\overline{a}_i=\frac{1}{n}\sum^n_{j =1}{y_j},\quad\forall i=1,\dots,n.$$ In
    other words, on the class of lines $J_{\alpha }\left(a\right)$ reaches a minimum
    on a straight line parallel to the $Ox$ axis. So, this is the average line for
    the ordinates of all points of set $X$.'
  - "A \\emph{bond system} is a tuple $(B,C,s,t,1,\\cdot)$, where $B$ is a set of\
    \ \\emph{bonds}, $C$ is a set of \\emph{content} relations, and $s,t:C\\to B$\
    \ are \\emph{source} and \\emph{target} functions. For $c\\in C$ with $s(c)=x$\
    \ and $t(c)=y$, we write $x\\xrightarrow{c}y$ or $c:x\\to y$, indicating that\
    \ $x$ \\emph{contains} $y$. Each bond $x\\in B$ has an \\emph{identity} containment\
    \ $1_x:x\\to x$, meaning every bond trivially contains itself. For $c:x\\to y$\
    \ and $c':y\\to z$, their composition is $cc':x\\to z$. These data must satisfy:\n\
    \ \\begin{enumerate}\n \\item Identity laws: For each $c:x\\to y$, $1_x\
    \ c= c=c1_y$\n \\item Associativity: For $c:x\\to y$, $c':y\\to z$, $c'':z\\
    to w$, $c(c'c'')=(cc')c''$\n \\item Anti-symmetry: For $c:x\\to y$ and\
    \ $c':y\\to x$, $x=y$\n \\item Left cancellation: For $c,c':x\\to y$ and\
    \ $c'':y\\to z$, if $cc''=c'c''$, then $c=c'$\n \\end{enumerate}"
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: ModernBERT DAPT Embed DAPT Math
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: TESTING
      type: TESTING
    metrics:
    - type: cosine_accuracy@1
      value: 0.868020304568528
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9183202584217812
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9325103830179973
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9495846792801107
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.868020304568528
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.6118674050146131
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.49353945546838945
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.34758883248730965
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.04186710795480722
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.08315252408701693
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.1073909448198794
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.14207392775097807
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.4493273991613623
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8963655316764447
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.16376932233660765
      name: Cosine Map@100
---

# ModernBERT DAPT Embed DAPT Math

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [Master-thesis-NAP/ModernBert-DAPT-math](https://huggingface.co/Master-thesis-NAP/ModernBert-DAPT-math). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [Master-thesis-NAP/ModernBert-DAPT-math](https://huggingface.co/Master-thesis-NAP/ModernBert-DAPT-math) <!-- at revision a30384f91d764c272e6b740c256d5581325ea4bb -->
- **Maximum Sequence Length:** 8192 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Master-thesis-NAP/ModernBERT-DAPT-Embed-DAPT-Math-v2")
# Run inference
sentences = [
    'What is the meaning of the identity containment $1_x:x\\to x$ in the context of the bond system?',
    "A \\emph{bond system} is a tuple $(B,C,s,t,1,\\cdot)$, where $B$ is a set of \\emph{bonds}, $C$ is a set of \\emph{content} relations, and $s,t:C\\to B$ are \\emph{source} and \\emph{target} functions. For $c\\in C$ with $s(c)=x$ and $t(c)=y$, we write $x\\xrightarrow{c}y$ or $c:x\\to y$, indicating that $x$ \\emph{contains} $y$. Each bond $x\\in B$ has an \\emph{identity} containment $1_x:x\\to x$, meaning every bond trivially contains itself. For $c:x\\to y$ and $c':y\\to z$, their composition is $cc':x\\to z$. These data must satisfy:\n \\begin{enumerate}\n \\item Identity laws: For each $c:x\\to y$, $1_x c= c=c1_y$\n \\item Associativity: For $c:x\\to y$, $c':y\\to z$, $c'':z\\to w$, $c(c'c'')=(cc')c''$\n \\item Anti-symmetry: For $c:x\\to y$ and $c':y\\to x$, $x=y$\n \\item Left cancellation: For $c,c':x\\to y$ and $c'':y\\to z$, if $cc''=c'c''$, then $c=c'$\n \\end{enumerate}",
    '\\label{lem:opt_lin}\nConsider the optimization problem\n\\begin{equation}\\label{eq:max_tr_lem}\n\\begin{aligned}\n \\max_{\\bs{U}}&\\;\\; \\Re\\{\\mrm{tr}(\\bs{U}^\\mrm{H}\\bs{B}) \\}\\\\\n \\mrm{s.t. \\;\\;}& \\bs{U}\\in \\mathcal{U}(N),\n\\end{aligned}\n\\end{equation}\nwhere $\\bs{B}$ may be an arbitrary $N\\times N$ matrix with singular value decomposition (SVD) $\\bs{B}=\\bs{U}_{\\bs{B}}\\bs{S}_{\\bs{B}}\\bs{V}_{\\bs{B}}^\\mrm{H}$. The solution to \\eqref{eq:max_tr_lem} is given by\n\\begin{equation}\\label{eq:sol_max}\n \\bs{U}_\\mrm{opt} = \\bs{U}_{\\bs{B}}^\\mrm{H}\\bs{V}_{\\bs{B}}.\n\\end{equation}\n\\begin{skproof}\n A formal proof, which may be included in the extended version, can be obtained by defining the Riemannian gradient over the unitary group and finding the stationary point where it vanishes. However, an intuitive argument is that the solution to \\eqref{eq:max_tr_lem} is obtained by positively combining the singular values of $\\bs{B}$, leading to \\eqref{eq:sol_max}.\n\\end{skproof}',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
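Because the model ends with a `Normalize()` module, the embeddings it returns are unit-length, so the default cosine similarity coincides with a plain dot product. A small numpy illustration with hypothetical vectors (not produced by this model):

```python
import numpy as np

# Hypothetical raw (unnormalized) embeddings.
raw = np.array([[3.0, 4.0],
                [1.0, 0.0]])

# L2-normalize each row, as the Normalize() module does.
unit = raw / np.linalg.norm(raw, axis=1, keepdims=True)

# For unit vectors, dot product equals cosine similarity.
dot = unit @ unit.T
norms = np.linalg.norm(raw, axis=1)
cos = (raw @ raw.T) / (norms[:, None] * norms[None, :])
print(np.allclose(dot, cos))  # True
```

This is why dot-product indexes (e.g. in vector databases) can be used interchangeably with cosine similarity for this model's embeddings.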

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Information Retrieval

* Dataset: `TESTING`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.868      |
| cosine_accuracy@3   | 0.9183     |
| cosine_accuracy@5   | 0.9325     |
| cosine_accuracy@10  | 0.9496     |
| cosine_precision@1  | 0.868      |
| cosine_precision@3  | 0.6119     |
| cosine_precision@5  | 0.4935     |
| cosine_precision@10 | 0.3476     |
| cosine_recall@1     | 0.0419     |
| cosine_recall@3     | 0.0832     |
| cosine_recall@5     | 0.1074     |
| cosine_recall@10    | 0.1421     |
| **cosine_ndcg@10**  | **0.4493** |
| cosine_mrr@10       | 0.8964     |
| cosine_map@100      | 0.1638     |

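The gap between accuracy@1 (0.868) and recall@1 (0.042) is expected when each query has many relevant passages: accuracy@k checks whether *any* relevant document appears in the top-k, while recall@k divides the hits by the *total* number of relevant documents. A minimal sketch of the two definitions on toy data (document IDs are hypothetical):

```python
def accuracy_at_k(ranked, relevant, k):
    # 1 if at least one relevant document occurs in the top-k, else 0.
    return int(any(doc in relevant for doc in ranked[:k]))

def recall_at_k(ranked, relevant, k):
    # Fraction of all relevant documents retrieved in the top-k.
    hits = sum(doc in relevant for doc in ranked[:k])
    return hits / len(relevant)

# Toy query with 20 relevant docs; the top-ranked result is relevant.
ranked = ["d0", "d1", "d2", "d3"]
relevant = {"d0"} | {f"r{i}" for i in range(19)}

print(accuracy_at_k(ranked, relevant, 1))  # 1
print(recall_at_k(ranked, relevant, 1))    # 0.05
```

With ~20 relevant passages per query, a perfect top-1 hit still yields recall@1 of only 0.05, mirroring the table above.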

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 79,876 training samples
* Columns: <code>anchor</code> and <code>positive</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor                                                                              | positive                                                                             |
  |:--------|:-----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
  | type    | string                                                                              | string                                                                               |
  | details | <ul><li>min: 9 tokens</li><li>mean: 38.48 tokens</li><li>max: 142 tokens</li></ul>  | <ul><li>min: 5 tokens</li><li>mean: 210.43 tokens</li><li>max: 924 tokens</li></ul>  |
* Samples:
  | anchor | positive |
  |:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
  | <code>What is the limit of the proportion of 1's in the sequence $a_n$ as $n$ approaches infinity, given that $0 \leq 3g_n -2n \leq 4$?</code> | <code>Let $g_n$ be the number of $1$'s in the sequence $a_1 a_2 \cdots a_n$.<br>Then <br>\begin{equation}<br>0 \leq 3g_n -2n \leq 4<br>\label{star}<br>\end{equation}<br>for all $n$, and hence<br>$\lim_{n \rightarrow \infty} g_n/n = 2/3$.<br>\label{thm1}</code> |
  | <code>Does the statement of \textbf{ThmConjAreTrue} imply that the maximum genus of a locally Cohen-Macaulay curve in $\mathbb{P}^3_{\mathbb{C}}$ of degree $d$ that does not lie on a surface of degree $s-1$ is always equal to $g(d,s)$?</code> | <code>\label{ThmConjAreTrue}<br>Conjectures \ref{Conj1} and \ref{Conj2} are true.<br>As a consequence, <br>if either $d=s \geq 1$ or $d \geq 2s+1 \geq 3$, <br>the maximum genus of a locally Cohen-Macaulay curve in $\mathbb{P}^3_{\mathbb{C}}$ of degree $d$ that does not lie on a surface of degree $s-1$ is equal to $g(d,s)$.</code> |
  | <code>\\emph{Is the statement \emph{If $X$ is a compact Hausdorff space, then $X$ is normal}, proven in the first isomorphism theorem for topological groups, or is it a well-known result in topology?}</code> | <code>}<br>\newcommand{\ep}{</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
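MultipleNegativesRankingLoss treats each (anchor, positive) pair in a batch as a classification problem: anchor i must assign the highest score to its own positive, with the positives of all other pairs acting as in-batch negatives, via cross-entropy over scaled cosine similarities. A minimal numpy sketch of that computation (toy orthogonal embeddings, not real model outputs):

```python
import numpy as np

def mnrl(anchors, positives, scale=20.0):
    # Cosine similarity matrix between all anchors and all positives.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)                     # (batch, batch)
    # Row i is a softmax classification whose correct "class" is column i.
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

anchors = np.eye(4)                                 # 4 orthogonal toy embeddings
print(mnrl(anchors, anchors) < 0.01)                # matched pairs: near-zero loss
print(mnrl(anchors, np.roll(anchors, 1, axis=0)) > 1.0)  # mismatched pairs: high loss
```

The `scale: 20.0` parameter sharpens the softmax; larger batches supply more in-batch negatives, which is why the effective batch size matters for this loss.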

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 8
- `learning_rate`: 2e-05
- `num_train_epochs`: 4
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `bf16`: True
- `tf32`: True
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates

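The hyperparameters above determine the optimizer step count: with a per-device batch of 16 and 8 gradient-accumulation steps, the effective batch size is 128 (assuming a single device), so one epoch over the 79,876 samples takes 625 steps, which matches the per-epoch evaluation points in the training logs:

```python
import math

samples = 79_876
per_device_batch = 16
grad_accum = 8

effective_batch = per_device_batch * grad_accum        # assumes 1 GPU
steps_per_epoch = math.ceil(samples / effective_batch)
print(effective_batch, steps_per_epoch)  # 128 625
```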
#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 8
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 4
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: True
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `tp_size`: 0
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
<details><summary>Click to expand</summary>

| Epoch  | Step | Training Loss | TESTING_cosine_ndcg@10 |
|:------:|:----:|:-------------:|:----------------------:|
| 0.0160 | 10   | 20.2777       | -                      |
| 0.0320 | 20   | 19.6613       | -                      |
| 0.0481 | 30   | 18.8588       | -                      |
| 0.0641 | 40   | 17.5525       | -                      |
| 0.0801 | 50   | 15.1065       | -                      |
| 0.0961 | 60   | 10.8128       | -                      |
| 0.1122 | 70   | 7.0698        | -                      |
| 0.1282 | 80   | 4.532         | -                      |
| 0.1442 | 90   | 3.5143        | -                      |
| 0.1602 | 100  | 2.3256        | -                      |
| 0.1762 | 110  | 1.4688        | -                      |
| 0.1923 | 120  | 1.0081        | -                      |
| 0.2083 | 130  | 0.949         | -                      |
| 0.2243 | 140  | 0.9709        | -                      |
| 0.2403 | 150  | 0.8403        | -                      |
| 0.2564 | 160  | 0.8749        | -                      |
| 0.2724 | 170  | 0.7955        | -                      |
| 0.2884 | 180  | 0.6587        | -                      |
| 0.3044 | 190  | 0.5832        | -                      |
| 0.3204 | 200  | 0.5376        | -                      |
| 0.3365 | 210  | 0.608         | -                      |
| 0.3525 | 220  | 0.4639        | -                      |
| 0.3685 | 230  | 0.6611        | -                      |
| 0.3845 | 240  | 0.5589        | -                      |
| 0.4006 | 250  | 0.5845        | -                      |
| 0.4166 | 260  | 0.4392        | -                      |
| 0.4326 | 270  | 0.4746        | -                      |
| 0.4486 | 280  | 0.4517        | -                      |
| 0.4647 | 290  | 0.4034        | -                      |
| 0.4807 | 300  | 0.4437        | -                      |
| 0.4967 | 310  | 0.4339        | -                      |
| 0.5127 | 320  | 0.4445        | -                      |
| 0.5287 | 330  | 0.3793        | -                      |
| 0.5448 | 340  | 0.3591        | -                      |
| 0.5608 | 350  | 0.4694        | -                      |
| 0.5768 | 360  | 0.4668        | -                      |
| 0.5928 | 370  | 0.4121        | -                      |
| 0.6089 | 380  | 0.4688        | -                      |
| 0.6249 | 390  | 0.387         | -                      |
| 0.6409 | 400  | 0.3748        | -                      |
| 0.6569 | 410  | 0.2997        | -                      |
| 0.6729 | 420  | 0.3756        | -                      |
| 0.6890 | 430  | 0.2993        | -                      |
| 0.7050 | 440  | 0.3514        | -                      |
| 0.7210 | 450  | 0.3646        | -                      |
| 0.7370 | 460  | 0.308         | -                      |
| 0.7531 | 470  | 0.3612        | -                      |
| 0.7691 | 480  | 0.2845        | -                      |
| 0.7851 | 490  | 0.2792        | -                      |
| 0.8011 | 500  | 0.2204        | -                      |
| 0.8171 | 510  | 0.2757        | -                      |
| 0.8332 | 520  | 0.2674        | -                      |
| 0.8492 | 530  | 0.3753        | -                      |
| 0.8652 | 540  | 0.3546        | -                      |
| 0.8812 | 550  | 0.3166        | -                      |
| 0.8973 | 560  | 0.2656        | -                      |
| 0.9133 | 570  | 0.3215        | -                      |
| 0.9293 | 580  | 0.2559        | -                      |
| 0.9453 | 590  | 0.4629        | -                      |
| 0.9613 | 600  | 0.31          | -                      |
| 0.9774 | 610  | 0.3601        | -                      |
| 0.9934 | 620  | 0.2391        | -                      |
| 1.0    | 625  | -             | 0.4229                 |
| 1.0080 | 630  | 0.2507        | -                      |
| 1.0240 | 640  | 0.1852        | -                      |
| 1.0401 | 650  | 0.1836        | -                      |
| 1.0561 | 660  | 0.1487        | -                      |
| 1.0721 | 670  | 0.1495        | -                      |
| 1.0881 | 680  | 0.1567        | -                      |
| 1.1041 | 690  | 0.1497        | -                      |
| 1.1202 | 700  | 0.1632        | -                      |
| 1.1362 | 710  | 0.1997        | -                      |
| 1.1522 | 720  | 0.182         | -                      |
| 1.1682 | 730  | 0.1884        | -                      |
| 1.1843 | 740  | 0.1766        | -                      |
| 1.2003 | 750  | 0.1477        | -                      |
| 1.2163 | 760  | 0.181         | -                      |
| 1.2323 | 770  | 0.092         | -                      |
| 1.2483 | 780  | 0.1506        | -                      |
| 1.2644 | 790  | 0.1305        | -                      |
| 1.2804 | 800  | 0.1533        | -                      |
| 1.2964 | 810  | 0.2306        | -                      |
| 1.3124 | 820  | 0.1861        | -                      |
| 1.3285 | 830  | 0.1157        | -                      |
| 1.3445 | 840  | 0.1054        | -                      |
| 1.3605 | 850  | 0.1696        | -                      |
| 1.3765 | 860  | 0.1327        | -                      |
| 1.3925 | 870  | 0.1485        | -                      |
| 1.4086 | 880  | 0.1395        | -                      |
| 1.4246 | 890  | 0.1021        | -                      |
| 1.4406 | 900  | 0.1283        | -                      |
| 1.4566 | 910  | 0.102         | -                      |
| 1.4727 | 920  | 0.1825        | -                      |
| 1.4887 | 930  | 0.1395        | -                      |
| 1.5047 | 940  | 0.157         | -                      |
| 1.5207 | 950  | 0.1444        | -                      |
| 1.5368 | 960  | 0.1317        | -                      |
| 1.5528 | 970  | 0.146         | -                      |
| 1.5688 | 980  | 0.1809        | -                      |
| 1.5848 | 990  | 0.1368        | -                      |
| 1.6008 | 1000 | 0.2036        | -                      |
| 1.6169 | 1010 | 0.1292        | -                      |
| 1.6329 | 1020 | 0.1306        | -                      |
| 1.6489 | 1030 | 0.1473        | -                      |
| 1.6649 | 1040 | 0.1595        | -                      |
| 1.6810 | 1050 | 0.1471        | -                      |
| 1.6970 | 1060 | 0.1869        | -                      |
| 1.7130 | 1070 | 0.1445        | -                      |
| 1.7290 | 1080 | 0.157         | -                      |
| 1.7450 | 1090 | 0.1382        | -                      |
| 1.7611 | 1100 | 0.157         | -                      |
| 1.7771 | 1110 | 0.1073        | -                      |
| 1.7931 | 1120 | 0.0864        | -                      |
| 1.8091 | 1130 | 0.1312        | -                      |
| 1.8252 | 1140 | 0.1644        | -                      |
| 1.8412 | 1150 | 0.1366        | -                      |
| 1.8572 | 1160 | 0.1257        | -                      |
| 1.8732 | 1170 | 0.127         | -                      |
| 1.8892 | 1180 | 0.1494        | -                      |
| 1.9053 | 1190 | 0.1516        | -                      |
| 1.9213 | 1200 | 0.1709        | -                      |
| 1.9373 | 1210 | 0.1717        | -                      |
| 1.9533 | 1220 | 0.1044        | -                      |
| 1.9694 | 1230 | 0.1551        | -                      |
| 1.9854 | 1240 | 0.1303        | -                      |
| 2.0    | 1250 | 0.1081        | 0.4392                 |
| 2.0160 | 1260 | 0.0572        | -                      |
| 2.0320 | 1270 | 0.0504        | -                      |
| 2.0481 | 1280 | 0.0535        | -                      |
| 2.0641 | 1290 | 0.0512        | -                      |
| 2.0801 | 1300 | 0.0539        | -                      |
| 2.0961 | 1310 | 0.0462        | -                      |
| 2.1122 | 1320 | 0.0611        | -                      |
| 2.1282 | 1330 | 0.0989        | -                      |
| 2.1442 | 1340 | 0.0462        | -                      |
| 2.1602 | 1350 | 0.061         | -                      |
| 2.1762 | 1360 | 0.0557        | -                      |
| 2.1923 | 1370 | 0.0622        | -                      |
| 2.2083 | 1380 | 0.0744        | -                      |
| 2.2243 | 1390 | 0.0531        | -                      |
| 2.2403 | 1400 | 0.0507        | -                      |
| 2.2564 | 1410 | 0.0533        | -                      |
| 2.2724 | 1420 | 0.0676        | -                      |
| 2.2884 | 1430 | 0.0706        | -                      |
| 2.3044 | 1440 | 0.0452        | -                      |
| 2.3204 | 1450 | 0.0415        | -                      |
| 2.3365 | 1460 | 0.0562        | -                      |
| 2.3525 | 1470 | 0.0487        | -                      |
662
+ | 2.3685 | 1480 | 0.0614 | - |
663
+ | 2.3845 | 1490 | 0.045 | - |
664
+ | 2.4006 | 1500 | 0.0529 | - |
665
+ | 2.4166 | 1510 | 0.048 | - |
666
+ | 2.4326 | 1520 | 0.059 | - |
667
+ | 2.4486 | 1530 | 0.0593 | - |
668
+ | 2.4647 | 1540 | 0.0631 | - |
669
+ | 2.4807 | 1550 | 0.0506 | - |
670
+ | 2.4967 | 1560 | 0.058 | - |
671
+ | 2.5127 | 1570 | 0.0896 | - |
672
+ | 2.5287 | 1580 | 0.0522 | - |
673
+ | 2.5448 | 1590 | 0.035 | - |
674
+ | 2.5608 | 1600 | 0.0677 | - |
675
+ | 2.5768 | 1610 | 0.0538 | - |
676
+ | 2.5928 | 1620 | 0.0485 | - |
677
+ | 2.6089 | 1630 | 0.0575 | - |
678
+ | 2.6249 | 1640 | 0.0571 | - |
679
+ | 2.6409 | 1650 | 0.0761 | - |
680
+ | 2.6569 | 1660 | 0.0582 | - |
681
+ | 2.6729 | 1670 | 0.0366 | - |
682
+ | 2.6890 | 1680 | 0.0445 | - |
683
+ | 2.7050 | 1690 | 0.0519 | - |
684
+ | 2.7210 | 1700 | 0.0506 | - |
685
+ | 2.7370 | 1710 | 0.0637 | - |
686
+ | 2.7531 | 1720 | 0.0618 | - |
687
+ | 2.7691 | 1730 | 0.0433 | - |
688
+ | 2.7851 | 1740 | 0.0503 | - |
689
+ | 2.8011 | 1750 | 0.0541 | - |
690
+ | 2.8171 | 1760 | 0.0443 | - |
691
+ | 2.8332 | 1770 | 0.0634 | - |
692
+ | 2.8492 | 1780 | 0.0586 | - |
693
+ | 2.8652 | 1790 | 0.0497 | - |
694
+ | 2.8812 | 1800 | 0.0444 | - |
695
+ | 2.8973 | 1810 | 0.0397 | - |
696
+ | 2.9133 | 1820 | 0.0483 | - |
697
+ | 2.9293 | 1830 | 0.0441 | - |
698
+ | 2.9453 | 1840 | 0.0758 | - |
699
+ | 2.9613 | 1850 | 0.0988 | - |
700
+ | 2.9774 | 1860 | 0.0566 | - |
701
+ | 2.9934 | 1870 | 0.0497 | - |
702
+ | 3.0 | 1875 | - | 0.4466 |
703
+ | 3.0080 | 1880 | 0.0388 | - |
704
+ | 3.0240 | 1890 | 0.0278 | - |
705
+ | 3.0401 | 1900 | 0.0231 | - |
706
+ | 3.0561 | 1910 | 0.0482 | - |
707
+ | 3.0721 | 1920 | 0.0416 | - |
708
+ | 3.0881 | 1930 | 0.052 | - |
709
+ | 3.1041 | 1940 | 0.0403 | - |
710
+ | 3.1202 | 1950 | 0.0384 | - |
711
+ | 3.1362 | 1960 | 0.0288 | - |
712
+ | 3.1522 | 1970 | 0.0368 | - |
713
+ | 3.1682 | 1980 | 0.0301 | - |
714
+ | 3.1843 | 1990 | 0.029 | - |
715
+ | 3.2003 | 2000 | 0.0332 | - |
716
+ | 3.2163 | 2010 | 0.0307 | - |
717
+ | 3.2323 | 2020 | 0.0502 | - |
718
+ | 3.2483 | 2030 | 0.0474 | - |
719
+ | 3.2644 | 2040 | 0.0383 | - |
720
+ | 3.2804 | 2050 | 0.0392 | - |
721
+ | 3.2964 | 2060 | 0.0308 | - |
722
+ | 3.3124 | 2070 | 0.0479 | - |
723
+ | 3.3285 | 2080 | 0.0448 | - |
724
+ | 3.3445 | 2090 | 0.0478 | - |
725
+ | 3.3605 | 2100 | 0.0249 | - |
726
+ | 3.3765 | 2110 | 0.03 | - |
727
+ | 3.3925 | 2120 | 0.0284 | - |
728
+ | 3.4086 | 2130 | 0.0323 | - |
729
+ | 3.4246 | 2140 | 0.0379 | - |
730
+ | 3.4406 | 2150 | 0.0221 | - |
731
+ | 3.4566 | 2160 | 0.0354 | - |
732
+ | 3.4727 | 2170 | 0.0332 | - |
733
+ | 3.4887 | 2180 | 0.0287 | - |
734
+ | 3.5047 | 2190 | 0.0382 | - |
735
+ | 3.5207 | 2200 | 0.0342 | - |
736
+ | 3.5368 | 2210 | 0.0381 | - |
737
+ | 3.5528 | 2220 | 0.056 | - |
738
+ | 3.5688 | 2230 | 0.0426 | - |
739
+ | 3.5848 | 2240 | 0.0465 | - |
740
+ | 3.6008 | 2250 | 0.0372 | - |
741
+ | 3.6169 | 2260 | 0.0345 | - |
742
+ | 3.6329 | 2270 | 0.0459 | - |
743
+ | 3.6489 | 2280 | 0.0368 | - |
744
+ | 3.6649 | 2290 | 0.0349 | - |
745
+ | 3.6810 | 2300 | 0.059 | - |
746
+ | 3.6970 | 2310 | 0.0275 | - |
747
+ | 3.7130 | 2320 | 0.0305 | - |
748
+ | 3.7290 | 2330 | 0.0406 | - |
749
+ | 3.7450 | 2340 | 0.0456 | - |
750
+ | 3.7611 | 2350 | 0.0311 | - |
751
+ | 3.7771 | 2360 | 0.0428 | - |
752
+ | 3.7931 | 2370 | 0.0308 | - |
753
+ | 3.8091 | 2380 | 0.0345 | - |
754
+ | 3.8252 | 2390 | 0.0378 | - |
755
+ | 3.8412 | 2400 | 0.0322 | - |
756
+ | 3.8572 | 2410 | 0.0236 | - |
757
+ | 3.8732 | 2420 | 0.0383 | - |
758
+ | 3.8892 | 2430 | 0.0295 | - |
759
+ | 3.9053 | 2440 | 0.0273 | - |
760
+ | 3.9213 | 2450 | 0.0286 | - |
761
+ | 3.9373 | 2460 | 0.0366 | - |
762
+ | 3.9533 | 2470 | 0.0285 | - |
763
+ | 3.9694 | 2480 | 0.0335 | - |
764
+ | 3.9854 | 2490 | 0.0278 | - |
765
+ | **3.995** | **2496** | **-** | **0.4493** |
766
+
+ * The bold row denotes the saved checkpoint.
+ </details>
+
+ ### Framework Versions
+ - Python: 3.11.12
+ - Sentence Transformers: 4.1.0
+ - Transformers: 4.51.3
+ - PyTorch: 2.6.0+cu124
+ - Accelerate: 1.6.0
+ - Datasets: 2.14.4
+ - Tokenizers: 0.21.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+ author = "Reimers, Nils and Gurevych, Iryna",
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+ month = "11",
+ year = "2019",
+ publisher = "Association for Computational Linguistics",
+ url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+ year={2017},
+ eprint={1705.00652},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,45 @@
+ {
+ "architectures": [
+ "ModernBertModel"
+ ],
+ "attention_bias": false,
+ "attention_dropout": 0.0,
+ "bos_token_id": 50281,
+ "classifier_activation": "gelu",
+ "classifier_bias": false,
+ "classifier_dropout": 0.0,
+ "classifier_pooling": "mean",
+ "cls_token_id": 50281,
+ "decoder_bias": true,
+ "deterministic_flash_attn": false,
+ "embedding_dropout": 0.0,
+ "eos_token_id": 50282,
+ "global_attn_every_n_layers": 3,
+ "global_rope_theta": 160000.0,
+ "gradient_checkpointing": false,
+ "hidden_activation": "gelu",
+ "hidden_size": 768,
+ "initializer_cutoff_factor": 2.0,
+ "initializer_range": 0.02,
+ "intermediate_size": 1152,
+ "layer_norm_eps": 1e-05,
+ "local_attention": 128,
+ "local_rope_theta": 10000.0,
+ "max_position_embeddings": 8192,
+ "mlp_bias": false,
+ "mlp_dropout": 0.0,
+ "model_type": "modernbert",
+ "norm_bias": false,
+ "norm_eps": 1e-05,
+ "num_attention_heads": 12,
+ "num_hidden_layers": 22,
+ "pad_token_id": 50283,
+ "position_embedding_type": "absolute",
+ "repad_logits_with_grad": false,
+ "sep_token_id": 50282,
+ "sparse_pred_ignore_index": -100,
+ "sparse_prediction": false,
+ "torch_dtype": "float32",
+ "transformers_version": "4.51.3",
+ "vocab_size": 50368
+ }
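In this config, `global_attn_every_n_layers: 3` and `local_attention: 128` describe ModernBERT's alternating attention pattern: as I understand the reference implementation, a layer attends globally when its index is a multiple of `global_attn_every_n_layers`, and all other layers use a 128-token sliding window. A minimal sketch of that schedule (the helper name `attention_kind` is my own):

```python
# Sketch (assumption about the ModernBERT layer schedule): with
# num_hidden_layers = 22 and global_attn_every_n_layers = 3, layers
# 0, 3, 6, ... attend globally; the rest use local (128-token) attention.
NUM_HIDDEN_LAYERS = 22
GLOBAL_EVERY = 3

def attention_kind(layer_id: int) -> str:
    # Hypothetical helper mirroring the pattern described above.
    return "global" if layer_id % GLOBAL_EVERY == 0 else "local"

schedule = [attention_kind(i) for i in range(NUM_HIDDEN_LAYERS)]
print(schedule.count("global"), "global /", schedule.count("local"), "local")
# prints: 8 global / 14 local
```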
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "__version__": {
+ "sentence_transformers": "4.1.0",
+ "transformers": "4.51.3",
+ "pytorch": "2.6.0+cu124"
+ },
+ "prompts": {},
+ "default_prompt_name": null,
+ "similarity_fn_name": "cosine"
+ }
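`"similarity_fn_name": "cosine"` means pairs of embeddings are compared by cosine similarity: the dot product divided by the product of the L2 norms. A dependency-free sketch (toy vectors of my own choosing):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product over the product of the L2 norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 0.0
```

Note that because this model ends in a Normalize module, its embeddings already have unit norm, so cosine similarity reduces to a plain dot product.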
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:523825a1c7be5af4e187a81b1449a03dc3005b69580a4a09f0c19efe484b6a66
+ size 596070136
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+ {
+ "idx": 0,
+ "name": "0",
+ "path": "",
+ "type": "sentence_transformers.models.Transformer"
+ },
+ {
+ "idx": 1,
+ "name": "1",
+ "path": "1_Pooling",
+ "type": "sentence_transformers.models.Pooling"
+ },
+ {
+ "idx": 2,
+ "name": "2",
+ "path": "2_Normalize",
+ "type": "sentence_transformers.models.Normalize"
+ }
+ ]
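These three modules form the inference pipeline: the Transformer produces per-token embeddings, the Pooling module mean-pools them (per `pooling_mode_mean_tokens: true` in `1_Pooling/config.json`), and Normalize scales the result to unit L2 norm. A minimal sketch of the last two steps, using toy 2-dimensional token embeddings of my own invention:

```python
import math

def mean_pool(token_embeddings, attention_mask):
    # Pooling step: average token embeddings, counting only non-padding tokens.
    dim = len(token_embeddings[0])
    n = sum(attention_mask)
    return [
        sum(tok[d] for tok, m in zip(token_embeddings, attention_mask) if m) / n
        for d in range(dim)
    ]

def l2_normalize(vec):
    # Normalize step: scale the pooled vector to unit L2 norm.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Toy input: three tokens, the last one padding (mask = 0).
tokens = [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)            # mean of first two tokens: [2.0, 2.0]
sentence_embedding = l2_normalize(pooled)   # unit-length sentence embedding
```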
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+ "max_seq_length": 8192,
+ "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+ "cls_token": {
+ "content": "[CLS]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "mask_token": {
+ "content": "[MASK]",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "[PAD]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "sep_token": {
+ "content": "[SEP]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "unk_token": {
+ "content": "[UNK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,952 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "|||IP_ADDRESS|||",
5
+ "lstrip": false,
6
+ "normalized": true,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": false
10
+ },
11
+ "1": {
12
+ "content": "<|padding|>",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "50254": {
20
+ "content": " ",
21
+ "lstrip": false,
22
+ "normalized": true,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": false
26
+ },
27
+ "50255": {
28
+ "content": " ",
29
+ "lstrip": false,
30
+ "normalized": true,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": false
34
+ },
35
+ "50256": {
36
+ "content": " ",
37
+ "lstrip": false,
38
+ "normalized": true,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": false
42
+ },
43
+ "50257": {
44
+ "content": " ",
45
+ "lstrip": false,
46
+ "normalized": true,
47
+ "rstrip": false,
48
+ "single_word": false,
49
+ "special": false
50
+ },
51
+ "50258": {
52
+ "content": " ",
53
+ "lstrip": false,
54
+ "normalized": true,
55
+ "rstrip": false,
56
+ "single_word": false,
57
+ "special": false
58
+ },
59
+ "50259": {
60
+ "content": " ",
61
+ "lstrip": false,
62
+ "normalized": true,
63
+ "rstrip": false,
64
+ "single_word": false,
65
+ "special": false
66
+ },
67
+ "50260": {
68
+ "content": " ",
69
+ "lstrip": false,
70
+ "normalized": true,
71
+ "rstrip": false,
72
+ "single_word": false,
73
+ "special": false
74
+ },
75
+ "50261": {
76
+ "content": " ",
77
+ "lstrip": false,
78
+ "normalized": true,
79
+ "rstrip": false,
80
+ "single_word": false,
81
+ "special": false
82
+ },
83
+ "50262": {
84
+ "content": " ",
85
+ "lstrip": false,
86
+ "normalized": true,
87
+ "rstrip": false,
88
+ "single_word": false,
89
+ "special": false
90
+ },
91
+ "50263": {
92
+ "content": " ",
93
+ "lstrip": false,
94
+ "normalized": true,
95
+ "rstrip": false,
96
+ "single_word": false,
97
+ "special": false
98
+ },
99
+ "50264": {
100
+ "content": " ",
101
+ "lstrip": false,
102
+ "normalized": true,
103
+ "rstrip": false,
104
+ "single_word": false,
105
+ "special": false
106
+ },
107
+ "50265": {
108
+ "content": " ",
109
+ "lstrip": false,
110
+ "normalized": true,
111
+ "rstrip": false,
112
+ "single_word": false,
113
+ "special": false
114
+ },
115
+ "50266": {
116
+ "content": " ",
117
+ "lstrip": false,
118
+ "normalized": true,
119
+ "rstrip": false,
120
+ "single_word": false,
121
+ "special": false
122
+ },
123
+ "50267": {
124
+ "content": " ",
125
+ "lstrip": false,
126
+ "normalized": true,
127
+ "rstrip": false,
128
+ "single_word": false,
129
+ "special": false
130
+ },
131
+ "50268": {
132
+ "content": " ",
133
+ "lstrip": false,
134
+ "normalized": true,
135
+ "rstrip": false,
136
+ "single_word": false,
137
+ "special": false
138
+ },
139
+ "50269": {
140
+ "content": " ",
141
+ "lstrip": false,
142
+ "normalized": true,
143
+ "rstrip": false,
144
+ "single_word": false,
145
+ "special": false
146
+ },
147
+ "50270": {
148
+ "content": " ",
149
+ "lstrip": false,
150
+ "normalized": true,
151
+ "rstrip": false,
152
+ "single_word": false,
153
+ "special": false
154
+ },
155
+ "50271": {
156
+ "content": " ",
157
+ "lstrip": false,
158
+ "normalized": true,
159
+ "rstrip": false,
160
+ "single_word": false,
161
+ "special": false
162
+ },
163
+ "50272": {
164
+ "content": " ",
165
+ "lstrip": false,
166
+ "normalized": true,
167
+ "rstrip": false,
168
+ "single_word": false,
169
+ "special": false
170
+ },
171
+ "50273": {
172
+ "content": " ",
173
+ "lstrip": false,
174
+ "normalized": true,
175
+ "rstrip": false,
176
+ "single_word": false,
177
+ "special": false
178
+ },
179
+ "50274": {
180
+ "content": " ",
181
+ "lstrip": false,
182
+ "normalized": true,
183
+ "rstrip": false,
184
+ "single_word": false,
185
+ "special": false
186
+ },
187
+ "50275": {
188
+ "content": " ",
189
+ "lstrip": false,
190
+ "normalized": true,
191
+ "rstrip": false,
192
+ "single_word": false,
193
+ "special": false
194
+ },
195
+ "50276": {
196
+ "content": " ",
197
+ "lstrip": false,
198
+ "normalized": true,
199
+ "rstrip": false,
200
+ "single_word": false,
201
+ "special": false
202
+ },
203
+ "50277": {
204
+ "content": "|||EMAIL_ADDRESS|||",
205
+ "lstrip": false,
206
+ "normalized": true,
207
+ "rstrip": false,
208
+ "single_word": false,
209
+ "special": false
210
+ },
211
+ "50278": {
212
+ "content": "|||PHONE_NUMBER|||",
213
+ "lstrip": false,
214
+ "normalized": true,
215
+ "rstrip": false,
216
+ "single_word": false,
217
+ "special": false
218
+ },
219
+ "50279": {
220
+ "content": "<|endoftext|>",
221
+ "lstrip": false,
222
+ "normalized": false,
223
+ "rstrip": false,
224
+ "single_word": false,
225
+ "special": true
226
+ },
227
+ "50280": {
228
+ "content": "[UNK]",
229
+ "lstrip": false,
230
+ "normalized": false,
231
+ "rstrip": false,
232
+ "single_word": false,
233
+ "special": true
234
+ },
235
+ "50281": {
236
+ "content": "[CLS]",
237
+ "lstrip": false,
238
+ "normalized": false,
239
+ "rstrip": false,
240
+ "single_word": false,
241
+ "special": true
242
+ },
243
+ "50282": {
244
+ "content": "[SEP]",
245
+ "lstrip": false,
246
+ "normalized": false,
247
+ "rstrip": false,
248
+ "single_word": false,
249
+ "special": true
250
+ },
251
+ "50283": {
252
+ "content": "[PAD]",
253
+ "lstrip": false,
254
+ "normalized": false,
255
+ "rstrip": false,
256
+ "single_word": false,
257
+ "special": true
258
+ },
259
+ "50284": {
260
+ "content": "[MASK]",
261
+ "lstrip": true,
262
+ "normalized": false,
263
+ "rstrip": false,
264
+ "single_word": false,
265
+ "special": true
266
+ },
267
+ "50285": {
268
+ "content": "[unused0]",
269
+ "lstrip": false,
270
+ "normalized": true,
271
+ "rstrip": false,
272
+ "single_word": false,
273
+ "special": false
274
+ },
275
+ "50286": {
276
+ "content": "[unused1]",
277
+ "lstrip": false,
278
+ "normalized": true,
279
+ "rstrip": false,
280
+ "single_word": false,
281
+ "special": false
282
+ },
283
+ "50287": {
284
+ "content": "[unused2]",
285
+ "lstrip": false,
286
+ "normalized": true,
287
+ "rstrip": false,
288
+ "single_word": false,
289
+ "special": false
290
+ },
291
+ "50288": {
292
+ "content": "[unused3]",
293
+ "lstrip": false,
294
+ "normalized": true,
295
+ "rstrip": false,
296
+ "single_word": false,
297
+ "special": false
298
+ },
299
+ "50289": {
300
+ "content": "[unused4]",
301
+ "lstrip": false,
302
+ "normalized": true,
303
+ "rstrip": false,
304
+ "single_word": false,
305
+ "special": false
306
+ },
307
+ "50290": {
308
+ "content": "[unused5]",
309
+ "lstrip": false,
310
+ "normalized": true,
311
+ "rstrip": false,
312
+ "single_word": false,
313
+ "special": false
314
+ },
315
+ "50291": {
316
+ "content": "[unused6]",
317
+ "lstrip": false,
318
+ "normalized": true,
319
+ "rstrip": false,
320
+ "single_word": false,
321
+ "special": false
322
+ },
323
+ "50292": {
324
+ "content": "[unused7]",
325
+ "lstrip": false,
326
+ "normalized": true,
327
+ "rstrip": false,
328
+ "single_word": false,
329
+ "special": false
330
+ },
331
+ "50293": {
332
+ "content": "[unused8]",
333
+ "lstrip": false,
334
+ "normalized": true,
335
+ "rstrip": false,
336
+ "single_word": false,
337
+ "special": false
338
+ },
339
+ "50294": {
340
+ "content": "[unused9]",
341
+ "lstrip": false,
342
+ "normalized": true,
343
+ "rstrip": false,
344
+ "single_word": false,
345
+ "special": false
346
+ },
347
+ "50295": {
348
+ "content": "[unused10]",
349
+ "lstrip": false,
350
+ "normalized": true,
351
+ "rstrip": false,
352
+ "single_word": false,
353
+ "special": false
354
+ },
355
+ "50296": {
356
+ "content": "[unused11]",
357
+ "lstrip": false,
358
+ "normalized": true,
359
+ "rstrip": false,
360
+ "single_word": false,
361
+ "special": false
362
+ },
363
+ "50297": {
364
+ "content": "[unused12]",
365
+ "lstrip": false,
366
+ "normalized": true,
367
+ "rstrip": false,
368
+ "single_word": false,
369
+ "special": false
370
+ },
371
+ "50298": {
372
+ "content": "[unused13]",
373
+ "lstrip": false,
374
+ "normalized": true,
375
+ "rstrip": false,
376
+ "single_word": false,
377
+ "special": false
378
+ },
379
+ "50299": {
380
+ "content": "[unused14]",
381
+ "lstrip": false,
382
+ "normalized": true,
383
+ "rstrip": false,
384
+ "single_word": false,
385
+ "special": false
386
+ },
387
+ "50300": {
388
+ "content": "[unused15]",
389
+ "lstrip": false,
390
+ "normalized": true,
391
+ "rstrip": false,
392
+ "single_word": false,
393
+ "special": false
394
+ },
395
+ "50301": {
396
+ "content": "[unused16]",
397
+ "lstrip": false,
398
+ "normalized": true,
399
+ "rstrip": false,
400
+ "single_word": false,
401
+ "special": false
402
+ },
403
+ "50302": {
404
+ "content": "[unused17]",
405
+ "lstrip": false,
406
+ "normalized": true,
407
+ "rstrip": false,
408
+ "single_word": false,
409
+ "special": false
410
+ },
411
+ "50303": {
412
+ "content": "[unused18]",
413
+ "lstrip": false,
414
+ "normalized": true,
415
+ "rstrip": false,
416
+ "single_word": false,
417
+ "special": false
418
+ },
419
+ "50304": {
420
+ "content": "[unused19]",
421
+ "lstrip": false,
422
+ "normalized": true,
423
+ "rstrip": false,
424
+ "single_word": false,
425
+ "special": false
426
+ },
427
+ "50305": {
428
+ "content": "[unused20]",
429
+ "lstrip": false,
430
+ "normalized": true,
431
+ "rstrip": false,
432
+ "single_word": false,
433
+ "special": false
434
+ },
435
+ "50306": {
436
+ "content": "[unused21]",
437
+ "lstrip": false,
438
+ "normalized": true,
439
+ "rstrip": false,
440
+ "single_word": false,
441
+ "special": false
442
+ },
443
+ "50307": {
444
+ "content": "[unused22]",
445
+ "lstrip": false,
446
+ "normalized": true,
447
+ "rstrip": false,
448
+ "single_word": false,
449
+ "special": false
450
+ },
451
+ "50308": {
452
+ "content": "[unused23]",
453
+ "lstrip": false,
454
+ "normalized": true,
455
+ "rstrip": false,
456
+ "single_word": false,
457
+ "special": false
458
+ },
459
+ "50309": {
460
+ "content": "[unused24]",
461
+ "lstrip": false,
462
+ "normalized": true,
463
+ "rstrip": false,
464
+ "single_word": false,
465
+ "special": false
466
+ },
467
+ "50310": {
468
+ "content": "[unused25]",
469
+ "lstrip": false,
470
+ "normalized": true,
471
+ "rstrip": false,
472
+ "single_word": false,
473
+ "special": false
474
+ },
475
+ "50311": {
476
+ "content": "[unused26]",
477
+ "lstrip": false,
478
+ "normalized": true,
479
+ "rstrip": false,
480
+ "single_word": false,
481
+ "special": false
482
+ },
483
+ "50312": {
484
+ "content": "[unused27]",
485
+ "lstrip": false,
486
+ "normalized": true,
487
+ "rstrip": false,
488
+ "single_word": false,
489
+ "special": false
490
+ },
491
+ "50313": {
492
+ "content": "[unused28]",
493
+ "lstrip": false,
494
+ "normalized": true,
495
+ "rstrip": false,
496
+ "single_word": false,
497
+ "special": false
498
+ },
499
+ "50314": {
500
+ "content": "[unused29]",
501
+ "lstrip": false,
502
+ "normalized": true,
503
+ "rstrip": false,
504
+ "single_word": false,
505
+ "special": false
506
+ },
507
+ "50315": {
508
+ "content": "[unused30]",
509
+ "lstrip": false,
510
+ "normalized": true,
511
+ "rstrip": false,
512
+ "single_word": false,
513
+ "special": false
514
+ },
515
+ "50316": {
516
+ "content": "[unused31]",
517
+ "lstrip": false,
518
+ "normalized": true,
519
+ "rstrip": false,
520
+ "single_word": false,
521
+ "special": false
522
+ },
523
+ "50317": {
524
+ "content": "[unused32]",
525
+ "lstrip": false,
526
+ "normalized": true,
527
+ "rstrip": false,
528
+ "single_word": false,
529
+ "special": false
530
+ },
531
+ "50318": {
532
+ "content": "[unused33]",
533
+ "lstrip": false,
534
+ "normalized": true,
535
+ "rstrip": false,
536
+ "single_word": false,
537
+ "special": false
538
+ },
539
+ "50319": {
540
+ "content": "[unused34]",
541
+ "lstrip": false,
542
+ "normalized": true,
543
+ "rstrip": false,
544
+ "single_word": false,
545
+ "special": false
546
+ },
547
+ "50320": {
548
+ "content": "[unused35]",
549
+ "lstrip": false,
550
+ "normalized": true,
551
+ "rstrip": false,
552
+ "single_word": false,
553
+ "special": false
554
+ },
555
+ "50321": {
556
+ "content": "[unused36]",
557
+ "lstrip": false,
558
+ "normalized": true,
559
+ "rstrip": false,
560
+ "single_word": false,
561
+ "special": false
562
+ },
563
+ "50322": {
564
+ "content": "[unused37]",
565
+ "lstrip": false,
566
+ "normalized": true,
567
+ "rstrip": false,
568
+ "single_word": false,
569
+ "special": false
570
+ },
571
+ "50323": {
572
+ "content": "[unused38]",
573
+ "lstrip": false,
574
+ "normalized": true,
575
+ "rstrip": false,
576
+ "single_word": false,
577
+ "special": false
578
+ },
579
+ "50324": {
580
+ "content": "[unused39]",
581
+ "lstrip": false,
582
+ "normalized": true,
583
+ "rstrip": false,
584
+ "single_word": false,
585
+ "special": false
586
+ },
587
+ "50325": {
588
+ "content": "[unused40]",
589
+ "lstrip": false,
590
+ "normalized": true,
591
+ "rstrip": false,
592
+ "single_word": false,
593
+ "special": false
594
+ },
595
+ "50326": {
596
+ "content": "[unused41]",
597
+ "lstrip": false,
598
+ "normalized": true,
599
+ "rstrip": false,
600
+ "single_word": false,
601
+ "special": false
602
+ },
603
+ "50327": {
604
+ "content": "[unused42]",
605
+ "lstrip": false,
606
+ "normalized": true,
607
+ "rstrip": false,
608
+ "single_word": false,
609
+ "special": false
610
+ },
611
+ "50328": {
612
+ "content": "[unused43]",
613
+ "lstrip": false,
614
+ "normalized": true,
615
+ "rstrip": false,
616
+ "single_word": false,
617
+ "special": false
618
+ },
619
+ "50329": {
620
+ "content": "[unused44]",
621
+ "lstrip": false,
622
+ "normalized": true,
623
+ "rstrip": false,
624
+ "single_word": false,
625
+ "special": false
626
+ },
627
+ "50330": {
628
+ "content": "[unused45]",
629
+ "lstrip": false,
630
+ "normalized": true,
631
+ "rstrip": false,
632
+ "single_word": false,
633
+ "special": false
634
+ },
635
+ "50331": {
636
+ "content": "[unused46]",
637
+ "lstrip": false,
638
+ "normalized": true,
639
+ "rstrip": false,
640
+ "single_word": false,
641
+ "special": false
642
+ },
643
+ "50332": {
644
+ "content": "[unused47]",
645
+ "lstrip": false,
646
+ "normalized": true,
647
+ "rstrip": false,
648
+ "single_word": false,
649
+ "special": false
650
+ },
651
+ "50333": {
652
+ "content": "[unused48]",
653
+ "lstrip": false,
654
+ "normalized": true,
655
+ "rstrip": false,
656
+ "single_word": false,
657
+ "special": false
658
+ },
659
+ "50334": {
660
+ "content": "[unused49]",
661
+ "lstrip": false,
662
+ "normalized": true,
663
+ "rstrip": false,
664
+ "single_word": false,
665
+ "special": false
666
+ },
667
+ "50335": {
668
+ "content": "[unused50]",
669
+ "lstrip": false,
670
+ "normalized": true,
671
+ "rstrip": false,
672
+ "single_word": false,
673
+ "special": false
674
+ },
675
+ "50336": {
676
+ "content": "[unused51]",
677
+ "lstrip": false,
678
+ "normalized": true,
679
+ "rstrip": false,
680
+ "single_word": false,
681
+ "special": false
682
+ },
683
+ "50337": {
684
+ "content": "[unused52]",
685
+ "lstrip": false,
686
+ "normalized": true,
687
+ "rstrip": false,
688
+ "single_word": false,
689
+ "special": false
690
+ },
691
+ "50338": {
692
+ "content": "[unused53]",
693
+ "lstrip": false,
694
+ "normalized": true,
695
+ "rstrip": false,
696
+ "single_word": false,
697
+ "special": false
698
+ },
699
+ "50339": {
700
+ "content": "[unused54]",
701
+ "lstrip": false,
702
+ "normalized": true,
703
+ "rstrip": false,
704
+ "single_word": false,
705
+ "special": false
706
+ },
707
+ "50340": {
708
+ "content": "[unused55]",
709
+ "lstrip": false,
710
+ "normalized": true,
711
+ "rstrip": false,
712
+ "single_word": false,
713
+ "special": false
714
+ },
715
+ "50341": {
716
+ "content": "[unused56]",
717
+ "lstrip": false,
718
+ "normalized": true,
719
+ "rstrip": false,
720
+ "single_word": false,
721
+ "special": false
722
+ },
723
+ "50342": {
724
+ "content": "[unused57]",
725
+ "lstrip": false,
726
+ "normalized": true,
727
+ "rstrip": false,
728
+ "single_word": false,
729
+ "special": false
730
+ },
731
+ "50343": {
732
+ "content": "[unused58]",
733
+ "lstrip": false,
734
+ "normalized": true,
735
+ "rstrip": false,
736
+ "single_word": false,
737
+ "special": false
738
+ },
739
+ "50344": {
740
+ "content": "[unused59]",
741
+ "lstrip": false,
742
+ "normalized": true,
743
+ "rstrip": false,
744
+ "single_word": false,
745
+ "special": false
746
+ },
747
+ "50345": {
748
+ "content": "[unused60]",
749
+ "lstrip": false,
750
+ "normalized": true,
751
+ "rstrip": false,
752
+ "single_word": false,
753
+ "special": false
754
+ },
755
+ "50346": {
756
+ "content": "[unused61]",
757
+ "lstrip": false,
758
+ "normalized": true,
759
+ "rstrip": false,
760
+ "single_word": false,
761
+ "special": false
762
+ },
763
+ "50347": {
764
+ "content": "[unused62]",
765
+ "lstrip": false,
766
+ "normalized": true,
767
+ "rstrip": false,
768
+ "single_word": false,
769
+ "special": false
770
+ },
771
+ "50348": {
772
+ "content": "[unused63]",
773
+ "lstrip": false,
774
+ "normalized": true,
775
+ "rstrip": false,
776
+ "single_word": false,
777
+ "special": false
778
+ },
779
+ "50349": {
780
+ "content": "[unused64]",
781
+ "lstrip": false,
782
+ "normalized": true,
783
+ "rstrip": false,
784
+ "single_word": false,
785
+ "special": false
786
+ },
787
+ "50350": {
788
+ "content": "[unused65]",
789
+ "lstrip": false,
790
+ "normalized": true,
791
+ "rstrip": false,
792
+ "single_word": false,
793
+ "special": false
794
+ },
795
+ "50351": {
796
+ "content": "[unused66]",
797
+ "lstrip": false,
798
+ "normalized": true,
799
+ "rstrip": false,
800
+ "single_word": false,
801
+ "special": false
802
+ },
803
+ "50352": {
804
+ "content": "[unused67]",
805
+ "lstrip": false,
806
+ "normalized": true,
807
+ "rstrip": false,
808
+ "single_word": false,
809
+ "special": false
810
+ },
811
+ "50353": {
812
+ "content": "[unused68]",
813
+ "lstrip": false,
814
+ "normalized": true,
815
+ "rstrip": false,
816
+ "single_word": false,
817
+ "special": false
818
+ },
819
+ "50354": {
820
+ "content": "[unused69]",
821
+ "lstrip": false,
822
+ "normalized": true,
823
+ "rstrip": false,
824
+ "single_word": false,
825
+ "special": false
826
+ },
827
+ "50355": {
828
+ "content": "[unused70]",
829
+ "lstrip": false,
830
+ "normalized": true,
831
+ "rstrip": false,
832
+ "single_word": false,
833
+ "special": false
834
+ },
835
+ "50356": {
836
+ "content": "[unused71]",
837
+ "lstrip": false,
838
+ "normalized": true,
839
+ "rstrip": false,
840
+ "single_word": false,
841
+ "special": false
842
+ },
843
+ "50357": {
844
+ "content": "[unused72]",
845
+ "lstrip": false,
846
+ "normalized": true,
847
+ "rstrip": false,
848
+ "single_word": false,
849
+ "special": false
850
+ },
851
+ "50358": {
852
+ "content": "[unused73]",
853
+ "lstrip": false,
854
+ "normalized": true,
855
+ "rstrip": false,
856
+ "single_word": false,
857
+ "special": false
858
+ },
859
+ "50359": {
860
+ "content": "[unused74]",
861
+ "lstrip": false,
862
+ "normalized": true,
863
+ "rstrip": false,
864
+ "single_word": false,
865
+ "special": false
866
+ },
867
+ "50360": {
868
+ "content": "[unused75]",
869
+ "lstrip": false,
870
+ "normalized": true,
871
+ "rstrip": false,
872
+ "single_word": false,
873
+ "special": false
874
+ },
875
+ "50361": {
876
+ "content": "[unused76]",
877
+ "lstrip": false,
878
+ "normalized": true,
879
+ "rstrip": false,
880
+ "single_word": false,
881
+ "special": false
882
+ },
883
+ "50362": {
884
+ "content": "[unused77]",
885
+ "lstrip": false,
886
+ "normalized": true,
887
+ "rstrip": false,
888
+ "single_word": false,
889
+ "special": false
890
+ },
891
+ "50363": {
892
+ "content": "[unused78]",
893
+ "lstrip": false,
894
+ "normalized": true,
895
+ "rstrip": false,
896
+ "single_word": false,
897
+ "special": false
898
+ },
899
+ "50364": {
900
+ "content": "[unused79]",
901
+ "lstrip": false,
902
+ "normalized": true,
903
+ "rstrip": false,
904
+ "single_word": false,
905
+ "special": false
906
+ },
907
+ "50365": {
908
+ "content": "[unused80]",
909
+ "lstrip": false,
910
+ "normalized": true,
911
+ "rstrip": false,
912
+ "single_word": false,
913
+ "special": false
914
+ },
915
+ "50366": {
916
+ "content": "[unused81]",
917
+ "lstrip": false,
918
+ "normalized": true,
919
+ "rstrip": false,
920
+ "single_word": false,
921
+ "special": false
922
+ },
923
+ "50367": {
924
+ "content": "[unused82]",
925
+ "lstrip": false,
926
+ "normalized": true,
927
+ "rstrip": false,
928
+ "single_word": false,
929
+ "special": false
930
+ }
931
+ },
932
+ "clean_up_tokenization_spaces": true,
933
+ "cls_token": "[CLS]",
934
+ "extra_special_tokens": {},
935
+ "mask_token": "[MASK]",
936
+ "max_length": 1024,
937
+ "model_input_names": [
938
+ "input_ids",
939
+ "attention_mask"
940
+ ],
941
+ "model_max_length": 8192,
942
+ "pad_to_multiple_of": null,
943
+ "pad_token": "[PAD]",
944
+ "pad_token_type_id": 0,
945
+ "padding_side": "right",
946
+ "sep_token": "[SEP]",
947
+ "stride": 126,
948
+ "tokenizer_class": "PreTrainedTokenizer",
949
+ "truncation_side": "right",
950
+ "truncation_strategy": "longest_first",
951
+ "unk_token": "[UNK]"
952
+ }
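The trailing keys of this `tokenizer_config.json` set the special tokens and the truncation/padding defaults. As a rough illustration of how `max_length` and `stride` interact (this is a sketch that mirrors the values from the diff above as a plain dict rather than loading the actual tokenizer, since the repository id is not shown here; the overlap interpretation of `stride` follows the usual Hugging Face tokenizer convention for overflowing windows):

```python
# Sketch only: tokenizer-level settings copied from the tokenizer_config.json diff above.
tokenizer_config = {
    "cls_token": "[CLS]",
    "sep_token": "[SEP]",
    "pad_token": "[PAD]",
    "mask_token": "[MASK]",
    "unk_token": "[UNK]",
    "model_max_length": 8192,   # hard upper bound on sequence length
    "max_length": 1024,         # default truncation length
    "stride": 126,              # tokens shared between successive overflow windows
    "padding_side": "right",
    "truncation_side": "right",
}

# Under the standard Hugging Face convention, when a long input is split into
# overflowing windows, consecutive windows of `max_length` tokens overlap by
# `stride` tokens, so each window advances by (max_length - stride) tokens.
window = tokenizer_config["max_length"]
step = window - tokenizer_config["stride"]
print(window, step)  # 1024 898
```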