Krelle committed on
Commit 3205558 · verified · 1 Parent(s): 302066f

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
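The config above selects mean-token pooling (`pooling_mode_mean_tokens: true`, all other modes off) over 384-dimensional token embeddings. As a minimal sketch of what that pooling step computes — not the library's actual implementation — attention-masked token embeddings are averaged into one sentence vector; the toy shapes and values here are illustrative assumptions:

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings over non-padding positions.

    token_embeddings: (seq_len, dim) array (dim would be 384 for this model).
    attention_mask:   (seq_len,) array of 0/1, where 0 marks padding tokens.
    """
    mask = attention_mask[:, None].astype(float)          # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)        # sum of real tokens only
    count = np.clip(mask.sum(), 1e-9, None)               # avoid division by zero
    return summed / count

# Toy example: two real tokens, one padding token that must be ignored.
tokens = np.array([[1.0, 3.0], [3.0, 5.0], [9.0, 9.0]])
mask = np.array([1, 1, 0])
print(mean_pool(tokens, mask))  # prints [2. 4.]
```

The padding row contributes nothing because the mask zeroes it out before the sum, which is why the result is the average of the first two rows only.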
README.md ADDED
@@ -0,0 +1,1137 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - dense
+ - generated_from_trainer
+ - dataset_size:2778
+ - loss:MultipleNegativesRankingLoss
+ base_model: intfloat/e5-small-v2
+ widget:
+ - source_sentence: In the induction example for the sum $1+2+\dots+n$, why do we add
+ $n+1$ to both sides of (1.16)?
+ sentences:
+ - "Subsection 1.6.3: Subsets\n\n\n\nIf $A$ and $B$ are sets, then <footnote>$A\\\
+ subseteq B$->At times, the symbol $\\subset$ is used instead of $\\subseteq$.\
+ \ In our context these two symbols mean the same. However, the notation $A\\subsetneq\
+ \ B$ means that $A\\subseteq B$ and\n $A\\neq B$. For example,\n $\\{1, 2, 3\\\
+ } \\subseteq \\{1, 2, 3\\}$ and\n $\\{1, 2, 3\\} \\subset \\{1, 2, 3\\}$.</footnote>\
+ \ means that\nevery element of $A$ is an element of $B$. So $A\\subseteq B$ is\
+ \ a placeholder for the proposition\n$$\n\\forall x\\in A : x\\in B\n$$\n\nIn\
+ \ this case we say\nthat *$A$ is a subset of $B$*. We also use the notation $A\\\
+ subsetneq B$ to\nindicate that $A\\subseteq B$ and $A\\neq B$. In this case we\
+ \ say that\n$A$ is a *strict* subset of $B$.\n\n\nExercise 1.47:\n\nList the subsets\
+ \ of $\\{1, 2\\}$. How many are there?\n\n/Exercise\n\n\nExercise 1.48:\n\nIt\
+ \ turns out that the empty set $\\emptyset$ is a subset of any set.\n\n\n\nExplain\
+ \ why this is so using the definition of $\\subseteq$.\n\n\\begin{prompting}\n\
+ Explain precisely in terms of propositions and logic why the empty set is a subset\
+ \ of any given set.\n\\end{prompting}\n\n/Exercise\n\n\nExercise 1.49:\n\nBelow\
+ \ Sage (not python) will list all subsets of the set $\\{1, 2, 3\\}$. Before pressing\n\
+ the Compute button, try to write them down on your own.\n\n\n\nList all the subsets\
+ \ of a set with five elements. In general, how many subsets does a set with $n$\
+ \ elements have?\n\n/Exercise\n\n\nQuizexercise 1.50:\n\n\\begin{paragraphquiz}\n\
+ \ \\question\n The set \\box is not a subset of $A=$\\box, simply because \\\
+ box does not belong to $A$.\n This exercise actually has \\box possible correct\
+ \ solutions.\n\n\\answer\n $\\{1, 2, 3\\}$\n \\answer\n $\\{-1, 1, 2, 3, 4\\\
+ }$\n \\answer\n $\\{-1, 0, 1, 2, 4\\}$\n \\answer\n $3$\n \\answer\n $-1$\n\
+ \ \\answer\n $5$\n \\answer\n $6$\n \\answer\n $0$\n \\case{(is 1347)}{T}\
+ \ Correct!\n \\case{(is 2157)}{T} Correct!\n \\case{(is 2347)}{T} Correct!\n\
+ \ \\case{(is 3287)}{T} Correct!\n \\case{(is 3157)}{T} Correct!\n \\case{(is\
+ \ 3187)}{T} Correct!\n \\default\n Nope. Try again!\n\\end{paragraphquiz}\n\
+ \n/Quizexercise\n\n\nQuizexercise 1.51:\n\n\\begin{paragraphquiz}\n \\question\n\
+ \ The empty set has \\box elements. A set with \\box elements has \\box subsets.\
+ \ In general a set with\n $n$ elements has \\box subsets.\n \\answer\n $1$\n\
+ \ \\answer\n $0$\n \\answer\n $5$\n \\answer\n $25$\n \\answer\n $32$\n\
+ \ \\answer\n $n^2$\n \\answer\n $2^n$\n \\case{(is 2357)}{T}\n Correct!\n\
+ \ \\default\n Nope. Try again!\n\\end{paragraphquiz}\n\n/Quizexercise"
+ - "Section 1.8: Proof by induction\n\n\n\nA <footnote>precocious Gauss->See the\
+ \ article Gauss's Day of Reckoning: https://www.americanscientist.org/article/gausss-day-of-reckoning\
+ \ for some history of this anecdote.</footnote> proved the formula\n$$\n1 + 2\
+ \ + \\cdots + n = \\frac{n(n+1)}{2}\n\\tag{1.15}$$\nat the age of seven displaying\
+ \ remarkable ingenuity for his age. Lesser\nmortals usually use induction to prove\
+ \ this formula. Gauss was asked\nalong with his classmates to compute the sum\
+ \ of all natural numbers\n$1, 2, \\dots, 100$. Using his formula he quickly came\
+ \ up with the correct\nanswer $5050$. His classmates had to work for the entire\
+ \ lesson. \n\nSuppose that the formula in (1.15) is viewed as a\nproposition $p(n)$.\
+ \ To prove the formula we need to prove it for all\nnatural numbers (you can easily\
+ \ see that $p(1)$ and $p(2)$ are true) i.e.,\nwe need to prove\n$$\n\\forall n\\\
+ in \\mathbb{N}: p(n).\n$$\nAn induction proof is a way of proving this statement\
+ \ by showing two things:\n\\begin{enumerate}\\item (i)\n $p(1)$\n\\item (ii)\n\
+ \ $\\forall n\\in \\mathbb{N}: p(n)\\implies p(n+1)$\n\\end{enumerate}\nThese\
+ \ two statements ensure that $p(1) \\implies p(2)$. Therefore\n$p(2)$ must be\
+ \ true, since we assumed $p(1)$ true from the\nbeginning. Similarly $p(2)\\implies\
+ \ p(3)$ ensures that $p(3)$\nis true and so on. In fact we have proved $p(n)$\
+ \ for every $n\\in \\mathbb{N}$\nusing this technique. One can prove this using\
+ \ proof by\ncontradiction and that every non-empty subset\nof $\\mathbb{N}$ has\
+ \ a first element. In general if $S$ is a subset of set with an order $\\leq$,\
+ \ then\n$s\\in S$ is called a first element if\n$$\n\\forall x\\in S: s \\leq\
+ \ x.\n$$\nA crucial rule (or axiom) is that every non-empty subset of $\\mathbb{N}$\
+ \ has\n a first element! Notice that this is false for $\\mathbb{Z}$.\n\n\nTheorem\
+ \ 1.82:\n\n Suppose that $p(n)$ are infinitely many propositions given by $n\\\
+ in \\mathbb{N}$. Then\n $$\n \\forall n\\in \\mathbb{N}: p(n)\n $$\n is true\
+ \ if\n\\begin{enumerate}\\item (i)\n $p(1)$ is true.\n\\item (ii)\n $\\left(\\\
+ forall n\\in \\mathbb{N}: p(n)\\implies p(n+1)\\right)$ is true.\n\\end{enumerate}\n\
+ \n/Theorem\n\n\\begin{proof}\nSuppose by contradiction that there exists $n\\\
+ in \\mathbb{N}$, such that\n$p(n)$ is false. Then the subset\n$$\nS = \\{n\\in\
+ \ \\mathbb{N} \\mid \\neg p(n)\\}\\subseteq \\mathbb{N}\n$$\nis non-empty. Therefore\
+ \ it has a first element $n_0\\in S$. \nHere $n_0 > 1$, since $p(1)$ is assumed\
+ \ to be true. So we\nknow that $p(n_0-1)$ is true and that\n$p(n_0-1)\\implies\
+ \ p(n_0)$ is true. But the latter\nimplication is a contradiction, since true\
+ \ implies\nfalse is false.\n\\end{proof}\n\n\n\nLet us see how an induction proof\
+ \ plays out in the above example\nwith the statement $p(n)$ that\n$$\n1 + 2 +\
+ \ \\cdots + n = \\frac{n(n+1)}{2}.\n\\tag{1.16}$$\nClearly $p(1)$ is true. We\
+ \ need to prove $p(n)\\implies p(n+1)$, so\nwe assume that $p(n)$ holds i.e.,\
+ \ that (1.16) is true.\nThen we may add $n+1$ to both sides of (1.16) to get\n\
+ $$\n1 + 2 + \\cdots + n + (n+1) = \\frac{n(n+1)}{2} + (n+1).\n$$\nHere the right\
+ \ hand side can be rewritten as\n$$\n\\frac{n(n+1) + 2(n+1)}{2} = \\frac{(n+1)(n+2)}{2},\n\
+ $$\nwhich is exactly what we want. This is the conjectured formula for\nthe sum\
+ \ of the numbers $1, 2, \\dots, n, n+1$. Therefore\nwe have proved that $p(n)\\\
+ implies p(n+1)$ and the induction\nproof is complete.\n\n\nExample 1.83:\n\n \
+ \ For a real number $r\\neq 1$, the extremely useful formula\n $$\n 1 + r +\
+ \ \\cdots + r^n = \\frac{1 - r^{n+1}}{1-r}\n \\tag{1.17}$$\n holds. Let us prove\
+ \ this formula by induction. For $n=1$ this amounts to the identity\n $$\n 1\
+ \ + r = \\frac{1-r^2}{1-r},\n $$\n which is true since $1-r^2 = (1+r)(1-r)$.\
+ \ We let $p(n)$ denote\n the identity in (1.17). We have seen that $p(1)$ is\
+ \ true. The induction step\n consists in proving $p(n)\\implies p(n+1)$. We can\
+ \ prove this\n by adding $r^{n+1}$ to the right hand side in (1.17):\n $$\n\
+ \ \\frac{1 - r^{n+1}}{1-r} + r^{n+1} = \\frac{1 - r^{n+1} + (1-r) r^{n+1}}{1-r}\
+ \ = \\frac{1 - r^{n+2}}{1-r}.\n \\tag{1.18}$$\nReal life application:\n In\
+ \ order to pay for a house you borrow $P$ DKK at an interest of\n $r$ per year.\
+ \ You want to pay off your debt over $N$ years by\n paying a fixed amount each\
+ \ year. How much is the fixed yearly\n amount you need to pay?\n\nLet us analyze\
+ \ the setup: suppose that the fixed yearly amount\n is $Y$. We will find an\
+ \ equation giving us $Y$ in terms of\n $P, N$ and $r$. Put $q = 1+ r$.\n\n\
+ After one year you owe\n $$\n q P - Y.\n $$\n After two years you\
+ \ owe\n $$\n q(q P - Y) - Y.\n $$\n After three years you owe\n \
+ \ $$\n q ( q ( q P - Y) - Y) - Y.\n $$\n In general after $n$ years\
+ \ you owe\n $$\n q^n P - Y (1 + q + \\cdots + q^{n-1}).\n $$\n Since\
+ \ we want to be debt free after $N$ years, the yearly payment will have to satisfy\n\
+ \ $$\n q^N P = Y ( 1 + q + \\cdots + q^{N-1}).\n $$\n By the formula\
+ \ (1.17), we get\n $$\n q^N P = Y \\frac{1-q^N}{1-q}.\n $$\n Here\
+ \ $Y$ can be isolated giving the formula\n $$\n Y = \\frac{r P}{1 - \\left(\\\
+ frac{1}{1+r}\\right)^N}.\n $$ \n With the current (August 2024) interest\
+ \ rate around four percent, you pay a fixed monthly\n amount of around 4770\
+ \ DKK (down from 5420 DKK in 2023, when the interest rate was five percent) for\
+ \ borrowing one million DKK over $30$ years.\n \n \n\n/Example"
+ - "Subsection 1.9.5: Injective and surjective functions\n\n\n\nWe now define three\
+ \ very important notions related to functions.\n\n\nDefinition 1.101:\n\n Let\
+ \ $f: S\\rightarrow T$ be a function. Then $f$ is called\n \\begin{enumerate}\\\
+ item (i)\n *injective*, if $f(x) = f(y) \\implies x = y$ for every $x, y\\\
+ in S$.\n \\item (ii)\n *surjective*, if for every $y\\in T$, there exists\
+ \ $x\\in S$, such that $f(x) = y$.\n \\item (iii)\n *bijective*, if it is\
+ \ both injective and surjective.\n \\end{enumerate}\n\n/Definition\n\n\nExercise\
+ \ 1.102:\n\nIs a cryptographic hash-function as defined in Example (1.92) injective?\n\
+ \n/Exercise\n\n\nExercise 1.103:\n\nSuppose that\n$$\nS = \\{1, 2, 3\\}\\qquad\\\
+ text{and}\\qquad T = \\{1, 2, 3, 4\\}\n$$\nand that the function $f: S\\rightarrow\
+ \ T$ is defined by the table\n$$\n\\def\\arraystretch{1.5}\n\\begin{array}{c|ccccccc}\n\
+ x & 1 & 2 & 3\\\\ \\hline\nf(x) & 1 & 2 & 4\n\\end{array}\n$$\nIs $f$ injective?\
+ \ Is it surjective? Is it possible to adjust the table so that\n$f$ becomes injective?\n\
+ Is it possible to adjust the table so that\n$f$ becomes surjective?\n\n/Exercise\n\
+ \n\nExercise 1.104:\n\nConsider the function $f:S \\rightarrow T$ given by\n$$\n\
+ f(x) = x^2,\n$$\nwhere $S = T = \\mathbb{R}$.\nIs $f$ injective? Is $f$ surjective?\
+ \ Suggest how to change $S$ and $T$ so that $f:S\\rightarrow T$ becomes\nbijective.\n\
+ \n/Exercise\n\n\nExercise 1.105:\n\nConsider the function $f:\\mathbb{Z} \\rightarrow\
+ \ \\mathbb{Z}$ given by\n$$\nf(x) = x + 1\n$$\nShow that $f$ is bijective.\n\n\
+ /Exercise\n\n\nExercise 1.106:\n\nWrite down precisely how the truth table for\
+ \ $p\\implies q$ may\nbe expressed in terms of a function $f: S\\rightarrow T$.\
+ \ What are the sets $S$ and $T$ in this case?\n\n/Exercise\n\nSubsection 1.9.6:\
+ \ The inverse function\n\n\n\nIf $f:S\\rightarrow T$ is bijective, then we may\
+ \ define a function $g: T\\rightarrow S$, so\nthat $(f\\circ g)(y) = y$ for every\
+ \ $y\\in T$ and $(g\\circ f)(x)$ for every $x\\in S$. This\nfunction is denoted\
+ \ $f^{-1}$.\n\nHow do we define $f^{-1}(y)$ for $y\\in T$? Well, since $f$ is\
+ \ surjective, we may find\n$x\\in S$ so that $y = f(x)$. Now, we simply define\n\
+ $$\nf^{-1}(y) = x.\n\\tag{1.20}$$\nWe cannot have $x_1 \\neq x_2$ in $S$ with\
+ \ $f(x_1) = f(x_2) = y$, since $f$ is injective. We only have one choice for\n\
+ $x$ in (1.20). Therefore (1.20) really is a good and sound definition.\n\n\nExample\
+ \ 1.107:\n\nLet $f: S\\rightarrow S$, where $S = \\{1, 2, 3\\}$ be given by\n\
+ the table\n$$\n\\def\\arraystretch{1.5}\n\\begin{array}{c|ccccccc}\nx & 1 & 2\
+ \ & 3\\\\ \\hline\nf(x) & 3 & 1 & 2\n\\end{array}.\n$$\nThen $f^{-1}$is given\
+ \ by the table\n$$\n\\def\\arraystretch{1.5}\n\\begin{array}{c|ccccccc}\nx & 1\
+ \ & 2 & 3\\\\ \\hline\nf^{-1}(x) & 2 & 3 & 1\n\\end{array}.\n$$\n\n/Example\n\n\
+ \nExercise 1.108:\n\nWhat if the definition of $f$ in Example (1.107) is changed\
+ \ to\n$$\n\\def\\arraystretch{1.5}\n\\begin{array}{c|ccccccc}\nx & 1 & 2 & 3\\\
+ \\ \\hline\nf(x) & 3 & 2 & 2\n\\end{array}.\n$$\nDoes $f^{-1}$ make sense here?\n\
+ \n/Exercise\n\n\nExercise 1.109:\n\nWhat is the inverse function of $f:\\mathbb{Z}\\\
+ rightarrow \\mathbb{Z}$ given by $f(x) = x + 1$?\nWhat is the inverse function\
+ \ of $g: S \\rightarrow S$, where $g(x) = \\sqrt{x}$ and\n$S = \\{x\\in \\mathbb{R}\\\
+ mid x\\geq 0\\}$?\n\n/Exercise"
+ - source_sentence: Why do we need truth tables for these logical connectives—can’t
+ we just rely on intuition?
+ sentences:
+ - "Section 1.5: Propositional logic\n\n\n\nA proposition is a (mathematical) statement\
+ \ that is\ntrue ($t$) or false ($f$). This could be a boolean\nexpression in a\
+ \ computer program, like $1 < 2$.\n\nSage:\n\n\n\nLater we will see propositions\
+ \ with\nvariables in them like $x < 2$. These are called predicates.\n\nPropositions\
+ \ can be combined into\nnew (compound) propositions. Take for example the propositions\n\
+ \n$$\\begin{aligned}\n&p: \\text{it rains}\\\\\n&q: \\text{it is cloudy}.\n\\\
+ end{aligned}$$\n\nThen ($p$ and $q$) is a perfectly good\n new proposition reading\
+ \ *it rains and it is cloudy*. The same goes for (if $p$ then $q$), which reads\n\
+ \ *if it rains then it is cloudy*. The proposition (if $q$ then $p$) reads *if\
+ \ it is cloudy then\n it rains*. This proposition is (clearly) false.\n\nWe\
+ \ need some notation to describe these compound propositions:\n\n$$\n\\begin{array}{ll}\n\
+ p \\land q\\qquad\\qquad & \\qquad\\qquad p \\text{ and } q\\\\\n\\\\\np \\lor\
+ \ q\\qquad\\qquad & \\qquad\\qquad p \\text{ or } q\\\\\n\\\\\np\\implies q\\\
+ qquad\\qquad & \\qquad\\qquad \\text{if } p \\text{ then } q\\\\\n\\\\\n\\neg\
+ \ p\\qquad\\qquad & \\qquad\\qquad \\text{not } p\n\\end{array}\n$$\n\nThe compound\
+ \ propositions are either true($t$) or false ($f$) depending on\n$p$ and $q$.\
+ \ The dependencies are displayed in the *truth tables* below.\n\n\nDefinition\
+ \ 1.18:\n\n$$\n\\def\\arraystretch{1.2}\n \\begin{array}{c|c|c}\n \
+ \ p & q & p\\land q \\\\\n \\hline \n t & t & t \\\\\n \
+ \ t & f & f\\\\\n f & t & f\\\\\n f & f & f\n \\end{array}\\\
+ qquad\n \\begin{array}{c|c|c}\n p & q & p\\lor q \\\\\n \\\
+ hline\n t & t & t \\\\\n t & f & t\\\\\n f & t & t\\\\\
+ \n f & f & f\n \\end{array}\n \\qquad\n \\begin{array}{c|c|c}\n\
+ \ p & q & p\\implies q \\\\\n \\hline\n t & t & t \\\
+ \\\n t & f & f\\\\\n f & t & t\\\\\n f & f & t\n \\\
+ end{array}\\qquad\n \\begin{array}{c|c}\n p & \\neg p \\\\\n \
+ \ \\hline\n t & f\\\\\n f & t\n \\end{array}\n $$\n\n/Definition\n\
+ \nThe tables for the compound propositions $p\\land q, p\\lor q$ and also\n$\\\
+ neg p$ are not too hard to grasp. The table for $p\\implies q$ \nraises a few\
+ \ more questions. Why is $f\\implies t$ true?\nI will not go into this at this\
+ \ point (see Example 1.31), but just point out that there are\nmany explanations\
+ \ available online and, \nperhaps more importantly, refer you to Exercise 1.19."
+ - 'Subsection 1.7.2: Ordering $\mathbb{Q}$
+
+
+
+
+ We define the positive rational numbers as
+
+ $$
+
+ \mathbb{Q}_+ = \left\{\frac{m}{n} \in \mathbb{Q} \middle| m > 0\right\} = \left\{1,
+ \frac{1}{2}, \frac{1}{3}, \frac{2}{3}, \frac{1}{4}, \frac{3}{4}, \dots \right\}.
+
+ $$
+
+ One can check that $\mathbb{Q}_+$ satisfies the conditions in
+
+ Definition (1.69). So formally we
+
+ get
+
+
+
+ Proposition 1.74:
+
+
+ For $\cfrac{a}{b},\,\,\, \cfrac{c}{d}\in \mathbb{Q}$,
+
+ $$
+
+ \frac{a}{b}\, < \frac{c}{d}\qquad \iff\qquad a d < b c\qquad (\text{in }\mathbb{Z}).
+
+ $$
+
+
+ /Proposition
+
+ \begin{proof}
+
+ We must check when
+
+ $$
+
+ \frac{c}{d} - \frac{a}{b} = \frac{b c - a d}{b d} \in \mathbb{Q}_+.
+
+ $$
+
+ This happens precisely when the numerator $b c - a d\in \mathbb{N}$ or $b c -
+ a d > 0$. Therefore
+
+ the condition in the proposition is satsified.
+
+ \end{proof}'
+ - 'Exercise 1.36:
+
+
+ Consider the proposition $q(n) = n \text{ is even}$. Prove that
+
+ $$
+
+ \forall n\in \mathbb{Z}: q(n^2)\implies q(n).
+
+ $$
+
+
+ \begin{hint}
+
+ Use that $q(n) = \neg p(n)$, where $p(n)$ is defined in Example (1.35).
+
+ \end{hint}
+
+
+ /Exercise'
+ - source_sentence: In Exercise 1.72, how should we correctly rewrite the chain $0
+ < 1 < 2$ so that each comparison involves only two integers?
+ sentences:
+ - "Subsection 1.7.1: Ordering $\\mathbb{Z}$\n\n\n\nAs we saw in Remark (1.70), the\
+ \ natural order on $\\mathbb{Z}$ is\ndefined by $\\mathbb{Z}_+ = \\mathbb{N}$,\
+ \ so that $x < y$ if $y-x\\in \\mathbb{N}$ for $x, y\\in \\mathbb{Z}$.\nThis completely\
+ \ agrees with our preconception that\n$$\n\\cdots < -3 < -2 < -1 < 0 < 1 < 2 <\
+ \ \\cdots\n\\tag{1.14}$$\n\nTo be precise, writing $\\cdots < -3 < -2 < -1 < 0\
+ \ < 1 < 2 < \\cdots$ is nonsense, since $<$ is only defined for two integers.\n\
+ \n\nExercise 1.72:\n\n How is one supposed to interpret $0 < 1 < 2$ for example?\
+ \ Go ahead and formulate (1.14) correctly comparing only two integers at a time.\n\
+ \ How does Python/Sage interpret $-3 < -2 < -1< 0 < 1 < 2$? Find out using the\
+ \ Sage snippet below.\n\n\n\nWhat about $1 < 5 > 3 < 4$? What about $0 < 1 > 2$?\n\
+ \n/Exercise\n\n\nQuizexercise 1.73:\n\n\\begin{orderquiz}\n \\question\n Assume\
+ \ that $x, y, z\\in \\mathbb{Z}$ and that $x \\leq y$. Then drag and drop the\n\
+ \ elements from the left to the right below to explain that\n $x + z \\leq y\
+ \ + z$.\n \\answer By assumption $x\\leq y$.\n \\answer This means that\
+ \ $z - x + y\\in \\mathbb{N}$\n \\answer This means that $y - x\\in \\mathbb{N}$\n\
+ \ \\answer To show that $x + z \\leq y + z$, we need to show that\n $(y +\
+ \ z) - (x + z) \\in \\mathbb{N}$.\n \\answer But $(y + z) - (x + z) = y + z\
+ \ - x + z$. Therefore,\n \\answer But $(y + z) - (x + z) = y + z - x - z =\
+ \ y - x$. Therefore,\n \\answer $(y + z) - (x + z)\\in \\mathbb{N}$, since\n\
+ \ \\answer $y - x \\in \\mathbb{N}$\n \\expected{6}\n\n\\case{(is 134678)}{T}\n\
+ \ Spot on, my friend.\n\n\\case{(is 467813)}{T}\n This is right!\n\n\\case{(is\
+ \ 413678)}{T}\n This is right!\n\n\\default\n Wrong order. Check the definition\
+ \ of $\\leq$ in UNDEFINED: ordZ once more!\n\\end{orderquiz}\n\n/Quizexercise"
+ - 'Exercise 1.75:
+
+
+ Use proof by contradiction (see section (1.5.8))
+
+ to show precisely that there does not
+
+ exist a smallest positive rational number.
+
+
+ /Exercise'
+ - "Section 1.9: The concept of a function\n\n\n\nA function is a crucial concept\
+ \ in mathematics. In Sage (actually python here) a simple function can be\nprogrammed\
+ \ like\n\n\n\nThe code above seems to take a number and returns the number plus\
+ \ one. This (f) is in fact a function \ntaking as *input* a number and returning\
+ \ as *output* the number plus one. Notice that\nwe do not even know which numbers\
+ \ we are talking about here. In mathematics we need to have\na more precise notion\
+ \ of a function. \n\nThe above python function could more formally be denoted\
+ \ as $f: \\mathbb{Z}\\rightarrow \\mathbb{Z}$ with\n$f(n) = n+1$ if we are dealing\
+ \ with the integers, but we cannot tell from the code.\n\nWell, to be fair ...:\n\
+ To be completely fair, it is possible from Python 3.5 to add type annotations\
+ \ to functions, so that we could write\n<code>def f(n: int) -&gt; int: return(n+1)\n\
+ </code>\n\n\nin the Python code to state that the function should take values\
+ \ in the integers and return integers.\n\n\nThe precise mathematical definition\
+ \ of a function in terms of sets is\nthe following. A function $f: S\\rightarrow\
+ \ T$ is a subset\n$f\\subseteq S\\times T$, such that\n$(s, t_1)\\in f \\land\
+ \ (s, t_2)\\in f \\implies t_1 = t_2$. In words it states that a\nfunction $f:\
+ \ S\\rightarrow T$ is a subset $f$ of $S\\times T$, containing pairs\nhaving only\
+ \ one second coordinate for every first coordinate.\n\nThe everyday working definition\
+ \ of a\nfunction is more intuitive: a machine taking input from some set\n$S$\
+ \ and giving output in some set $T$. The uniqueness of the output\nis encoded\
+ \ in the mathematical definition of a function.\n\n\nDefinition 1.90:\n\nMathematically\
+ \ a function $f$ takes values from a set $S$ and returns values in a set $T$.\
+ \ In details,\nit is denoted $f: S\\rightarrow T$ and the value associated with\
+ \ $s\\in S$ is denoted $f(s)\\in T$.\nHere $S$ is called *the domain* of $f$ and\
+ \ $T$ is called *the codomain* of $f$. Less,\nformally $S$ is called the input\
+ \ set and $T$ the output set for $f$.\n\n/Definition\n\n\nRemark 1.91:\n\n Please\
+ \ notice that a function is a very, very general concept. It is not just something\n\
+ \ that you draw as a graph on a piece of paper. Of course, you can draw a function\n\
+ \ $f:\\mathbb{R}\\rightarrow \\mathbb{R}$ like $f(x) = x^2$:\n \n Generally,\
+ \ a function $f: S\\rightarrow T$ is given by a machine, formula or algorithm\
+ \ that\n computes $f(x)\\in T$ for every $x\\in S$. Nothing more, nothing less.\
+ \ It really has nothing to\n do with a graph (even though graphs can sometimes\
+ \ be useful for visualizing certain functions like $f(x) = x^2$).\n\n/Remark\n\
+ \n\nExample 1.92:\n\n Good examples of functions can be found in the cryptographic\
+ \ hash functions: https://en.wikipedia.org/wiki/Cryptographic_hash_function. They\
+ \ are examples of complicated functions $f:S \\rightarrow T$, where\n $S$ is\
+ \ infinite and $T$ finite. Here $S$ could be data like plain text files and $T$\
+ \ could be\n a $256$ bit number. This is the setup for the widely used sha-256\
+ \ cryptographic hash function.\n The whole point of a cryptographic hash function\
+ \ is that it must be humanly impossible to\n <footnote>compute $y$ with $f(y)\
+ \ = f(x)$ given $f(x)$->A pair $x\\neq y$ with $f(x) = f(y)$ is called a collision</footnote>.\
+ \ \n In fact, sha-256 is used in the Bitcoin block chain. The precise definition\
+ \ of\n sha-256 can be found in FIPS PUB 180-4: http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf\
+ \ approved by the Secretary of Commerce.\n\nOther interesting functions output\
+ \ a bounded size digital footprint (checksum) of a file (like md5: https://en.wikipedia.org/wiki/MD5).\
+ \ This is very useful\nfor checking data integrity of downloads over the internet.\
+ \ The md5 hash is a $128$ bit number.\n\nInstead of listing $256$ or $128$ bits\
+ \ for the hash value one uses hexadecimal notation with digits\nin 0, 1, 2, 3,\
+ \ 4, 5, 6, 7, 8, 9 , a, b, c, d, e, f. A pair of hexadecimal digits then represents\n\
+ a byte or $8$ bits. Output from sha-256 and md5 consist of $64$ and $32$ hexadecimal\n\
+ digits respectively. You are welcome to experiment with these two hash functions\
+ \ in the\nSage window below.\n\n\n\n\n/Example\n\n\nExercise 1.93:\n\nWhat is\
+ \ the sha-256 hash of your name? Change a\nfew letters and recompute. Do you see\
+ \ any system? What about the md5 hash function?\nCan you find two different strings\
+ \ with the same md5 hash using your computer?\n\n\\begin{hint}\n I have not answered\
+ \ the last question myself, but I am told that it is possible to find\n a collision\
+ \ for md5 using a garden variety home computer. Browsing the internet, it\n seems\
+ \ that the two strings $s_1$ and $s_2$ given in <footnote>hexadecimal notation->This\
+ \ notation represents a sequence of bytes given by pairs of hexadecimal digits</footnote>\
+ \ by\n<code>d131dd02c5e6eec4693d9a0698aff95c2fcab58712467eab4004583eb8fb7f89 \n\
+ 55ad340609f4b30283e488832571415a085125e8f7cdc99fd91dbdf280373c5b \nd8823e3156348f5bae6dacd436c919c6dd53e2b487da03fd02396306d248cda0\
+ \ \ne99f33420f577ee8ce54b67080a80d1ec69821bcb6a8839396f9652b6ff72a70\n</code>\n\
+ \n\nand\n<code>d131dd02c5e6eec4693d9a0698aff95c2fcab50712467eab4004583eb8fb7f89\
+ \ \n55ad340609f4b30283e4888325f1415a085125e8f7cdc99fd91dbd7280373c5b \nd8823e3156348f5bae6dacd436c919c6dd53e23487da03fd02396306d248cda0\
+ \ \ne99f33420f577ee8ce54b67080280d1ec69821bcb6a8839396f965ab6ff72a70 \n</code>\n\
+ \n\ngive a collision for md5. Verify that $s_1\\neq s_2$ and\nthat they give the\
+ \ same md5 hash. If you find a collision\nfor sha-256 you will\nbecome world famous.\n\
+ \n\\begin{hint}\n\n\\end{hint}\n\\end{hint}\n\n/Exercise"
+ - source_sentence: Why does the textbook emphasize that the ordering in the listing
+ of elements is unimportant?
+ sentences:
+ - 'Exercise 1.14:
+
+
+ We know that zero times any number is zero. Deduce this from the rules in
+
+ Proposition (1.12) starting with $0 + 0 = 0$.
+
+
+ /Exercise'
+ - 'A set is a collection of (mathematical) objects or elements. When defining a
+ set we use the symbols $\{$
+
+ and $\}$ to denote the beginning and end of its definition. For example, $\{\text{N,
+ i, e, l, s}\}$
+
+ is the set of characters in my first name and $\{8, 0\}$ are the digits in the
+ postal code
+
+ for Aarhus C. The ordering in the listing of the elements is unimportant so that
+
+ $$\begin{aligned}
+
+ \{\text{N, i, e, l, s}\} &= \{\text{l, e, i, s, N}\}\\
+
+ \{8, 0\} &= \{0, 8\}
+
+ \end{aligned}$$
+
+ are identical sets. If $S$ is a set, we will use the notation $x\in S$ to denote
+
+ that $x$ is an element in $S$. For example, e$\in \{\text{N, i, e, l, s}\}$.'
+ - "Exercise 1.84:\n\nVerify the computation (induction step) in (1.18) i.e., explain\n\
+ the operations used to go from the left to the right of the two equalities.\n\n\
+ /Exercise\n\n\nExercise 1.85:\n\nLocate the mistake in the following fake induction\
+ \ proof of the curious fact that\n$2^n = 2$ \nfor every $n\\in \\mathbb{N}$.\n\
+ \nLet $p(n)$ be\nthe proposition $2^n = 2$. Then $p(1)$ is true.\n\nWe wish to\
+ \ prove that $p(n) \\implies p(n+1)$ assuming that $p(1), \\dots, p(n)$ are true:\n\
+ $$\\begin{aligned}\n 2^{n+1} &= 2^n \\cdot 2\\\\\n &= 2^n \\cdot \\\
+ frac{2^n}{2^{n-1}}\\\\\n &= 2 \\cdot \\frac{2}{2}\\,\\,\\text{(by }p(n)\\\
+ text{ and }p(n-1)\\text{)}\\\\\n &= 2.\n\\end{aligned}$$\nThis shows\
+ \ that $p(n) \\implies p(n+1)$ and therefore that $2^n = 2$ for every\n$n\\in\
+ \ \\mathbb{N}\\setminus\\{0\\}$.\n\n/Exercise\n\n\nExercise 1.86:\n\nProve by\
+ \ induction that the sum of the first $n$ odd numbers is\ngiven by the formula\n\
+ $$\n1 + 3 + \\cdots + (2 n - 1) = n^2,\n$$\ni.e., for $n=5$ we have\n$$\n1 + 3\
+ \ + 5 + 7 + 9 = 25.\n$$\n\n/Exercise\n\n\nExercise 1.87:\n\nProve by induction\
+ \ that\n$$\n1^2 + 2^2 + 3^2 + \\cdots + n^2 = \\frac{n(n+1)(2n + 1)}{6}.\n$$\n\
+ \n/Exercise\n\n\nExercise 1.88:\n\nProve using the idea of induction that\n$$\n\
+ 2^n < n!\n$$\nfor $n\\geq 4$.\n\n\n/Exercise\n\nThe last exercise related to induction\
+ \ concerns the famous pigeonhole principle: https://en.wikipedia.org/wiki/Pigeonhole_principle.\
+ \ The statement itself looks innocent, well almost ridiculous, but it is very\
+ \ powerful: https://mindyourdecisions.com/blog/2008/11/25/16-fun-applications-of-the-pigeonhole-principle/.\
+ \ Even the go-to website \nmathoverflow: https://mathoverflow.net/ for research\
+ \ mathematicians has \na quite nice thread: https://mathoverflow.net/questions/4279/interesting-applications-of-the-pigeonhole-principle\
+ \ \nabout this.\n\n\nExercise 1.89:\n\nProve the following by induction on $m$:\
+ \ if $n$ items are put into $m$ containers and \n$n > m$, then at least one container\
+ \ must contain more than one item.\n\n/Exercise"
+ - source_sentence: In Exercise 1.41 we are asked to find identities for \((a+b)^3\)
+ and \((a+b)^4\). What are the correct expanded forms, and how do they relate to
+ the binomial theorem?
+ sentences:
+ - 'Chapter 1 on the language of mathematics is an introduction to the fundamental
+ mathematics used in the notes.
+
+ Without understanding the basic concepts in it, you do not have the background
+ to understand
+
+ the rest of the notes. Important highlights from the chapter are
+
+
+ - Introduction to prompting. This is your ticket to using large language models
+ effectively
+
+ - How to use computer algebra (Sage). Sage can be very helpful in understanding
+ the mathematics
+
+ - Introduction of the numbers we use. Here the natural numbers, integers, rationals
+ and real numbers are defined. Also the arithmetic rules for using them are given
+
+ - Logic is the framework for reasoning in mathematics. Study this! First comes
+ propositional logic. This is basic logic involving true and false statements with
+ and, or etc as seen in truth tables. Then comes predicate logic, where variables
+ are used. Here you must learn the meaning of "for every" and "there exists"
+
+ - Proofs are described. Proof by contradiction is a must here! Do not skip it
+
+ - The language of sets. Learn the operations on sets. Especially focus on the
+ set builder notation and products of sets
+
+ - Ordering of numbers. This is the formal definition of comparing numbers
+
+ - Proof by induction. How to prove infinitely many propositions involving the
+ natural numbers with one hack
+
+ - The concept of a function. This is extremely important. Notice that a function
+ is defined not by a rule. Also, in its definition enters crucially where it is
+ defined
+
+ - Functions from and into products
+
+ - The preimage. This will become very important working with continuous functions'
+ - "Exercise 1.84:\n\nVerify the computation (induction step) in (1.18) i.e., explain\n\
+ the operations used to go from the left to the right of the two equalities.\n\n\
+ /Exercise\n\n\nExercise 1.85:\n\nLocate the mistake in the following fake induction\
+ \ proof of the curious fact that\n$2^n = 2$ \nfor every $n\\in \\mathbb{N}$.\n\
+ \nLet $p(n)$ be\nthe proposition $2^n = 2$. Then $p(1)$ is true.\n\nWe wish to
479
+ \ prove that $p(n) \\implies p(n+1)$ assuming that $p(1), \\dots, p(n)$ are true:\n\
480
+ $$\\begin{aligned}\n 2^{n+1} &= 2^n \\cdot 2\\\\\n &= 2^n \\cdot \\\
481
+ frac{2^n}{2^{n-1}}\\\\\n &= 2 \\cdot \\frac{2}{2}\\,\\,\\text{(by }p(n)\\\
482
+ text{ and }p(n-1)\\text{)}\\\\\n &= 2.\n\\end{aligned}$$\nThis shows\
483
+ \ that $p(n) \\implies p(n+1)$ and therefore that $2^n = 2$ for every\n$n\\in\
484
+ \ \\mathbb{N}\\setminus\\{0\\}$.\n\n/Exercise\n\n\nExercise 1.86:\n\nProve by\
485
+ \ induction that the sum of the first $n$ odd numbers is\ngiven by the formula\n\
486
+ $$\n1 + 3 + \\cdots + (2 n - 1) = n^2,\n$$\ni.e., for $n=5$ we have\n$$\n1 + 3\
487
+ \ + 5 + 7 + 9 = 25.\n$$\n\n/Exercise\n\n\nExercise 1.87:\n\nProve by induction\
488
+ \ that\n$$\n1^2 + 2^2 + 3^2 + \\cdots + n^2 = \\frac{n(n+1)(2n + 1)}{6}.\n$$\n\
489
+ \n/Exercise\n\n\nExercise 1.88:\n\nProve using the idea of induction that\n$$\n\
490
+ 2^n < n!\n$$\nfor $n\\geq 4$.\n\n\n/Exercise\n\nThe last exercise related to induction\
491
+ \ concerns the famous pigeonhole principle: https://en.wikipedia.org/wiki/Pigeonhole_principle.\
492
+ \ The statement itself looks innocent, well almost ridiculous, but it is very\
493
+ \ powerful: https://mindyourdecisions.com/blog/2008/11/25/16-fun-applications-of-the-pigeonhole-principle/.\
494
+ \ Even the go-to website \nmathoverflow: https://mathoverflow.net/ for research\
495
+ \ mathematicians has \na quite nice thread: https://mathoverflow.net/questions/4279/interesting-applications-of-the-pigeonhole-principle\
496
+ \ \nabout this.\n\n\nExercise 1.89:\n\nProve the following by induction on $m$:\
497
+ \ if $n$ items are put into $m$ containers and \n$n > m$, then at least one container\
498
+ \ must contain more than one item.\n\n/Exercise"
499
+ - "Section 1.6: More on sets\n\n\n\nPropositions are important, but are confined\
500
+ \ by the binary values\nof true and false. We would like to work mathematically\
501
+ \ with \nobjects like integers, floating point numbers, neural networks,\ncomputer\
502
+ \ programs and so on.\n\nSubsection 1.6.1: Objects and equality\n\n\n\nOne of\
503
+ \ the cornerstones of modern mathematics is\ndeciding when two objects are the\
504
+ \ same i.e.,\ngiven two objects $A$ and $B$, deciding whether\nthe proposition\
505
+ \ $A=B$ is true of false. Oftentimes\nan algorithm for evaluating $A=B$ is needed.\n\
506
+ \nYou may laugh here, but this is\nnot always that easy. Even though objects appear\
507
+ \ different they are the same as\nin, for example the propositions\n$$\n\\frac{105}{189}\
508
+ \ = \\frac{35}{63}\\qquad\\text{and}\\qquad \\sin\\left(\\frac{\\pi}{2}\\right)\
509
+ \ = 1.\n$$\nThe first proposition above is an identity of fractions (rational\
510
+ \ numbers). The second is\nan identity, which calls for knowledge of the sine\
511
+ \ function and real numbers. Each of these\nidentities calls for some rather advanced\
512
+ \ mathematics. The first proposition is true in\na very precise way, since $105\\\
513
+ cdot 63 = 189 \\cdot 35$.\n\n\nExercise 1.40:\n\n\n\n\nUse the Sage window above\
514
+ \ to reason \nabout equality in the quiz below. In each case describe the objects\
515
+ \ i.e.,\nare they numbers, symbols, etc.? Also, please check your computations\n\
516
+ by hand with the old fashioned paper and pencil, especially $(a+b)(a-b)$.\n\n\\\
517
+ begin{quiz}\n\\question\nClick on the right equalities below.\n\\answer{T}\n$$a\
518
+ \ + b - 2 b = a - b$$\n\\answer{F}\n$$(a+b)^2 = a^2 + b^2$$\n\\answer{T}\n$$(a\
519
+ \ + b)(a - b) = a^2 - b^2$$\n\\answer{T}\n$$(a + b)^2 = a^2 + 2 a b + b^2$$\n\
520
+ \\answer{F}\n$$(a+b)^3 = a^3 + 2 a^2 b + 2 a b^2 + b^3$$\n\\answer{F}\n$$\\frac{3}{8}\
521
+ \ = \\frac{5}{13}$$ \n\\answer{F}\n$$\n\\pi = \\frac{22}{7}\n$$\n\\answer{T}\n\
522
+ $$\n\\cos^2(\\pi) + \\sin^2(\\pi) = 1\n$$\n\\end{quiz}\n\n/Exercise\n\n\nExercise\
523
+ \ 1.41:\n\nYou know that $(a+ b)^2 = a^2 + 2 a b + b^2$. Use Sage to find a similar\
524
+ \ identities\nfor $(a + b)^3$ and $(a + b)^4$.\n\n\\begin{hint}\n Go back and\
525
+ \ look at (the beginning of) Exercise (1.40).\n\\end{hint}\n\n/Exercise\n\nFor\
526
+ \ two objects $A$ and $B$ we will use the notation $A \\neq B$ for the proposition\
527
+ \ $\\neg (A = B)$.\n\nWe have already defined a set (informally) as a collection\
528
+ \ of distinct objects or *elements*.\nWe introduce some more set theory here.\n\
529
+ A set\nis also an object as described in section (1.6.1) and it makes sense to\n\
530
+ ask when two sets are equal.\n\n\nDefinition 1.42:\n\nTwo sets $A$ and $B$ are\
531
+ \ equal i.e., $A = B$ if they contain the same elements.\n\n/Definition\n\nAn\
532
+ \ example of a set could be \nthe set $\\{1,2,3\\}$ of natural numbers between\
533
+ \ $0$ and $4$. Notice again that we use the symbol\n\"$\\{$\" to start the listing\
534
+ \ of elements in a set and the symbol \"$\\}$\" to denote the end of the listing.\n\
535
+ Notice also that (by our definition of equality between sets), the order of the\
536
+ \ elements in the listing does not matter i.e.,\n$$\n\\{1, 2, 3\\} = \\{2, 3,\
537
+ \ 1\\}.\n$$\nWe are also not allowing duplicates like for\nexample in the listing\
538
+ \ $\\{1, 2, 2, 3, 3, 3\\}$ (such a thing is called a multiset: https://en.m.wikipedia.org/wiki/Multiset).\n\
539
+ \nAn example of a set not involving numbers could be the set of letters \n$$\n\
540
+ S=\\{A, n, e, x, a, m, p, l, c, o, u, d, b, t, h, s, r, i\\}\n$$ \nused in this\
541
+ \ sentence. The number of elements in a set $S$ is called the *cardinality* of\
542
+ \ the set.\nWe will denote it by $|S|$.\n\nTo convince someone beyond a doubt\
543
+ \ (we will talk about this formally later in this chapter) that two sets $A$ and\
544
+ \ $B$ are equal, one needs to argue that if $x$ is an element of $A$, then $x$\
545
+ \ is an element of $B$ and the other way round, if $y$ is an element of $B$, then\
546
+ \ $y$ is an element of $A$. If this is true, then\n$A$ and $B$ must contain the\
547
+ \ same elements.\n\n\nExercise 1.43:\n\nGive a precise reason as to why the two\
548
+ \ sets $\\{1, 2, 3\\}$ and $\\{1, 2, 4\\}$ are not equal.\nIs it possible for\
549
+ \ a set with $5$ elements to be equal to a set with $7$ elements?\n\n/Exercise\
550
+ \ \n\nSets may be explored using (only) python. This is illustrated in the snippet\
551
+ \ below. \n\n<a href=\"#a314f450-54ad-4acd-bbf0-475e00ac5949\" class =\"btn btn-default\
552
+ \ Sagebutton\" data-toggle=\"collapse\"></a><div id=a314f450-54ad-4acd-bbf0-475e00ac5949\
553
+ \ class = \"collapse Sage envbuttons\"><div class=sagepython><script type=\"text/x-sage\"\
554
+ >\nX = {1, 2, 3}\nY = {2, 3, 1}\nprint(\"X=Y is \", X==Y)\n\nS = {'A','n','e','x','a','m','p','l','c','o','u','d','b','t','h','s','r','i'}\n\
555
+ print(\"S = \", S) \nprint(\"The number of elements in S is |S|=\", len(S))\n\
556
+ </script></div></div>\n\n\n\nExercise 1.44:\n\nCome up with three lines of Sage\
557
+ \ code that verifies $\\{1, 2, 3\\} \\neq \\{1, 2, 4\\}$. Try it out.\n\n/Exercise"
558
+ pipeline_tag: sentence-similarity
559
+ library_name: sentence-transformers
560
+ metrics:
561
+ - cosine_accuracy@1
562
+ - cosine_accuracy@3
563
+ - cosine_accuracy@5
564
+ - cosine_accuracy@10
565
+ - cosine_precision@1
566
+ - cosine_precision@3
567
+ - cosine_precision@5
568
+ - cosine_precision@10
569
+ - cosine_recall@1
570
+ - cosine_recall@3
571
+ - cosine_recall@5
572
+ - cosine_recall@10
573
+ - cosine_ndcg@3
574
+ - cosine_ndcg@5
575
+ - cosine_ndcg@10
576
+ - cosine_mrr@3
577
+ - cosine_mrr@5
578
+ - cosine_mrr@10
579
+ - cosine_map@100
580
+ model-index:
581
+ - name: SentenceTransformer based on intfloat/e5-small-v2
582
+ results:
583
+ - task:
584
+ type: information-retrieval
585
+ name: Information Retrieval
586
+ dataset:
587
+ name: Unknown
588
+ type: unknown
589
+ metrics:
590
+ - type: cosine_accuracy@1
591
+ value: 0.6908315565031983
592
+ name: Cosine Accuracy@1
593
+ - type: cosine_accuracy@3
594
+ value: 0.8347547974413646
595
+ name: Cosine Accuracy@3
596
+ - type: cosine_accuracy@5
597
+ value: 0.8880597014925373
598
+ name: Cosine Accuracy@5
599
+ - type: cosine_accuracy@10
600
+ value: 0.9253731343283582
601
+ name: Cosine Accuracy@10
602
+ - type: cosine_precision@1
603
+ value: 0.6908315565031983
604
+ name: Cosine Precision@1
605
+ - type: cosine_precision@3
606
+ value: 0.27825159914712155
607
+ name: Cosine Precision@3
608
+ - type: cosine_precision@5
609
+ value: 0.17761194029850746
610
+ name: Cosine Precision@5
611
+ - type: cosine_precision@10
612
+ value: 0.09253731343283582
613
+ name: Cosine Precision@10
614
+ - type: cosine_recall@1
615
+ value: 0.6908315565031983
616
+ name: Cosine Recall@1
617
+ - type: cosine_recall@3
618
+ value: 0.8347547974413646
619
+ name: Cosine Recall@3
620
+ - type: cosine_recall@5
621
+ value: 0.8880597014925373
622
+ name: Cosine Recall@5
623
+ - type: cosine_recall@10
624
+ value: 0.9253731343283582
625
+ name: Cosine Recall@10
626
+ - type: cosine_ndcg@3
627
+ value: 0.7763328209983278
628
+ name: Cosine Ndcg@3
629
+ - type: cosine_ndcg@5
630
+ value: 0.7980285423533605
631
+ name: Cosine Ndcg@5
632
+ - type: cosine_ndcg@10
633
+ value: 0.8099677414320194
634
+ name: Cosine Ndcg@10
635
+ - type: cosine_mrr@3
636
+ value: 0.7560412224591327
637
+ name: Cosine Mrr@3
638
+ - type: cosine_mrr@5
639
+ value: 0.7679282160625444
640
+ name: Cosine Mrr@5
641
+ - type: cosine_mrr@10
642
+ value: 0.7727815006599654
643
+ name: Cosine Mrr@10
644
+ - type: cosine_map@100
645
+ value: 0.7764057538430271
646
+ name: Cosine Map@100
647
+ - type: cosine_accuracy@1
648
+ value: 0.6663078579117331
649
+ name: Cosine Accuracy@1
650
+ - type: cosine_accuracy@3
651
+ value: 0.8116254036598493
652
+ name: Cosine Accuracy@3
653
+ - type: cosine_accuracy@5
654
+ value: 0.8697524219590959
655
+ name: Cosine Accuracy@5
656
+ - type: cosine_accuracy@10
657
+ value: 0.9117330462863293
658
+ name: Cosine Accuracy@10
659
+ - type: cosine_precision@1
660
+ value: 0.6663078579117331
661
+ name: Cosine Precision@1
662
+ - type: cosine_precision@3
663
+ value: 0.27054180121994975
664
+ name: Cosine Precision@3
665
+ - type: cosine_precision@5
666
+ value: 0.17395048439181915
667
+ name: Cosine Precision@5
668
+ - type: cosine_precision@10
669
+ value: 0.09117330462863295
670
+ name: Cosine Precision@10
671
+ - type: cosine_recall@1
672
+ value: 0.6663078579117331
673
+ name: Cosine Recall@1
674
+ - type: cosine_recall@3
675
+ value: 0.8116254036598493
676
+ name: Cosine Recall@3
677
+ - type: cosine_recall@5
678
+ value: 0.8697524219590959
679
+ name: Cosine Recall@5
680
+ - type: cosine_recall@10
681
+ value: 0.9117330462863293
682
+ name: Cosine Recall@10
683
+ - type: cosine_ndcg@3
684
+ value: 0.7519327635399075
685
+ name: Cosine Ndcg@3
686
+ - type: cosine_ndcg@5
687
+ value: 0.7761647660928707
688
+ name: Cosine Ndcg@5
689
+ - type: cosine_ndcg@10
690
+ value: 0.7896982949533798
691
+ name: Cosine Ndcg@10
692
+ - type: cosine_mrr@3
693
+ value: 0.7312522425547185
694
+ name: Cosine Mrr@3
695
+ - type: cosine_mrr@5
696
+ value: 0.7448690348044498
697
+ name: Cosine Mrr@5
698
+ - type: cosine_mrr@10
699
+ value: 0.7504190373673694
700
+ name: Cosine Mrr@10
701
+ - type: cosine_map@100
702
+ value: 0.7541642286378775
703
+ name: Cosine Map@100
704
+ ---
705
+
706
+ # SentenceTransformer based on intfloat/e5-small-v2
707
+
708
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [intfloat/e5-small-v2](https://huggingface.co/intfloat/e5-small-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
709
+
710
+ ## Model Details
711
+
712
+ ### Model Description
713
+ - **Model Type:** Sentence Transformer
714
+ - **Base model:** [intfloat/e5-small-v2](https://huggingface.co/intfloat/e5-small-v2) <!-- at revision ffb93f3bd4047442299a41ebb6fa998a38507c52 -->
715
+ - **Maximum Sequence Length:** 512 tokens
716
+ - **Output Dimensionality:** 384 dimensions
717
+ - **Similarity Function:** Cosine Similarity
718
+ <!-- - **Training Dataset:** Unknown -->
719
+ <!-- - **Language:** Unknown -->
720
+ <!-- - **License:** Unknown -->
721
+
722
+ ### Model Sources
723
+
724
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
725
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
726
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
727
+
728
+ ### Full Model Architecture
729
+
730
+ ```
731
+ SentenceTransformer(
732
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
733
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
734
+ (2): Normalize()
735
+ )
736
+ ```
737
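The `Pooling` and `Normalize` modules above can be mimicked in plain NumPy. This is a minimal sketch, with random toy token embeddings standing in for real transformer outputs, of attention-masked mean pooling (`pooling_mode_mean_tokens: True`) followed by L2 normalization:

```python
import numpy as np

# Toy inputs: batch of 2 sequences, 4 tokens each, 384-dim token embeddings
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 4, 384))
attention_mask = np.array([[1, 1, 1, 0],   # last token is padding
                           [1, 1, 1, 1]])

# Mean pooling over non-padding tokens only
mask = attention_mask[:, :, None]               # (2, 4, 1)
summed = (token_embeddings * mask).sum(axis=1)  # (2, 384)
counts = mask.sum(axis=1)                       # (2, 1)
sentence_embeddings = summed / counts

# L2 normalization, as done by the Normalize() module
norms = np.linalg.norm(sentence_embeddings, axis=1, keepdims=True)
sentence_embeddings = sentence_embeddings / norms

print(sentence_embeddings.shape)  # (2, 384)
```

Because of the final normalization step, every embedding this model produces has unit length.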
+
738
+ ## Usage
739
+
740
+ ### Direct Usage (Sentence Transformers)
741
+
742
+ First install the Sentence Transformers library:
743
+
744
+ ```bash
745
+ pip install -U sentence-transformers
746
+ ```
747
+
748
+ Then you can load this model and run inference.
749
+ ```python
750
+ from sentence_transformers import SentenceTransformer
751
+
752
+ # Download from the 🤗 Hub
753
+ model = SentenceTransformer("Krelle/e5-small-v2-imo-pairs")
754
+ # Run inference
755
+ sentences = [
756
+ 'In Exercise\u202f1.41 we are asked to find identities for \\((a+b)^3\\) and \\((a+b)^4\\). What are the correct expanded forms, and how do they relate to the binomial theorem?',
757
+ 'Section 1.6: More on sets\n\n\n\nPropositions are important, but are confined by the binary values\nof true and false. We would like to work mathematically with \nobjects like integers, floating point numbers, neural networks,\ncomputer programs and so on.\n\nSubsection 1.6.1: Objects and equality\n\n\n\nOne of the cornerstones of modern mathematics is\ndeciding when two objects are the same i.e.,\ngiven two objects $A$ and $B$, deciding whether\nthe proposition $A=B$ is true of false. Oftentimes\nan algorithm for evaluating $A=B$ is needed.\n\nYou may laugh here, but this is\nnot always that easy. Even though objects appear different they are the same as\nin, for example the propositions\n$$\n\\frac{105}{189} = \\frac{35}{63}\\qquad\\text{and}\\qquad \\sin\\left(\\frac{\\pi}{2}\\right) = 1.\n$$\nThe first proposition above is an identity of fractions (rational numbers). The second is\nan identity, which calls for knowledge of the sine function and real numbers. Each of these\nidentities calls for some rather advanced mathematics. The first proposition is true in\na very precise way, since $105\\cdot 63 = 189 \\cdot 35$.\n\n\nExercise 1.40:\n\n\n\n\nUse the Sage window above to reason \nabout equality in the quiz below. In each case describe the objects i.e.,\nare they numbers, symbols, etc.? Also, please check your computations\nby hand with the old fashioned paper and pencil, especially $(a+b)(a-b)$.\n\n\\begin{quiz}\n\\question\nClick on the right equalities below.\n\\answer{T}\n$$a + b - 2 b = a - b$$\n\\answer{F}\n$$(a+b)^2 = a^2 + b^2$$\n\\answer{T}\n$$(a + b)(a - b) = a^2 - b^2$$\n\\answer{T}\n$$(a + b)^2 = a^2 + 2 a b + b^2$$\n\\answer{F}\n$$(a+b)^3 = a^3 + 2 a^2 b + 2 a b^2 + b^3$$\n\\answer{F}\n$$\\frac{3}{8} = \\frac{5}{13}$$ \n\\answer{F}\n$$\n\\pi = \\frac{22}{7}\n$$\n\\answer{T}\n$$\n\\cos^2(\\pi) + \\sin^2(\\pi) = 1\n$$\n\\end{quiz}\n\n/Exercise\n\n\nExercise 1.41:\n\nYou know that $(a+ b)^2 = a^2 + 2 a b + b^2$. 
Use Sage to find a similar identities\nfor $(a + b)^3$ and $(a + b)^4$.\n\n\\begin{hint}\n Go back and look at (the beginning of) Exercise (1.40).\n\\end{hint}\n\n/Exercise\n\nFor two objects $A$ and $B$ we will use the notation $A \\neq B$ for the proposition $\\neg (A = B)$.\n\nWe have already defined a set (informally) as a collection of distinct objects or *elements*.\nWe introduce some more set theory here.\nA set\nis also an object as described in section (1.6.1) and it makes sense to\nask when two sets are equal.\n\n\nDefinition 1.42:\n\nTwo sets $A$ and $B$ are equal i.e., $A = B$ if they contain the same elements.\n\n/Definition\n\nAn example of a set could be \nthe set $\\{1,2,3\\}$ of natural numbers between $0$ and $4$. Notice again that we use the symbol\n"$\\{$" to start the listing of elements in a set and the symbol "$\\}$" to denote the end of the listing.\nNotice also that (by our definition of equality between sets), the order of the elements in the listing does not matter i.e.,\n$$\n\\{1, 2, 3\\} = \\{2, 3, 1\\}.\n$$\nWe are also not allowing duplicates like for\nexample in the listing $\\{1, 2, 2, 3, 3, 3\\}$ (such a thing is called a multiset: https://en.m.wikipedia.org/wiki/Multiset).\n\nAn example of a set not involving numbers could be the set of letters \n$$\nS=\\{A, n, e, x, a, m, p, l, c, o, u, d, b, t, h, s, r, i\\}\n$$ \nused in this sentence. The number of elements in a set $S$ is called the *cardinality* of the set.\nWe will denote it by $|S|$.\n\nTo convince someone beyond a doubt (we will talk about this formally later in this chapter) that two sets $A$ and $B$ are equal, one needs to argue that if $x$ is an element of $A$, then $x$ is an element of $B$ and the other way round, if $y$ is an element of $B$, then $y$ is an element of $A$. 
If this is true, then\n$A$ and $B$ must contain the same elements.\n\n\nExercise 1.43:\n\nGive a precise reason as to why the two sets $\\{1, 2, 3\\}$ and $\\{1, 2, 4\\}$ are not equal.\nIs it possible for a set with $5$ elements to be equal to a set with $7$ elements?\n\n/Exercise \n\nSets may be explored using (only) python. This is illustrated in the snippet below. \n\n<a href="#a314f450-54ad-4acd-bbf0-475e00ac5949" class ="btn btn-default Sagebutton" data-toggle="collapse"></a><div id=a314f450-54ad-4acd-bbf0-475e00ac5949 class = "collapse Sage envbuttons"><div class=sagepython><script type="text/x-sage">\nX = {1, 2, 3}\nY = {2, 3, 1}\nprint("X=Y is ", X==Y)\n\nS = {\'A\',\'n\',\'e\',\'x\',\'a\',\'m\',\'p\',\'l\',\'c\',\'o\',\'u\',\'d\',\'b\',\'t\',\'h\',\'s\',\'r\',\'i\'}\nprint("S = ", S) \nprint("The number of elements in S is |S|=", len(S))\n</script></div></div>\n\n\n\nExercise 1.44:\n\nCome up with three lines of Sage code that verifies $\\{1, 2, 3\\} \\neq \\{1, 2, 4\\}$. Try it out.\n\n/Exercise',
758
+ 'Chapter 1 on the language of mathematics is an introduction to the fundamental mathematics used in the notes.\nWithout understanding the basic concepts in it, you do not have the background to understand\nthe rest of the notes. Important highlights from the chapter are\n\n- Introduction to prompting. This is your ticket to using large language models effectively\n- How to use computer algebra (Sage). Sage can be very helpful in understanding the mathematics\n- Introduction of the numbers we use. Here the natural numbers, integers, rationals and real numbers are defined. Also the arithmetic rules for using them are given\n- Logic is the framework for reasoning in mathematics. Study this! First comes propositional logic. This is basic logic involving true and false statements with and, or etc as seen in truth tables. Then comes predicate logic, where variables are used. Here you must learn the meaning of "for every" and "there exists"\n- Proofs are described. Proof by contradiction is a must here! Do not skip it\n- The language of sets. Learn the operations on sets. Especially focus on the set builder notation and products of sets\n- Ordering of numbers. This is the formal definition of comparing numbers\n- Proof by induction. How to prove infinitely many propositions involving the natural numbers with one hack\n- The concept of a function. This is extremely important. Notice that a function is defined not by a rule. Also, in its definition enters crucially where it is defined\n- Functions from and into products\n- The preimage. This will become very important working with continuous functions',
759
+ ]
760
+ embeddings = model.encode(sentences)
761
+ print(embeddings.shape)
762
+ # [3, 384]
763
+
764
+ # Get the similarity scores for the embeddings
765
+ similarities = model.similarity(embeddings, embeddings)
766
+ print(similarities)
767
+ # tensor([[1.0000, 0.3078, 0.0796],
768
+ # [0.3078, 1.0000, 0.2794],
769
+ # [0.0796, 0.2794, 1.0000]])
770
+ ```
771
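Since the model's last module is `Normalize()`, the cosine similarities returned by `model.similarity` reduce to a plain matrix product of unit vectors. A small sketch with made-up, already-normalized 4-dimensional vectors (real embeddings are 384-dimensional):

```python
import numpy as np

# Made-up unit-length "embeddings" for three sentences
emb = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.6, 0.8, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0]])

# For unit vectors, cosine similarity is just the dot product
similarities = emb @ emb.T

print(np.round(similarities, 2))
# [[1.  0.6 0. ]
#  [0.6 1.  0. ]
#  [0.  0.  1. ]]
```

The diagonal is always 1 (each sentence is maximally similar to itself), mirroring the tensor printed in the usage example above.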
+
772
+ <!--
773
+ ### Direct Usage (Transformers)
774
+
775
+ <details><summary>Click to see the direct usage in Transformers</summary>
776
+
777
+ </details>
778
+ -->
779
+
780
+ <!--
781
+ ### Downstream Usage (Sentence Transformers)
782
+
783
+ You can finetune this model on your own dataset.
784
+
785
+ <details><summary>Click to expand</summary>
786
+
787
+ </details>
788
+ -->
789
+
790
+ <!--
791
+ ### Out-of-Scope Use
792
+
793
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
794
+ -->
795
+
796
+ ## Evaluation
797
+
798
+ ### Metrics
799
+
800
+ #### Information Retrieval
801
+
802
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
803
+ ```json
804
+ {
805
+ "query_prompt": "query:",
806
+ "corpus_prompt": "passage:"
807
+ }
808
+ ```
809
+
810
+ | Metric | Value |
811
+ |:--------------------|:---------|
812
+ | cosine_accuracy@1 | 0.6908 |
813
+ | cosine_accuracy@3 | 0.8348 |
814
+ | cosine_accuracy@5 | 0.8881 |
815
+ | cosine_accuracy@10 | 0.9254 |
816
+ | cosine_precision@1 | 0.6908 |
817
+ | cosine_precision@3 | 0.2783 |
818
+ | cosine_precision@5 | 0.1776 |
819
+ | cosine_precision@10 | 0.0925 |
820
+ | cosine_recall@1 | 0.6908 |
821
+ | cosine_recall@3 | 0.8348 |
822
+ | cosine_recall@5 | 0.8881 |
823
+ | cosine_recall@10 | 0.9254 |
824
+ | cosine_ndcg@3 | 0.7763 |
825
+ | cosine_ndcg@5 | 0.798 |
826
+ | **cosine_ndcg@10** | **0.81** |
827
+ | cosine_mrr@3 | 0.756 |
828
+ | cosine_mrr@5 | 0.7679 |
829
+ | cosine_mrr@10 | 0.7728 |
830
+ | cosine_map@100 | 0.7764 |
831
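For reference, the accuracy@k and MRR@k values above follow the standard definitions. A toy sketch, using hypothetical ranks and assuming exactly one relevant passage per query (as in this dataset, where each anchor has a single positive):

```python
import numpy as np

# Hypothetical 1-based ranks of the relevant passage for 4 queries
ranks = np.array([1, 3, 2, 7])

def accuracy_at_k(ranks, k):
    # Fraction of queries whose relevant passage appears in the top k
    return float(np.mean(ranks <= k))

def mrr_at_k(ranks, k):
    # Mean reciprocal rank, counting 0 when the hit falls outside the top k
    rr = np.where(ranks <= k, 1.0 / ranks, 0.0)
    return float(rr.mean())

print(accuracy_at_k(ranks, 1))   # 0.25
print(accuracy_at_k(ranks, 3))   # 0.75
print(mrr_at_k(ranks, 10))       # (1 + 1/3 + 1/2 + 1/7) / 4
```

With a single relevant passage per query, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k, which is why those rows coincide in the table (e.g. 0.8348 / 3 ≈ 0.2783).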
+
832
+ #### Information Retrieval
833
+
834
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
835
+ ```json
836
+ {
837
+ "query_prompt": "query:",
838
+ "corpus_prompt": "passage:"
839
+ }
840
+ ```
841
+
842
+ | Metric | Value |
843
+ |:--------------------|:-----------|
844
+ | cosine_accuracy@1 | 0.6663 |
845
+ | cosine_accuracy@3 | 0.8116 |
846
+ | cosine_accuracy@5 | 0.8698 |
847
+ | cosine_accuracy@10 | 0.9117 |
848
+ | cosine_precision@1 | 0.6663 |
849
+ | cosine_precision@3 | 0.2705 |
850
+ | cosine_precision@5 | 0.174 |
851
+ | cosine_precision@10 | 0.0912 |
852
+ | cosine_recall@1 | 0.6663 |
853
+ | cosine_recall@3 | 0.8116 |
854
+ | cosine_recall@5 | 0.8698 |
855
+ | cosine_recall@10 | 0.9117 |
856
+ | cosine_ndcg@3 | 0.7519 |
857
+ | cosine_ndcg@5 | 0.7762 |
858
+ | **cosine_ndcg@10** | **0.7897** |
859
+ | cosine_mrr@3 | 0.7313 |
860
+ | cosine_mrr@5 | 0.7449 |
861
+ | cosine_mrr@10 | 0.7504 |
862
+ | cosine_map@100 | 0.7542 |
863
+
864
+ <!--
865
+ ## Bias, Risks and Limitations
866
+
867
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
868
+ -->
869
+
870
+ <!--
871
+ ### Recommendations
872
+
873
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
874
+ -->
875
+
876
+ ## Training Details
877
+
878
+ ### Training Dataset
879
+
880
+ #### Unnamed Dataset
881
+
882
+ * Size: 2,778 training samples
883
+ * Columns: <code>anchor</code> and <code>positive</code>
884
+ * Approximate statistics based on the first 1000 samples:
885
+ | | anchor | positive |
886
+ |:--------|:------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
887
+ | type | string | string |
888
+ | details | <ul><li>min: 14 tokens</li><li>mean: 41.25 tokens</li><li>max: 125 tokens</li></ul> | <ul><li>min: 37 tokens</li><li>mean: 351.42 tokens</li><li>max: 512 tokens</li></ul> |
889
+ * Samples:
890
+ | anchor | positive |
891
+ |:-------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
892
+ | <code>In Definition 8.2, why is the Hessian matrix defined with second partial derivatives evaluated at the point \(v\)?</code> | <code>Definition 8.2:<br><br><br>The *Hessian matrix* of $F$ at the point<br>$v\in \mathbb{R}^n$ is defined by<br><br>$$<br> \nabla^2 F(v) :=<br> \begin{pmatrix}<br> \dfrac{ \partial^2 F}{ \partial x_1 \partial x_1}(v) &<br> \cdots & \dfrac{ \partial^2 F}{ \partial x_1 \partial<br> x_n}(v)<br> \\<br> \vdots & \ddots & \vdots<br> \\<br> \dfrac{ \partial^2 F}{ \partial x_n \partial x_1}(v) &<br> \cdots & \dfrac{\partial^2 F}{ \partial x_n\partial<br> x_n}(v)<br> \end{pmatrix}<br> .<br><br>$$<br><br>/Definition<br><br>A very important observation is that $\nabla^2 F(v)$ above is a<br>symmetric matrix if $F$ satisfies the condition in the last part of Theorem 7.13.</code> |
893
+ | <code>The definition shows the entry \(\frac{\partial^2 F}{\partial x_i \partial x_j}(v)\). Does the order of differentiation matter for the Hessian?</code> | <code>Definition 8.2:<br><br><br>The *Hessian matrix* of $F$ at the point<br>$v\in \mathbb{R}^n$ is defined by<br><br>$$<br> \nabla^2 F(v) :=<br> \begin{pmatrix}<br> \dfrac{ \partial^2 F}{ \partial x_1 \partial x_1}(v) &<br> \cdots & \dfrac{ \partial^2 F}{ \partial x_1 \partial<br> x_n}(v)<br> \\<br> \vdots & \ddots & \vdots<br> \\<br> \dfrac{ \partial^2 F}{ \partial x_n \partial x_1}(v) &<br> \cdots & \dfrac{\partial^2 F}{ \partial x_n\partial<br> x_n}(v)<br> \end{pmatrix}<br> .<br><br>$$<br><br>/Definition<br><br>A very important observation is that $\nabla^2 F(v)$ above is a<br>symmetric matrix if $F$ satisfies the condition in the last part of Theorem 7.13.</code> |
894
+ | <code>The text says the Hessian is symmetric if \(F\) satisfies the condition in the last part of Theorem 7.13. What is that condition exactly?</code> | <code>Definition 8.2:<br><br><br>The *Hessian matrix* of $F$ at the point<br>$v\in \mathbb{R}^n$ is defined by<br><br>$$<br> \nabla^2 F(v) :=<br> \begin{pmatrix}<br> \dfrac{ \partial^2 F}{ \partial x_1 \partial x_1}(v) &<br> \cdots & \dfrac{ \partial^2 F}{ \partial x_1 \partial<br> x_n}(v)<br> \\<br> \vdots & \ddots & \vdots<br> \\<br> \dfrac{ \partial^2 F}{ \partial x_n \partial x_1}(v) &<br> \cdots & \dfrac{\partial^2 F}{ \partial x_n\partial<br> x_n}(v)<br> \end{pmatrix}<br> .<br><br>$$<br><br>/Definition<br><br>A very important observation is that $\nabla^2 F(v)$ above is a<br>symmetric matrix if $F$ satisfies the condition in the last part of Theorem 7.13.</code> |
895
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
896
+ ```json
897
+ {
898
+ "scale": 20.0,
899
+ "similarity_fct": "cos_sim",
900
+ "gather_across_devices": false
901
+ }
902
+ ```
903
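MultipleNegativesRankingLoss uses the other positives in the batch as negatives: for each anchor, it applies cross-entropy over scaled cosine similarities against every positive, with the matching positive as the correct "class". A minimal NumPy sketch of that objective, with random unit vectors standing in for anchor/positive embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# Random stand-ins for a batch of 4 anchor/positive embedding pairs (dim 8)
anchors = normalize(rng.normal(size=(4, 8)))
positives = normalize(rng.normal(size=(4, 8)))

scale = 20.0  # the "scale" parameter above
logits = scale * (anchors @ positives.T)  # scaled cosine similarities

# Numerically stable log-softmax over each row of logits
shifted = logits - logits.max(axis=1, keepdims=True)
log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

# Cross-entropy: anchor i should rank positive i above the other positives
loss = -np.mean(np.diag(log_softmax))
print(loss)
```

Larger batches give more in-batch negatives and typically a harder, more informative objective, which is why this loss pairs well with large batch sizes.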
+
904
+ ### Evaluation Dataset
905
+
906
+ #### Unnamed Dataset
907
+
908
+ * Size: 929 evaluation samples
909
+ * Columns: <code>anchor</code> and <code>positive</code>
910
+ * Approximate statistics based on the first 929 samples:
911
+ | | anchor | positive |
912
+ |:--------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
913
+ | type | string | string |
914
+ | details | <ul><li>min: 6 tokens</li><li>mean: 29.95 tokens</li><li>max: 96 tokens</li></ul> | <ul><li>min: 36 tokens</li><li>mean: 383.98 tokens</li><li>max: 512 tokens</li></ul> |
915
+ * Samples:
916
+ | anchor | positive |
917
+ |:---------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | <code>In Section 1.1, why does the author warn that prompting without any knowledge of the mathematics can be disastrous?</code> | <code>Chapter 1: The language of mathematics and prompting<br><br><br><br>Section 1.1: The art of prompting<br><br><br><br>As of August 2024, there is a multitude of chatbots available on the internet. Some of them, like<br>ChatGPT: https://chatgpt.com, Claude: https://claude.ai and Gemini: https://gemini.google.com (and Llama 3.1, Mistral, ... the list goes on)<br><br><br><br>have quite impressive reasoning capabilities. <br>These models are now multimodal i.e., they <br>even accept non-textual input, such as images, sound and video. In principle you can upload a picture of a math exercise and<br>the chatbot will provide a solution. Well, that is, on a good day and for a not too difficult exercise.<br><br>The use of chatbots is encouraged throughout this course. In fact,<br>they are even allowed during the exam. It is my hope that you will<br>learn mathematics on a deeper level by communicating with the machine<br>using carefully designed prompts - see <br>the OpenAI guide: https://platform.openai.com/docs/guides/prompt-engineering on prompt engineering.<br>...</code> |
+ | <code>The first prompting block asks for "two examples of good prompts"—how should I include LaTeX code in such a prompt according to the example?</code> | <code>Chapter 1: The language of mathematics and prompting<br><br><br><br>Section 1.1: The art of prompting<br><br><br><br>As of August 2024, there is a multitude of chatbots available on the internet. Some of them, like<br>ChatGPT: https://chatgpt.com, Claude: https://claude.ai and Gemini: https://gemini.google.com (and Llama 3.1, Mistral, ... the list goes on)<br><br><br><br>have quite impressive reasoning capabilities. <br>These models are now multimodal i.e., they <br>even accept non-textual input, such as images, sound and video. In principle you can upload a picture of a math exercise and<br>the chatbot will provide a solution. Well, that is, on a good day and for a not too difficult exercise.<br><br>The use of chatbots is encouraged throughout this course. In fact,<br>they are even allowed during the exam. It is my hope that you will<br>learn mathematics on a deeper level by communicating with the machine<br>using carefully designed prompts - see <br>the OpenAI guide: https://platform.openai.com/docs/guides/prompt-engineering on prompt engineering.<br>...</code> |
+ | <code>In the second prompting block, the equation $x^2 - x - 1 = 0$ is given; what level of detail does "Guide me through the steps" expect from the chatbot?</code> | <code>Chapter 1: The language of mathematics and prompting<br><br><br><br>Section 1.1: The art of prompting<br><br><br><br>As of August 2024, there is a multitude of chatbots available on the internet. Some of them, like<br>ChatGPT: https://chatgpt.com, Claude: https://claude.ai and Gemini: https://gemini.google.com (and Llama 3.1, Mistral, ... the list goes on)<br><br><br><br>have quite impressive reasoning capabilities. <br>These models are now multimodal i.e., they <br>even accept non-textual input, such as images, sound and video. In principle you can upload a picture of a math exercise and<br>the chatbot will provide a solution. Well, that is, on a good day and for a not too difficult exercise.<br><br>The use of chatbots is encouraged throughout this course. In fact,<br>they are even allowed during the exam. It is my hope that you will<br>learn mathematics on a deeper level by communicating with the machine<br>using carefully designed prompts - see <br>the OpenAI guide: https://platform.openai.com/docs/guides/prompt-engineering on prompt engineering.<br>...</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+ ```json
+ {
+ "scale": 20.0,
+ "similarity_fct": "cos_sim",
+ "gather_across_devices": false
+ }
+ ```
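For readers unfamiliar with this loss, here is a minimal pure-Python sketch of the in-batch-negatives objective (an illustrative helper, not the library implementation): each anchor is scored against every positive in the batch, the matching positive is the target class, and cross-entropy is taken over the scale-multiplied cosine similarities.

```python
import math

def mnrl_loss(sim_matrix, scale=20.0):
    """Sketch of MultipleNegativesRankingLoss over one batch.

    sim_matrix[i][j] is the cosine similarity between anchor i and
    positive j; the diagonal entries are the true pairs, and the other
    columns in each row act as in-batch negatives. Returns the mean
    cross-entropy of the scaled similarities."""
    total = 0.0
    for i, row in enumerate(sim_matrix):
        logits = [scale * s for s in row]
        m = max(logits)  # subtract max for numerical stability
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]
    return total / len(sim_matrix)
```

With a perfect diagonal similarity matrix the loss is near zero; with uniform similarities it reduces to log(batch size), which is why larger batches make the task harder and usually help the embeddings.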
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 32
+ - `learning_rate`: 2e-05
+ - `num_train_epochs`: 8
+ - `warmup_ratio`: 0.1
+ - `fp16`: True
+ - `load_best_model_at_end`: True
+ - `prompts`: {'anchor': 'query:', 'positive': 'passage:', 'negative': 'passage:'}
+ - `batch_sampler`: no_duplicates
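The learning-rate schedule implied by `lr_scheduler_type: linear` together with `warmup_ratio: 0.1` can be sketched as follows (a hypothetical helper mirroring the Transformers linear schedule, not the training code itself):

```python
def lr_at_step(step, total_steps, base_lr=2e-05, warmup_ratio=0.1):
    """Linear warmup to base_lr over the first warmup_ratio of training,
    then linear decay to zero. Illustrative only."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
```

For example, with 100 total steps the rate climbs from 0 to 2e-05 over the first 10 steps, then decays linearly, reaching half of the peak at the midpoint of the decay phase.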
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 32
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 2e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 8
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `bf16`: False
+ - `fp16`: True
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: True
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `parallelism_config`: None
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch_fused
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `project`: huggingface
+ - `trackio_space_id`: trackio
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `hub_revision`: None
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: no
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `liger_kernel_config`: None
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: True
+ - `prompts`: {'anchor': 'query:', 'positive': 'passage:', 'negative': 'passage:'}
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+ - `router_mapping`: {}
+ - `learning_rate_mapping`: {}
+
+ </details>
+
+ ### Training Logs
+ | Epoch | Step | Training Loss | Validation Loss | cosine_ndcg@10 |
+ |:----------:|:-------:|:-------------:|:---------------:|:--------------:|
+ | -1 | -1 | - | - | 0.4709 |
+ | 1.1494 | 100 | 1.2817 | 0.7786 | 0.7818 |
+ | 2.2989 | 200 | 0.3207 | 0.7569 | 0.7762 |
+ | 3.4483 | 300 | 0.2454 | 0.7324 | 0.7823 |
+ | **4.5977** | **400** | **0.1875** | **0.7012** | **0.7948** |
+ | 5.7471 | 500 | 0.1479 | 0.7016 | 0.7897 |
+ | 6.8966 | 600 | 0.1325 | 0.6992 | 0.7897 |
+ | -1 | -1 | - | - | 0.8100 |
+
+ * The bold row denotes the saved checkpoint.
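The `cosine_ndcg@10` column reports normalised discounted cumulative gain over the top ten retrieved passages, ranked by cosine similarity. A minimal sketch of the metric (a hypothetical helper, taking relevance grades in retrieved order):

```python
import math

def ndcg_at_10(ranked_relevances):
    """DCG with a log2 position discount over the top 10 results,
    normalised by the DCG of the ideal (sorted) ordering."""
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:10]))
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0
```

A score of 1.0 means every relevant passage is ranked at the top; placing the single relevant passage second instead of first already drops the score to about 0.63, so the 0.81 final value reflects most positives being retrieved near rank 1.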
+
+ ### Framework Versions
+ - Python: 3.12.12
+ - Sentence Transformers: 5.1.2
+ - Transformers: 4.57.1
+ - PyTorch: 2.8.0+cu126
+ - Accelerate: 1.11.0
+ - Datasets: 4.0.0
+ - Tokenizers: 0.22.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+ author = "Reimers, Nils and Gurevych, Iryna",
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+ month = "11",
+ year = "2019",
+ publisher = "Association for Computational Linguistics",
+ url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+ year={2017},
+ eprint={1705.00652},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+ "architectures": [
+ "BertModel"
+ ],
+ "attention_probs_dropout_prob": 0.1,
+ "classifier_dropout": null,
+ "dtype": "float32",
+ "hidden_act": "gelu",
+ "hidden_dropout_prob": 0.1,
+ "hidden_size": 384,
+ "initializer_range": 0.02,
+ "intermediate_size": 1536,
+ "layer_norm_eps": 1e-12,
+ "max_position_embeddings": 512,
+ "model_type": "bert",
+ "num_attention_heads": 12,
+ "num_hidden_layers": 12,
+ "pad_token_id": 0,
+ "position_embedding_type": "absolute",
+ "transformers_version": "4.57.1",
+ "type_vocab_size": 2,
+ "use_cache": true,
+ "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
+ {
+ "model_type": "SentenceTransformer",
+ "__version__": {
+ "sentence_transformers": "5.1.2",
+ "transformers": "4.57.1",
+ "pytorch": "2.8.0+cu126"
+ },
+ "prompts": {
+ "query": "",
+ "document": ""
+ },
+ "default_prompt_name": null,
+ "similarity_fn_name": "cosine"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:33349852bb867909c7611815a0fca713f0cb10b20516eb1164474ae519b30fd3
+ size 133462128
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+ {
+ "idx": 0,
+ "name": "0",
+ "path": "",
+ "type": "sentence_transformers.models.Transformer"
+ },
+ {
+ "idx": 1,
+ "name": "1",
+ "path": "1_Pooling",
+ "type": "sentence_transformers.models.Pooling"
+ },
+ {
+ "idx": 2,
+ "name": "2",
+ "path": "2_Normalize",
+ "type": "sentence_transformers.models.Normalize"
+ }
+ ]
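The three modules above form a pipeline: the Transformer produces per-token vectors, the Pooling module mean-pools them over non-padding tokens (per `1_Pooling/config.json`), and the Normalize module L2-normalises the result so that dot product equals cosine similarity. A dependency-free sketch of the last two stages (illustrative only, not the library code):

```python
import math

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Mean-pool token vectors over positions where the attention mask
    is 1, then L2-normalise the pooled vector."""
    dim = len(token_embeddings[0])
    n = sum(attention_mask)
    pooled = [
        sum(tok[d] for tok, m in zip(token_embeddings, attention_mask) if m) / n
        for d in range(dim)
    ]
    norm = math.sqrt(sum(x * x for x in pooled)) or 1.0
    return [x / norm for x in pooled]
```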
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+ "max_seq_length": 512,
+ "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+ "cls_token": {
+ "content": "[CLS]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "mask_token": {
+ "content": "[MASK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "[PAD]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "sep_token": {
+ "content": "[SEP]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "unk_token": {
+ "content": "[UNK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,56 @@
+ {
+ "added_tokens_decoder": {
+ "0": {
+ "content": "[PAD]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "100": {
+ "content": "[UNK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "101": {
+ "content": "[CLS]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "102": {
+ "content": "[SEP]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "103": {
+ "content": "[MASK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "clean_up_tokenization_spaces": true,
+ "cls_token": "[CLS]",
+ "do_lower_case": true,
+ "extra_special_tokens": {},
+ "mask_token": "[MASK]",
+ "model_max_length": 512,
+ "pad_token": "[PAD]",
+ "sep_token": "[SEP]",
+ "strip_accents": null,
+ "tokenize_chinese_chars": true,
+ "tokenizer_class": "BertTokenizer",
+ "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff