rvo committed on
Commit 19a045c · verified · 1 Parent(s): 46ce54b

Upload README.md

  1. README.md +67 -0
README.md CHANGED
@@ -112,6 +112,73 @@ for i, query in enumerate(queries):
  # Similarity: 0.4238 | Document 0: Machine learning is a subset of ...
  # Similarity: 0.5723 | Document 1: Neural networks are trained ...
  ```
+
+ ## Transformers.js Usage
+
+ If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+ ```bash
+ npm i @huggingface/transformers
+ ```
+
+ You can then use the model to compute embeddings like this:
+
+ ```js
+ import { AutoModel, AutoTokenizer, matmul } from "@huggingface/transformers";
+
+ // Download from the 🤗 Hub
+ const model_id = "onnx-community/mdbr-leaf-ir-ONNX";
+ const tokenizer = await AutoTokenizer.from_pretrained(model_id);
+ const model = await AutoModel.from_pretrained(model_id, {
+   dtype: "fp32", // Options: "fp32" | "q8" | "q4"
+ });
+
+ // Prepare queries and documents
+ const queries = [
+   "What is machine learning?",
+   "How does neural network training work?",
+ ];
+ const documents = [
+   "Machine learning is a subset of artificial intelligence that focuses on algorithms that can learn from data.",
+   "Neural networks are trained through backpropagation, adjusting weights to minimize prediction errors.",
+ ];
+ const inputs = await tokenizer([
+   ...queries.map((x) => "Represent this sentence for searching relevant passages: " + x),
+   ...documents,
+ ], { padding: true });
+
+ // Generate embeddings
+ const { sentence_embedding } = await model(inputs);
+
+ // Compute similarities
+ const scores = await matmul(
+   sentence_embedding.slice([0, queries.length]),
+   sentence_embedding.slice([queries.length, null]).transpose(1, 0),
+ );
+ const scores_list = scores.tolist();
+
+ for (let i = 0; i < queries.length; ++i) {
+   console.log(`Query: ${queries[i]}`);
+   for (let j = 0; j < documents.length; ++j) {
+     console.log(` Similarity: ${scores_list[i][j].toFixed(4)} | Document ${j}: ${documents[j]}`);
+   }
+   console.log();
+ }
+ ```
+
+ <details>
+
+ <summary>See example output</summary>
+
+ ```
+ Query: What is machine learning?
+  Similarity: 0.6857 | Document 0: Machine learning is a subset of artificial intelligence that focuses on algorithms that can learn from data.
+  Similarity: 0.4598 | Document 1: Neural networks are trained through backpropagation, adjusting weights to minimize prediction errors.
+
+ Query: How does neural network training work?
+  Similarity: 0.4238 | Document 0: Machine learning is a subset of artificial intelligence that focuses on algorithms that can learn from data.
+  Similarity: 0.5723 | Document 1: Neural networks are trained through backpropagation, adjusting weights to minimize prediction errors.
+ ```
+ </details>
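
As a rough illustration of what the `matmul` call in the Transformers.js example computes, the sketch below builds the same query-by-document score matrix in plain JavaScript, with no library involved. It assumes the embeddings are L2-normalized, so that each dot product equals a cosine similarity; the README does not state this, so treat it as an assumption. The helper names (`dot`, `normalize`, `similarityMatrix`) and the toy vectors are hypothetical and are not part of Transformers.js or of the model's output.

```javascript
// Hypothetical helper: dot product of two equal-length vectors.
function dot(a, b) {
  let sum = 0;
  for (let k = 0; k < a.length; ++k) sum += a[k] * b[k];
  return sum;
}

// Hypothetical helper: scale a vector to unit L2 norm.
function normalize(v) {
  const norm = Math.sqrt(dot(v, v));
  return v.map((x) => x / norm);
}

// Hypothetical helper: scores[i][j] is the dot product of query i and
// document j, i.e. the same values as multiplying the query matrix by the
// transposed document matrix.
function similarityMatrix(queryEmbeddings, documentEmbeddings) {
  return queryEmbeddings.map((q) => documentEmbeddings.map((d) => dot(q, d)));
}

// Made-up 3-dimensional vectors, for illustration only (not model output).
const toyQueries = [[1, 0, 0], [0, 1, 1]].map(normalize);
const toyDocuments = [[1, 1, 0], [0, 0, 2]].map(normalize);

// One row per query, one column per document.
const toyScores = similarityMatrix(toyQueries, toyDocuments);
console.log(toyScores);
```

Row `i` of the result plays the role of `scores_list[i]` in the example: the highest value in a row points at the document that is closest to that query under this measure.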
 
  ## Transformers Usage