Update README.md
README.md
CHANGED
@@ -11,8 +11,9 @@ tags:
 # Description
 We use MS Marco Encoder msmarco-MiniLM-L-6-v3 to encode the text from dataset [abokbot/wikipedia-first-paragraph](https://huggingface.co/datasets/abokbot/wikipedia-first-paragraph).
 
-
+The dataset contains the first paragraphs of the English "20220301.en" version of the [Wikipedia dataset](https://huggingface.co/datasets/wikipedia).
 
+The output is an embedding tensor of size [6458670, 384].
 
 # Code
 It was obtained by running the following code.