Update README.md
README.md CHANGED
@@ -20,7 +20,7 @@ library_name: transformers
 
 This is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for tasks like clustering or semantic search.
 
-This is a fast and small model for determining the similarity between sentences; in the future we will make it smaller and faster. [Project](https://github.com/FractalGPT/
+This is a fast and small model for determining the similarity between sentences; in the future we will make it smaller and faster. [Project](https://github.com/FractalGPT/ModelEmbedderDistillation)
 
 <!--- Describe your model here -->
 
@@ -66,9 +66,9 @@ cos(a, b)
 * The original weights were taken from [cointegrated/rubert-tiny2](https://huggingface.co/cointegrated/rubert-tiny2).
 * Training was conducted in two stages:
 1. In the first stage, the model was trained on Wikipedia texts (4 million texts) for three epochs.
-<img src="https://github.com/FractalGPT/
+<img src="https://github.com/FractalGPT/ModelEmbedderDistillation/blob/main/DistilSBERT/Train/1_st_en.JPG?raw=true" width=700 />
 2. In the second stage, training was conducted on Wikipedia and a dialog dataset for one epoch.
-<img src="https://github.com/FractalGPT/
+<img src="https://github.com/FractalGPT/ModelEmbedderDistillation/blob/main/DistilSBERT/Train/2_st_en.JPG?raw=true" width=700 />
 
 ## Full Model Architecture
 ```
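The second hunk's header references `cos(a, b)`, the cosine-similarity comparison applied to the model's 384-dimensional sentence embeddings. As a minimal sketch of that comparison (the vectors below are synthetic stand-ins, not real encoder output, and `cos_sim` is a hypothetical helper, not part of this repository):

```python
import numpy as np

def cos_sim(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two dense embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Synthetic 384-dimensional vectors standing in for sentence embeddings.
rng = np.random.default_rng(0)
a = rng.normal(size=384)
b = rng.normal(size=384)

print(cos_sim(a, a))  # a vector compared with itself -> 1.0 (up to float error)
print(cos_sim(a, b))  # some value in [-1.0, 1.0]
```

In practice the two vectors would come from encoding two sentences with the model; sentences that are close in meaning yield a cosine similarity near 1.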