Update README.md
README.md (CHANGED)
````diff
@@ -4,12 +4,22 @@ tags:
 - sentence-transformers
 - feature-extraction
 - sentence-similarity
-
+license: apache-2.0
+datasets:
+- wikimedia/wikipedia
+- SiberiaSoft/SiberianPersonaChat-2
+language:
+- ru
+- en
+metrics:
+- mse
+library_name: transformers
 ---
 
 # FractalGPT/SberDistil
 
 This is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for tasks like clustering or semantic search.
+This is a fast, small model for estimating the similarity between sentences; we plan to shrink and speed it up further. [Project](https://github.com/FractalGPT/ModelEmbedderDistilation)
 
 <!--- Describe your model here -->
 
@@ -32,15 +42,14 @@ embeddings = model.encode(sentences)
 print(embeddings)
 ```
 
+## Training
 
-
-
-
-
-
-
-
-
+* The original weights were taken from [cointegrated/rubert-tiny2](https://huggingface.co/cointegrated/rubert-tiny2).
+* Training was conducted in two stages:
+  1. In the first stage, the model was trained on Wikipedia texts (4 million texts) for three epochs.
+     <img src="https://github.com/FractalGPT/ModelEmbedderDistilation/blob/main/DistilSBERT/Train/1_st_en.JPG?raw=true" width=700 />
+  2. In the second stage, training was conducted on Wikipedia, a dialog dataset, and NLI for one epoch.
+     <img src="https://github.com/FractalGPT/ModelEmbedderDistilation/blob/main/DistilSBERT/Train/2_st_en.JPG?raw=true" width=700 />
 
 ## Full Model Architecture
 ```
@@ -49,8 +58,4 @@ SentenceTransformer(
   (1): Pooling({'word_embedding_dimension': 312, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
   (2): Dense({'in_features': 312, 'out_features': 384, 'bias': True, 'activation_function': 'torch.nn.modules.linear.Identity'})
 )
-```
-
-## Citing & Authors
-
-<!--- Describe where people can find more information -->
+```
````
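The model architecture in the diff (mean Pooling over 312-dimensional token embeddings, then a Dense 312 -> 384 layer with bias and an identity activation) can be sketched in plain NumPy. This is an illustrative sketch only: the weights below are random stand-ins, not the model's actual parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Token embeddings for one sentence: (num_tokens, 312), as produced by the transformer.
token_embeddings = rng.normal(size=(7, 312))
attention_mask = np.ones(7)  # 1 for real tokens, 0 for padding

# (1) Pooling: mean over non-padding tokens (pooling_mode_mean_tokens=True).
pooled = (token_embeddings * attention_mask[:, None]).sum(axis=0) / attention_mask.sum()

# (2) Dense: 312 -> 384 with bias; identity activation means no nonlinearity.
W = rng.normal(size=(384, 312))
b = np.zeros(384)
sentence_embedding = W @ pooled + b

print(sentence_embedding.shape)  # (384,)
```

With an all-ones mask the pooling step reduces to a plain mean over tokens; padding tokens would simply drop out of both the sum and the divisor.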
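For the clustering and semantic-search uses the card mentions, sentence embeddings are typically compared with cosine similarity. A minimal version, with toy vectors standing in for real model outputs:

```python
import numpy as np

def cos_sim(a, b):
    """Cosine similarity between two embedding vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = [1.0, 0.0, 1.0]
docs = {"doc_a": [1.0, 0.0, 1.0], "doc_b": [0.0, 1.0, 0.0]}

# Rank documents by similarity to the query and keep the best match.
best = max(docs, key=lambda name: cos_sim(query, docs[name]))
print(best)  # doc_a: same direction as the query, similarity ~1.0
```

In practice the vectors would come from `model.encode(...)`, as in the card's usage snippet.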
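The `mse` entry in the new metadata reflects the distillation objective: the student's sentence embeddings are pushed to match the teacher's under mean squared error. A generic sketch of that loss (not the project's actual training code):

```python
import numpy as np

def mse(student_emb, teacher_emb):
    """Mean squared error between student and teacher sentence embeddings."""
    diff = np.asarray(student_emb, dtype=float) - np.asarray(teacher_emb, dtype=float)
    return float((diff ** 2).mean())

teacher = [0.5, -1.0, 0.25, 0.0]

print(mse([0.5, -1.0, 0.25, 0.0], teacher))  # 0.0: perfect match
print(mse([1.5, -1.0, 0.25, 0.0], teacher))  # 0.25: one coordinate off by 1.0, averaged over 4
```

During training this scalar would be minimized over batches of (student, teacher) embedding pairs, driving the small student toward the larger teacher's embedding space.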