sdadas committed
Commit c2b7542 · verified · 1 Parent(s): 1682e4c

Update README.md

Files changed (1): README.md +1 -1
README.md CHANGED
@@ -19,7 +19,7 @@ widget:
  <h1 align="center">Stella-PL-retrieval-mini-8k</h1>
 
  This is an embedding model based on [stella_en_400M_v5](https://huggingface.co/NovaSearch/stella_en_400M_v5) and further fine-tuned for retrieval tasks in Polish. It transforms texts into 1024-dimensional vectors. The model training consisted of two stages:
- - In the first stage, we adapted the model to support the Polish language using the [multilingual knowledge distillation method](https://aclanthology.org/2020.emnlp-main.365/) method, leveraging a diverse corpus of 20 million Polish-English text pairs.
+ - In the first stage, we adapted the model to support the Polish language using the [multilingual knowledge distillation](https://aclanthology.org/2020.emnlp-main.365/) method, leveraging a diverse corpus of 20 million Polish-English text pairs.
  - The original Stella model and the output of the first stage were limited to a short context of 512 tokens. In the second stage, we extended the context to 8192 tokens and then fine-tuned the model using contrastive loss on a dataset comprising 1.5 million queries. Positive and negative passages for each query were selected with the help of the [BAAI/bge-reranker-v2.5-gemma2-lightweight](https://huggingface.co/BAAI/bge-reranker-v2.5-gemma2-lightweight) reranker. The model was trained for five epochs with a batch size of 1024 queries.
 
  <table>
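
The README text above only implies how the model is used (texts in, 1024-dimensional vectors out). A minimal retrieval sketch follows, assuming the model is published as `sdadas/stella-pl-retrieval-mini-8k` (repo id inferred from the commit author and page title, not confirmed by this commit) and exposes the standard sentence-transformers interface:

```python
# Minimal sketch: embed a Polish query and candidate passages, rank by cosine similarity.
# Assumptions (not confirmed by this commit): the repo id below, and that the model
# loads through the standard sentence-transformers interface; Stella-derived models
# typically require trust_remote_code=True.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sdadas/stella-pl-retrieval-mini-8k", trust_remote_code=True)

query = "Jaka jest najdłuższa rzeka w Polsce?"  # "What is the longest river in Poland?"
passages = [
    "Wisła jest najdłuższą rzeką w Polsce.",    # "The Vistula is the longest river in Poland."
    "Morskie Oko to jezioro w Tatrach.",        # "Morskie Oko is a lake in the Tatra Mountains."
]

# Each text is mapped to a 1024-dimensional vector, per the README description.
query_emb = model.encode(query, normalize_embeddings=True)
passage_embs = model.encode(passages, normalize_embeddings=True)

# With L2-normalized vectors, cosine similarity ranks passages by relevance.
scores = util.cos_sim(query_emb, passage_embs)
print(scores)  # higher score = more relevant passage
```

Note that Stella-derived models often expect an instruction prefix on queries (e.g. an `s2p_query`-style prompt); check the model card before relying on raw `encode` calls as above.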