sdadas committed
Commit c2b7542 · verified · 1 Parent(s): 1682e4c

Update README.md

Files changed (1): README.md +1 -1
README.md CHANGED
@@ -19,7 +19,7 @@ widget:
  <h1 align="center">Stella-PL-retrieval-mini-8k</h1>
 
  This is an embedding model based on [stella_en_400M_v5](https://huggingface.co/NovaSearch/stella_en_400M_v5) and further fine-tuned for retrieval tasks in Polish. It transforms texts into 1024-dimensional vectors. The model training consisted of two stages:
- - In the first stage, we adapted the model to support the Polish language using the [multilingual knowledge distillation method](https://aclanthology.org/2020.emnlp-main.365/) method, leveraging a diverse corpus of 20 million Polish-English text pairs.
+ - In the first stage, we adapted the model to support the Polish language using the [multilingual knowledge distillation](https://aclanthology.org/2020.emnlp-main.365/) method, leveraging a diverse corpus of 20 million Polish-English text pairs.
  - The original Stella model and the output of the first stage were limited to a short context of 512 tokens. In the second stage, we extended the context to 8192 tokens and then fine-tuned the model using contrastive loss on a dataset comprising 1.5 million queries. Positive and negative passages for each query were selected with the help of the [BAAI/bge-reranker-v2.5-gemma2-lightweight](https://huggingface.co/BAAI/bge-reranker-v2.5-gemma2-lightweight) reranker. The model was trained for five epochs with a batch size of 1024 queries.
 
  <table>
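
The README text above only implies how the model is used (texts in, 1024-dimensional vectors out). A minimal retrieval sketch follows, assuming the model is published as `sdadas/stella-pl-retrieval-mini-8k` (repo id inferred from the commit author and page title, not confirmed by this commit) and exposes the standard sentence-transformers interface:

```python
# Minimal sketch: embed a Polish query and candidate passages, rank by cosine similarity.
# Assumptions (not confirmed by this commit): the repo id below, and that the model
# loads through the standard sentence-transformers interface; Stella-derived models
# typically require trust_remote_code=True.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sdadas/stella-pl-retrieval-mini-8k", trust_remote_code=True)

query = "Jaka jest najdłuższa rzeka w Polsce?"  # "What is the longest river in Poland?"
passages = [
    "Wisła jest najdłuższą rzeką w Polsce.",    # "The Vistula is the longest river in Poland."
    "Morskie Oko to jezioro w Tatrach.",        # "Morskie Oko is a lake in the Tatra Mountains."
]

# Each text is mapped to a 1024-dimensional vector, per the README description.
query_emb = model.encode(query, normalize_embeddings=True)
passage_embs = model.encode(passages, normalize_embeddings=True)

# With L2-normalized vectors, cosine similarity ranks passages by relevance.
scores = util.cos_sim(query_emb, passage_embs)
print(scores)  # higher score = more relevant passage
```

Note that Stella-derived models often expect an instruction prefix on queries (e.g. an `s2p_query`-style prompt); check the model card before relying on raw `encode` calls as above.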