electroglyph committed on
Commit 68154e2 · verified · 1 Parent(s): f9e0a39

Update README.md

Files changed (1): README.md (+3 −3)
README.md CHANGED
@@ -20,15 +20,15 @@ This model was finetuned with [Unsloth](https://github.com/unslothai/unsloth).
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 based on Alibaba-NLP/gte-modernbert-base
 
-This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [Alibaba-NLP/gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base) on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [Alibaba-NLP/gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
 This model is finetuned specifically for fiction retrieval. It's been trained on sci-fi, fantasy, mystery, and other fiction genres.
 
 Dataset size: 800k rows based on 100% manually cleaned data.
 
-This model surpasses the Qwen3 4B embedding model on my test set (40k examples with hard negatives) by 0.5%.
+This model surpasses the Qwen3 4B embedding model on my test split benchmark (40k examples with hard negatives) by 0.5%.
 
-Model accuracy increased from 90.8% to 95.7% on the test set.
+Model accuracy increased from 90.8% to 95.7% on the test split.
 
 Some MTEB benchmarks saw some pretty big losses; they're detailed below.
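The README describes mapping sentences to a 768-dimensional dense vector space for semantic search. A minimal sketch of the retrieval step over such embeddings — using random vectors as stand-ins, since in practice the vectors would come from the model's `encode` call in the sentence-transformers library:

```python
import numpy as np

# Toy stand-ins for 768-dimensional sentence embeddings.
# Real embeddings would come from SentenceTransformer.encode;
# random vectors are used here purely for illustration.
rng = np.random.default_rng(0)
query = rng.normal(size=768)        # embedding of the search query
docs = rng.normal(size=(3, 768))    # embeddings of three candidate passages

def cosine_sim(q, d):
    # Cosine similarity: dot product of L2-normalized vectors.
    q = q / np.linalg.norm(q)
    d = d / np.linalg.norm(d, axis=-1, keepdims=True)
    return d @ q

scores = cosine_sim(query, docs)    # one similarity score per document
best = int(np.argmax(scores))       # index of the most similar document
```

Ranking documents by these cosine scores is the standard way dense retrieval is done with sentence-transformers models like this one.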