license: apache-2.0
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - feature-extraction
  - sentence-similarity
  - mteb
  - arctic
  - snowflake-arctic-embed
  - transformers.js

arctic-embed-l-tech_and_fiction

This is a finetuned version of: snowflake-arctic-embed-l

It is finetuned on a dataset of ~110K high-quality synthetic examples, which were manually curated and edited.

The examples are primarily tech oriented, with some terms from fantasy and sci-fi fiction for good measure.
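Since this is a sentence-similarity model tagged for sentence-transformers, a minimal usage sketch might look like the following. Two things here are assumptions rather than facts from this README: the Hub repo id in `MODEL_ID` (not stated above), and the query prefix, which is the one documented for the base snowflake-arctic-embed models and is assumed to carry over to this finetune. The helper names (`load_model`, `embed`, `cosine`, `rank_documents`) are made up for illustration.

```python
import math

# Assumption: the Hub repo id is not given in this README; adjust as needed.
MODEL_ID = "electroglyph/arctic-embed-l-tech_and_fiction"

# Query prefix documented for the base snowflake-arctic-embed models.
# Assumption: this finetune still expects it on queries (not on documents).
QUERY_PREFIX = "Represent this sentence for searching relevant passages: "


def load_model(model_id=MODEL_ID):
    """Load the model with sentence-transformers (imported lazily)."""
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(model_id)


def embed(model, texts, is_query=False):
    """Encode texts into L2-normalized embeddings, prefixing queries."""
    if is_query:
        texts = [QUERY_PREFIX + t for t in texts]
    return model.encode(texts, normalize_embeddings=True)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

To try it, call `model = load_model()`, embed one query with `embed(model, [query], is_query=True)[0]`, embed the candidate documents with `embed(model, docs)`, and pass both to `rank_documents`.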

Training

For this model (Model B below), I trained at rank 128 and alpha 128, with a learning rate of roughly 6e-6 (from memory), for about 10 epochs.
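For reference, here are the same settings written out as a small config sketch. This assumes that "rank" and "alpha" refer to LoRA-style adapter hyperparameters, which this README does not state outright. The class name `AdapterHyperparams` is hypothetical and does not come from any training library. Under the standard LoRA convention the adapter update is scaled by alpha / rank, so these values would give a scale of 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdapterHyperparams:
    """Hypothetical container for the settings described above."""

    rank: int = 128
    alpha: int = 128
    learning_rate: float = 6e-6  # approximate, as noted above
    epochs: int = 10  # approximate, as noted above

    @property
    def scaling(self) -> float:
        # Standard LoRA convention (assumed here): update scaled by alpha / rank.
        return self.alpha / self.rank


params = AdapterHyperparams()
print(f"rank={params.rank} alpha={params.alpha} scaling={params.scaling}")
```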

Benchmark Results

I've included the results for the MTEB benchmark "MTEB(eng, v2)" in the results folder.

Here is a screenshot of the results summary (this is Model B):

[Image: benchmark results summary]

License

arctic-embed-l-tech_and_fiction is licensed under the Apache-2.0 license. The released models can be used for commercial purposes free of charge.

Acknowledgement

Thank you to the Snowflake team for making some excellent models!