Instructions to use jxm/cde-small-v2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers

How to use jxm/cde-small-v2 with sentence-transformers:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jxm/cde-small-v2", trust_remote_code=True)

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium."
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

- Transformers

How to use jxm/cde-small-v2 with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="jxm/cde-small-v2", trust_remote_code=True)
```

```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("jxm/cde-small-v2", trust_remote_code=True, dtype="auto")
```

- Notebooks
- Google Colab
- Kaggle
Clean up README slightly
#7
by tomaarsen HF Staff - opened
README.md
CHANGED
```diff
@@ -8650,13 +8650,10 @@ model-index:
 
 # cde-small-v2
 
-
-
-</div>
+> [!NOTE]
+> **Note on parameter count:** Although HuggingFace reports the size of this model as 281M params, really it can be thought of as 140M. That's because our weights actually contain the weights of two models (dubbed "first stage" and "second stage"), and only the second-stage model is used to compute embeddings at search time.
 
-
-
-<a href="github.com/jxmorris12/cde">Github</a>
+<a href="https://github.com/jxmorris12/cde">Github</a>
 
 Our new model that naturally integrates "context tokens" into the embedding process. As of January 13th, 2025, `cde-small-v2` is the best small model (under 400M params) on the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard) for text embedding models, with an average score of 65.58.
 
```
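The parameter-count note added in this diff says the checkpoint bundles two models, so the reported total is roughly twice the size of the model used at search time. A minimal sketch of how one might check such a split is to group parameter counts by the top-level prefix of each parameter name. The prefixes `first_stage_model` and `second_stage_model` and all sizes below are illustrative assumptions, not values read from the actual checkpoint.

```python
from collections import defaultdict


def params_by_stage(param_sizes):
    """Sum parameter counts grouped by the top-level prefix of each name."""
    totals = defaultdict(int)
    for name, count in param_sizes.items():
        # "first_stage_model.encoder.weight" -> "first_stage_model"
        stage = name.split(".", 1)[0]
        totals[stage] += count
    return dict(totals)


# Hypothetical names and made-up sizes, for illustration only.
example_sizes = {
    "first_stage_model.encoder.weight": 1_000,
    "first_stage_model.encoder.bias": 10,
    "second_stage_model.encoder.weight": 1_000,
    "second_stage_model.encoder.bias": 10,
}

totals = params_by_stage(example_sizes)
print(totals)
```

For a loaded PyTorch model, the input mapping could be built with `{name: p.numel() for name, p in model.named_parameters()}`; whether the top-level names match the prefixes assumed here would need to be checked against the loaded model.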