cnmoro/static-nomic-384-pten-v2

Tags: Feature Extraction, sentence-transformers, static-embeddings


This Model2Vec model was created with Tokenlearn, using nomic-embed-text-v2-moe as the base model.

The output dimension is 384.

The evaluation results in this model card were obtained with this distilled model, not with the original base model.

The model was trained in streaming mode over large precomputed feature shards, using incremental PCA to reduce the features to 384 dimensions, vocabulary quantization capped at 32k effective tokens, and fine-tuning optimizations for large-scale data.
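To illustrate what fitting a PCA over shards can look like, here is a minimal sketch. It is not the training code used for this model: the class name `StreamingPCA`, the shard sizes, and the toy dimensions are all assumptions, and it accumulates a covariance matrix shard by shard rather than using the incremental SVD update found in, for example, scikit-learn's IncrementalPCA. The point is only that the full feature matrix never has to be in memory at once.

```python
import numpy as np


class StreamingPCA:
    """Hypothetical helper: PCA fitted from shards via accumulated moments."""

    def __init__(self, n_features: int, n_components: int):
        self.n_components = n_components
        self.count = 0
        self.total = np.zeros(n_features)                # running sum of rows
        self.outer = np.zeros((n_features, n_features))  # running sum of x x^T
        self.mean = None
        self.components = None

    def partial_fit(self, shard: np.ndarray) -> None:
        """Fold one shard of shape (rows, n_features) into the running sums."""
        shard = np.asarray(shard, dtype=np.float64)
        self.count += shard.shape[0]
        self.total += shard.sum(axis=0)
        self.outer += shard.T @ shard

    def finalize(self) -> None:
        """Build the covariance matrix and keep its top eigenvectors."""
        self.mean = self.total / self.count
        cov = self.outer / self.count - np.outer(self.mean, self.mean)
        eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
        top = np.argsort(eigvals)[::-1][: self.n_components]
        self.components = eigvecs[:, top].T     # shape (n_components, n_features)

    def transform(self, x: np.ndarray) -> np.ndarray:
        """Center the rows of x and project them onto the kept components."""
        return (np.asarray(x, dtype=np.float64) - self.mean) @ self.components.T


def demo() -> np.ndarray:
    """Fit on five random shards (toy sizes) and project three new rows."""
    rng = np.random.default_rng(0)
    pca = StreamingPCA(n_features=16, n_components=4)
    for _ in range(5):
        pca.partial_fit(rng.normal(size=(200, 16)))
    pca.finalize()
    return pca.transform(rng.normal(size=(3, 16)))
```

In the setting described above, `n_components` would be 384 and each shard would hold precomputed features from the base model; the toy sizes here only keep the sketch fast to run.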

This model improves on the earlier cnmoro/static-nomic-384-pten.

Usage

Load this model using the model2vec library:

from model2vec import StaticModel

model = StaticModel.from_pretrained("cnmoro/static-nomic-384-pten-v2")

# Compute text embeddings
embeddings = model.encode(["Example sentence"])

Or use the sentence-transformers library:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("cnmoro/static-nomic-384-pten-v2")

# Compute text embeddings
embeddings = model.encode(["Example sentence"])
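Embeddings like these are usually compared with cosine similarity. The helper below is a small dependency-free sketch of that comparison; the function name `cosine_similarity` and the example sentences in the comments are illustrative assumptions, not part of either library.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# With a model loaded as shown above (this downloads the weights):
# emb = model.encode(["O gato dorme no sofá.", "Um gato está dormindo no sofá."])
# print(cosine_similarity(emb[0], emb[1]))
```

Values close to 1 indicate that the two texts are embedded in nearly the same direction; values near 0 indicate unrelated directions.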

Downloads last month: 98

Safetensors
Model size: 12.8M params
Tensor type: I64 · F32

Model tree for cnmoro/static-nomic-384-pten-v2

Base model: FacebookAI/xlm-roberta-base
Finetuned: nomic-ai/nomic-xlm-2048
Finetuned: nomic-ai/nomic-embed-text-v2-moe-unsupervised
Finetuned: nomic-ai/nomic-embed-text-v2-moe
Finetuned (22): this model

Dataset used to train cnmoro/static-nomic-384-pten-v2

Evaluation results

MTEB Assin2STS (default), test set, self-reported:

Metric               Score
pearson              67.072
spearman             61.356
cosine_pearson       67.072
cosine_spearman      61.356
manhattan_pearson    64.137
manhattan_spearman   61.426
euclidean_pearson    64.335
euclidean_spearman   61.356
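For readers unfamiliar with these metrics, the sketch below shows how Pearson and Spearman correlations between predicted similarities and gold scores can be computed. It is a simplified illustration, not MTEB's implementation; the function names are assumptions, and the reported numbers above appear to be correlations scaled to a 0-100 range.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))
```

In an STS setting, `x` would hold the model's similarity for each sentence pair (cosine similarity, or a negated Manhattan or Euclidean distance) and `y` the human-annotated scores.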