Model Card: Rocky-Embed

Model Description

rocky-embed is a lightweight, custom Transformer-based text embedding model. It was trained via knowledge distillation, using the precomputed teacher embeddings in the CohereLabs/wikipedia-2023-11-embed-multilingual-v3-int8-binary dataset as the distillation target. The model maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for tasks such as clustering or semantic search.

Architecture Highlights:

  • Custom Transformer Blocks: Uses RMSNorm for layer normalization and GELU activations.
  • Positional Embeddings: Implements Rotary Positional Embeddings (RoPE).
  • Attention: Uses QK Normalization with a learnable temperature parameter.
  • Parameters:
    • Dimensions: 768
    • Depth: 12 layers
    • Heads: 12
    • Projection Dimension: 1024 (matching the teacher model)
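The two less common choices above, RMSNorm and QK normalization, can be sketched in a few lines of PyTorch. This is an illustrative sketch, not the model's actual implementation (the real layer shapes, temperature parameterization, and RoPE application live in the custom RockyForEmbeddings code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """RMS layer norm: rescales by the root-mean-square of the features,
    with a learnable gain but no mean-centering (unlike LayerNorm)."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return x * rms * self.weight

def qk_norm_attention(q, k, v, temperature):
    """QK normalization: L2-normalize queries and keys, then scale the
    dot products by a learnable temperature instead of 1/sqrt(head_dim)."""
    q = F.normalize(q, dim=-1)
    k = F.normalize(k, dim=-1)
    scores = (q @ k.transpose(-2, -1)) * temperature  # cosine logits
    return scores.softmax(dim=-1) @ v
```

Because the normalized dot products are bounded in [-1, 1], the learnable temperature controls how sharp the attention distribution can become.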

Training Details

  • Dataset: Trained on English Wikipedia snippets.
  • Objective: Direct Mean Squared Error (MSE) distillation from the normalized embeddings of the teacher model.
  • Optimizer: AdamW with linear learning rate decay and warmup.

Evaluation Results (STSb)

  • Spearman Correlation: 0.5453
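For context, the STSb metric is the Spearman rank correlation between the model's cosine similarities on sentence pairs and the human gold scores. A sketch of the computation, with placeholder values standing in for the real model outputs and annotations:

```python
from scipy.stats import spearmanr

# Placeholder values: in a real evaluation, `model_sims` would be cosine
# similarities produced by the model for STSb sentence pairs, and
# `gold_scores` the corresponding human similarity annotations (0-5).
model_sims = [0.9, 0.2, 0.6, 0.4]
gold_scores = [4.8, 1.0, 3.5, 2.2]

rho, _ = spearmanr(model_sims, gold_scores)
print(f"Spearman correlation: {rho:.4f}")
```

Spearman correlation depends only on rank order, so it is insensitive to any monotonic rescaling of the similarity scores.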

How to Use

You can load this model directly from the Hugging Face Hub using the transformers library. Since this model uses a custom architecture (RockyForEmbeddings), you must pass trust_remote_code=True when loading it.

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

# 1. Load the tokenizer and model
model_id = "pranavupadhyaya52/rocky-embed"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Important: Set trust_remote_code=True to use the custom Rocky architecture
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

model.eval()

# 2. Prepare your input texts
queries = [
    "What is the capital of France?",
    "Paris is the capital of France.",
    "A completely unrelated sentence about dogs."
]

# 3. Tokenize
inputs = tokenizer(
    queries,
    padding="max_length",
    truncation=True,
    max_length=64,
    return_tensors="pt"
)

# 4. Generate Embeddings
with torch.no_grad():
    # The model outputs the normalized pooled embeddings directly
    embeddings = model(inputs["input_ids"], inputs["attention_mask"])

print("Embeddings shape:", embeddings.shape)

# 5. Compute cosine similarities
query_emb = embeddings[0].unsqueeze(0)
option_embs = embeddings[1:]
similarities = F.cosine_similarity(query_emb, option_embs)

print(f"\nSimilarity with '{queries[1]}': {similarities[0]:.4f}")
print(f"Similarity with '{queries[2]}': {similarities[1]:.4f}")
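Since the model returns L2-normalized embeddings, ranking a corpus against a query reduces to a single matrix-vector product. A hypothetical retrieval helper building on the embeddings produced above:

```python
import torch

def top_k(query_emb: torch.Tensor, corpus_embs: torch.Tensor, k: int = 2):
    """Return indices and cosine similarities of the k nearest corpus
    embeddings. Assumes all embeddings are already L2-normalized, so the
    dot product equals cosine similarity."""
    sims = corpus_embs @ query_emb  # shape (N,)
    vals, idx = sims.topk(k)
    return idx.tolist(), vals.tolist()
```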
Model Details

  • Size: 90.9M parameters
  • Tensor type: F32 (Safetensors)