agraharr/telecom-snowflake-arctic-embed-s

Task: Domain-adapted Sentence Embeddings — Telecom Retrieval, QA Similarity, Semantic Search


Model Overview

This model is a telecom-domain sentence embedding model fine-tuned from Snowflake/snowflake-arctic-embed-s on telecom query/passage pairs and evaluated on hard-negative telecom triplets.

It is intended for telecom-focused retrieval and semantic matching tasks such as:

  • Telecom RAG retrieval
  • Telecom question-answer retrieval
  • KPI / 3GPP / ORAN / radio-network concept search
  • Semantic similarity over telecom text
  • Clustering and deduplication of telecom questions, procedures, and knowledge snippets
  • Candidate retrieval before reranking in telecom assistant pipelines

Base model summary:

  • Base: Snowflake/snowflake-arctic-embed-s
  • Architecture: Sentence Transformer / bi-encoder embedding model
  • Embedding dimension: 384
  • Base model size: ~33M parameters
  • Similarity function: Cosine similarity
  • Primary language: English
  • Domain: Telecom / 5G / ORAN / 3GPP / network operations

The base snowflake-arctic-embed-s model is a compact retrieval-oriented embedding model based on intfloat/e5-small-unsupervised, with 33M parameters and 384-dimensional embeddings. The telecom fine-tuning adapts this model toward domain-specific telecom retrieval while preserving general retrieval behavior through general-domain replay data.


Intended Use

Use this model when you need compact, fast telecom-domain embeddings for:

  • Dense retrieval in RAG
  • Query-to-document search
  • Query-to-answer matching
  • Telecom FAQ / standards retrieval
  • Skill/tool retrieval for telecom agents
  • Similarity search over domain documents
  • Vector search in FAISS, ChromaDB, pgvector, OpenSearch, Vespa, or similar systems

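This model is especially useful when you want a small 384-dimensional embedding model whose vector size matches sentence-transformers/all-MiniLM-L6-v2, so an existing index schema can be reused. The two embedding spaces are not compatible, so the corpus must be re-embedded when switching models.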


Not Intended For

This model is not a generative language model. It does not generate answers directly.

It should not be used as the only component for:

  • Factual answer generation
  • Legal, safety-critical, or regulatory decisions
  • Final response generation without a downstream LLM or verifier
  • Non-English multilingual retrieval without separate evaluation

For best results in RAG, use it for first-stage retrieval, optionally rerank the candidates, and then pass the retrieved context to an LLM.


How to Use

Installation

pip install -U sentence-transformers

Direct Usage with Sentence Transformers

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("agraharr/telecom-snowflake-arctic-embed-s")

sentences = [
    "What is handover success rate in LTE?",
    "Handover success rate measures successful handovers divided by attempted handovers.",
    "CPU utilization measures how much processing capacity is being used."
]

embeddings = model.encode(sentences, normalize_embeddings=True)
print(embeddings.shape)  # (3, 384)

similarities = model.similarity(embeddings, embeddings)
print(similarities)

Recommended Retrieval Format

During fine-tuning, telecom examples were formatted in query/passage style. For retrieval, use the same convention where possible:

queries = [
    "query: What is the role of AMF in 5G?",
    "query: Explain handover failure causes"
]

passages = [
    "passage: The AMF handles access and mobility management functions in the 5G core.",
    "passage: Handover failures can be caused by radio conditions, neighbor relation issues, PCI conflicts, or parameter misconfiguration."
]

query_embeddings = model.encode(queries, normalize_embeddings=True)
passage_embeddings = model.encode(passages, normalize_embeddings=True)

If your application already indexes raw text without prefixes, evaluate both variants. For new retrieval systems, the query: / passage: format is recommended.
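
When raw and prefixed text coexist in one pipeline, it is easy to apply a prefix twice or not at all. The following is a minimal sketch of helper functions that apply the query: / passage: convention exactly once. The names with_prefix, format_queries, and format_passages are hypothetical and not part of sentence-transformers or this model's repository.

```python
QUERY_PREFIX = "query: "      # convention described in this section
PASSAGE_PREFIX = "passage: "  # convention described in this section


def with_prefix(text: str, prefix: str) -> str:
    """Return the stripped text with the prefix applied exactly once."""
    text = text.strip()
    if text.startswith(prefix):
        return text
    return prefix + text


def format_queries(texts):
    """Apply the query prefix to every string in the list."""
    return [with_prefix(t, QUERY_PREFIX) for t in texts]


def format_passages(texts):
    """Apply the passage prefix to every string in the list."""
    return [with_prefix(t, PASSAGE_PREFIX) for t in texts]


queries = format_queries([
    "What is the role of AMF in 5G?",
    "query: Explain handover failure causes",  # already prefixed, left unchanged
])
passages = format_passages([
    "The AMF handles access and mobility management functions in the 5G core.",
])
print(queries)
print(passages)
```

The formatted lists can then be passed to model.encode in place of hand-written prefixed strings.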


Training Details

Base Model

  • Base model: Snowflake/snowflake-arctic-embed-s
  • Base embedding dimension: 384
  • Base model size: ~33M parameters
  • Base model family: Arctic Embed / E5-small style retrieval model

Training Objective

The model was fine-tuned using contrastive retrieval training with query-positive pairs:

  • Primary loss: CachedMultipleNegativesRankingLoss
  • Optional nested loss: MatryoshkaLoss if enabled during training
  • Training format: (query, positive_passage)
  • Negative strategy: in-batch negatives from other positives in the batch
  • Input formatting: query: ... and passage: ...

Example training pair:

query: What is handover success rate?
passage: Handover success rate is calculated as the ratio of successful handovers to attempted handovers.
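
To make the in-batch negative strategy concrete, the sketch below computes the loss that this family of objectives optimizes: for each query, a cross-entropy over scaled cosine similarities to every passage in the batch, with its own passage as the correct class. It is a plain-Python illustration on toy vectors, not the training code used for this model. The scale value of 20.0 is an assumption chosen for illustration, and the cached variant named above is understood to compute the same quantity in memory-efficient chunks.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_negatives_loss(query_vecs, passage_vecs, scale=20.0):
    """Mean cross-entropy where passage i is the positive for query i
    and every other passage in the batch acts as a negative."""
    losses = []
    for i, q in enumerate(query_vecs):
        logits = [scale * cosine(q, p) for p in passage_vecs]
        # Numerically stable log-sum-exp over this query's row of logits.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy 2-d vectors standing in for real embeddings (illustrative only).
toy_queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = in_batch_negatives_loss(toy_queries, [[1.0, 0.1], [0.1, 1.0]])
swapped = in_batch_negatives_loss(toy_queries, [[0.1, 1.0], [1.0, 0.1]])
print(aligned < swapped)
```

Because every other positive in the batch serves as a negative, larger batches give each query more negatives per step, which is one reason the batch size range in the hyperparameters is wide.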

Domain Data

The telecom training mix can include:

  • Telco-DPR query-document relevance pairs
  • 3GPP / 5G NR QA pairs
  • ORANBench question-answer pairs
  • GSMA TeleQnA question-answer pairs
  • Curated telecom sentence pairs / triplets

General-Domain Replay

To reduce catastrophic forgetting, the training pipeline can mix telecom-domain data with a general-domain retrieval replay set, such as MS-MARCO query-positive pairs.

Typical ratio:

  • 80% telecom-domain retrieval data
  • 20% general-domain retrieval data

This helps the model adapt to telecom terminology while preserving broader semantic retrieval behavior.
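
One way to realize the ratio above is to sample enough general-domain pairs that they make up the target share of the combined set. The sketch below assumes both datasets are in-memory lists of (query, passage) tuples; mix_with_replay is a hypothetical helper, and the placeholder pairs are not real training data.

```python
import random


def mix_with_replay(domain_pairs, general_pairs, replay_ratio=0.20, seed=0):
    """Build a shuffled training list in which roughly `replay_ratio`
    of the examples come from the general-domain replay set."""
    if not 0.0 <= replay_ratio < 1.0:
        raise ValueError("replay_ratio must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_general / (n_domain + n_general) = replay_ratio for n_general.
    n_general = round(len(domain_pairs) * replay_ratio / (1.0 - replay_ratio))
    # Never ask for more replay examples than are available.
    n_general = min(n_general, len(general_pairs))
    mixed = list(domain_pairs) + rng.sample(list(general_pairs), n_general)
    rng.shuffle(mixed)
    return mixed


# Placeholder pairs (hypothetical data, for illustration only).
telecom = [(f"query: telecom q{i}", f"passage: telecom p{i}") for i in range(80)]
general = [(f"query: general q{i}", f"passage: general p{i}") for i in range(500)]
mixed = mix_with_replay(telecom, general)
print(len(mixed))
```

Shuffling after concatenation spreads replay examples across batches, so each batch mixes telecom and general-domain negatives.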

Hyperparameters

The exact values depend on the run. A typical training configuration is:

| Hyperparameter | Value |
| --- | --- |
| Epochs | 2 |
| Batch size | 32–512 depending on GPU |
| Learning rate | 5e-6 to 1e-5 |
| Max sequence length | 512 |
| Warmup ratio | 0.10 |
| Weight decay | 0.01 |
| AMP / mixed precision | Enabled when supported |
| Frozen lower layers | Optional, commonly 0–2 layers |
| General replay ratio | 0.20 |
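
The warmup ratio only becomes a concrete number of steps once the dataset size and batch size are fixed. The sketch below shows the arithmetic under one common convention (round the step count up). The dataset size of 100,000 pairs and the batch size of 256 are hypothetical values chosen for illustration, not figures from the actual training run.

```python
import math


def schedule_steps(num_examples, batch_size, epochs=2, warmup_ratio=0.10):
    """Return (total optimizer steps, warmup steps) for a run without
    gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return total_steps, warmup_steps


# Hypothetical dataset size and batch size.
total, warmup = schedule_steps(num_examples=100_000, batch_size=256)
print(total, warmup)
```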

Evaluation

This model should be evaluated on both telecom-domain and general-domain retrieval tasks.

Recommended telecom metrics:

  • Recall@1
  • Recall@3
  • Recall@5
  • MRR@10
  • nDCG@10

Recommended evaluation sets:

  • Held-out Telco-DPR relevance mappings
  • Held-out telecom QA pairs
  • Hard-negative telecom triplets
  • General-domain retrieval validation set to monitor forgetting
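
For reference, the recommended metrics can be computed per query as in the following sketch and then averaged over all queries. It assumes binary relevance labels and a ranked list of document IDs per query. The function names and the toy IDs are illustrative and are not taken from the training pipeline.

```python
import math


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k."""
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


def mrr_at_k(ranked_ids, relevant_ids, k=10):
    """Reciprocal rank of the first relevant document in the top k, else 0."""
    relevant = set(relevant_ids)
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """nDCG with binary relevance: DCG of the ranking over the ideal DCG."""
    relevant = set(relevant_ids)
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant
    )
    ideal = sum(
        1.0 / math.log2(rank + 1)
        for rank in range(1, min(len(relevant), k) + 1)
    )
    return dcg / ideal if ideal else 0.0


# Toy ranking for one query: the single relevant document is at rank 2.
ranked = ["d7", "d2", "d9", "d4"]
relevant = ["d2"]
print(recall_at_k(ranked, relevant, 1), recall_at_k(ranked, relevant, 3))
print(mrr_at_k(ranked, relevant), ndcg_at_k(ranked, relevant))
```

With a single relevant passage per query, Recall@k reduces to a hit rate, which is the usual setting for QA-style telecom pairs.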

Example Evaluation Table

Replace the values below with metrics from your training_report.json.

| Model | Dimension | Telecom Recall@1 | Telecom Recall@3 | MRR@10 | nDCG@10 | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| sentence-transformers/all-MiniLM-L6-v2 | 384 | TODO | TODO | TODO | TODO | Generic baseline |
| Snowflake/snowflake-arctic-embed-s | 384 | TODO | TODO | TODO | TODO | Base model |
| agraharr/telecom-snowflake-arctic-embed-s | 384 | TODO | TODO | TODO | TODO | Telecom fine-tuned |
| agraharr/telecom-gte-modernbert-matryoshka | 768 / truncated | TODO | TODO | TODO | TODO | Larger comparison model |

Matryoshka / Truncated Embeddings

If this model was trained with MatryoshkaLoss, it can be evaluated at smaller embedding dimensions, for example:

  • 384
  • 256
  • 128
  • 64

Example:

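import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("agraharr/telecom-snowflake-arctic-embed-s")

# Full 384d embedding
emb_384 = model.encode(["query: What is gNodeB?"], normalize_embeddings=True)

# If Matryoshka training was used, evaluate truncation in your retrieval stack.
# A truncated vector is no longer unit length, so re-normalize it before
# cosine or dot-product search.
emb_128 = emb_384[:, :128]
emb_128 = emb_128 / np.linalg.norm(emb_128, axis=1, keepdims=True)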

Note: truncation should be used only after validating retrieval quality at the target dimension.
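
One simple validation signal is how often the top-ranked passage stays the same after truncation. The sketch below measures that top-1 agreement on toy 4-dimensional vectors in plain Python. In practice the inputs would be real query and passage embeddings, and the retrieval metrics recommended in the Evaluation section remain the stronger check. All function names here are hypothetical.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else list(head)


def top1(query, docs):
    """Index of the document with the highest dot product."""
    scores = [sum(q * d for q, d in zip(query, doc)) for doc in docs]
    return max(range(len(docs)), key=scores.__getitem__)


def top1_agreement(queries, docs, dim):
    """Share of queries whose top hit is unchanged after truncation."""
    docs_short = [truncate_and_normalize(d, dim) for d in docs]
    same = 0
    for q in queries:
        full_hit = top1(q, docs)
        short_hit = top1(truncate_and_normalize(q, dim), docs_short)
        same += full_hit == short_hit
    return same / len(queries)


# Toy 4-d vectors standing in for real embeddings, normalized at full size.
docs = [truncate_and_normalize(d, 4) for d in ([0.9, 0.1, 0.3, 0.3], [0.1, 0.9, 0.3, 0.3])]
queries = [truncate_and_normalize(q, 4) for q in ([1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.2])]
agreement = top1_agreement(queries, docs, dim=2)
print(agreement)
```

A low agreement at a given dimension suggests that truncation changes rankings enough to warrant a full metric comparison before deployment.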


Limitations

  • The model is specialized for telecom-domain semantic retrieval and may not improve every general-domain task.
  • It should be evaluated on your own corpus before production deployment.
  • It may encode telecom-specific terminology better than generic embeddings, but retrieval results still require downstream ranking, filtering, or answer verification for high-stakes use cases.
  • If trained primarily on QA-style pairs, it may perform best on query-answer retrieval and less well on long-document retrieval unless it is trained and evaluated for that use case.
  • The model is English-focused unless additional multilingual data was used and evaluated.

Bias, Risks, and Responsible Use

This model inherits limitations from both the base embedding model and the telecom datasets used for fine-tuning. Retrieved passages may be incomplete, outdated, or semantically similar but operationally incorrect.

For telecom operations, recommendations, root-cause analysis, or configuration changes, retrieved content should be validated against authoritative documentation, engineering procedures, or human expert review.


Example RAG Usage

from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("agraharr/telecom-snowflake-arctic-embed-s")

query = "query: What can cause handover failure in LTE?"
docs = [
    "passage: Handover failure can be caused by poor radio conditions, missing neighbor relations, PCI conflicts, or incorrect mobility parameters.",
    "passage: CPU utilization measures how much processing capacity is used on a server.",
    "passage: The AMF handles registration and mobility management in the 5G core."
]

q_emb = model.encode([query], normalize_embeddings=True)
d_emb = model.encode(docs, normalize_embeddings=True)

scores = np.dot(q_emb, d_emb.T)[0]
ranked = sorted(zip(scores, docs), reverse=True)

for score, doc in ranked:
    print(round(float(score), 4), doc)
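
After ranking, the top passages are typically assembled into a context string for the downstream LLM. The sketch below is one possible way to do that, assuming a list of (score, passage) tuples shaped like ranked in the example above. The build_context helper and the scores in example_ranked are hypothetical and are not outputs of this model.

```python
def build_context(ranked, top_k=2, prefix="passage: "):
    """Join the top_k passages into one numbered context string, dropping
    the retrieval prefix so the downstream LLM sees plain text."""
    lines = []
    for position, (score, doc) in enumerate(ranked[:top_k], start=1):
        text = doc[len(prefix):] if doc.startswith(prefix) else doc
        lines.append(f"[{position}] {text}")
    return "\n".join(lines)


# Hypothetical (score, passage) tuples, already sorted by descending score.
example_ranked = [
    (0.83, "passage: Handover failure can be caused by poor radio conditions."),
    (0.41, "passage: The AMF handles registration and mobility management in the 5G core."),
    (0.12, "passage: CPU utilization measures how much processing capacity is used on a server."),
]
context = build_context(example_ranked)
print(context)
```

Numbering the passages lets the LLM or a verifier cite which retrieved snippet supports each statement.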

Citation

If you use this model, please cite the base model and relevant embedding-training work:

@misc{merrick2024arcticembed,
  title={Arctic-Embed: Scalable, Efficient, and Accurate Text Embedding Models},
  author={Luke Merrick and Danmei Xu and Gaurav Nuti and Daniel Campos},
  year={2024},
  eprint={2405.05374},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
@inproceedings{kusupati2022matryoshka,
  title={Matryoshka Representation Learning},
  author={Kusupati, Aditya and Bhatt, Gantavya and Rege, Aniket and Wallingford, Matthew and Sinha, Aditya and Ramanujan, Vivek and Howard-Snyder, Will and Chen, Kaifeng and Kakade, Sham and Jain, Prateek and Farhadi, Ali},
  booktitle={Advances in Neural Information Processing Systems},
  year={2022}
}

Author & Contact
