agraharr/telecom-snowflake-arctic-embed-s
Task: Domain-adapted Sentence Embeddings — Telecom Retrieval, QA Similarity, Semantic Search
Model Overview
This model is a domain-specialized telecom sentence embedding model fine-tuned from Snowflake/snowflake-arctic-embed-s on telecom-domain query/passage pairs, with hard-negative triplets used for evaluation.
It is intended for telecom-focused retrieval and semantic matching tasks such as:
- Telecom RAG retrieval
- Telecom question-answer retrieval
- KPI / 3GPP / ORAN / radio-network concept search
- Semantic similarity over telecom text
- Clustering and deduplication of telecom questions, procedures, and knowledge snippets
- Candidate retrieval before reranking in telecom assistant pipelines
Base model summary:
- Base: Snowflake/snowflake-arctic-embed-s
- Architecture: Sentence Transformer / bi-encoder embedding model
- Embedding dimension: 384
- Base model size: ~33M parameters
- Similarity function: Cosine similarity
- Primary language: English
- Domain: Telecom / 5G / ORAN / 3GPP / network operations
The base snowflake-arctic-embed-s model is a compact retrieval-oriented embedding model based on intfloat/e5-small-unsupervised, with 33M parameters and 384-dimensional embeddings. The telecom fine-tuning adapts this model toward domain-specific telecom retrieval while preserving general retrieval behavior through general-domain replay data.
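As a minimal sketch of the cosine similarity function listed in the summary, the NumPy snippet below scores every query row against every passage row. The helper name `cosine_similarity_matrix` and the random 384-dimensional vectors are illustrative assumptions; they stand in for real model outputs.

```python
import numpy as np

def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between rows of a (n, d) and rows of b (m, d)."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T

# Toy 384-dimensional vectors standing in for model outputs (not real embeddings).
rng = np.random.default_rng(0)
queries = rng.normal(size=(2, 384))
passages = rng.normal(size=(3, 384))

sims = cosine_similarity_matrix(queries, passages)
print(sims.shape)
```

When embeddings are already unit-length (for example, encoded with normalize_embeddings=True), the normalization step is a no-op and a plain dot product gives the same scores.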
Intended Use
Use this model when you need compact, fast telecom-domain embeddings for:
- Dense retrieval in RAG
- Query-to-document search
- Query-to-answer matching
- Telecom FAQ / standards retrieval
- Skill/tool retrieval for telecom agents
- Similarity search over domain documents
- Vector search in FAISS, ChromaDB, pgvector, OpenSearch, Vespa, or similar systems
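This model is especially useful when you want a small embedding model whose 384-dimensional output matches sentence-transformers/all-MiniLM-L6-v2, so it fits vector indexes already sized for that dimension. Embeddings from different models are not comparable, so re-embed the corpus when switching models.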
Not Intended For
This model is not a generative language model. It does not generate answers directly.
It should not be used as the only component for:
- Factual answer generation
- Legal, safety-critical, or regulatory decisions
- Final response generation without a downstream LLM or verifier
- Non-English multilingual retrieval without separate evaluation
For best results in RAG, use it for retrieval, then pass retrieved context to an LLM or reranker.
How to Use
Installation
pip install -U sentence-transformers
Direct Usage with Sentence Transformers
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("agraharr/telecom-snowflake-arctic-embed-s")
sentences = [
    "What is handover success rate in LTE?",
    "Handover success rate measures successful handovers divided by attempted handovers.",
    "CPU utilization measures how much processing capacity is being used.",
]
embeddings = model.encode(sentences, normalize_embeddings=True)
print(embeddings.shape) # (3, 384)
similarities = model.similarity(embeddings, embeddings)
print(similarities)
Recommended Retrieval Format
During fine-tuning, telecom examples were formatted in query/passage style. For retrieval, use the same convention where possible:
queries = [
    "query: What is the role of AMF in 5G?",
    "query: Explain handover failure causes",
]
passages = [
    "passage: The AMF handles access and mobility management functions in the 5G core.",
    "passage: Handover failures can be caused by radio conditions, neighbor relation issues, PCI conflicts, or parameter misconfiguration.",
]
query_embeddings = model.encode(queries, normalize_embeddings=True)
passage_embeddings = model.encode(passages, normalize_embeddings=True)
If your application already indexes raw text without prefixes, evaluate both variants. For new retrieval systems, the query: / passage: format is recommended.
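One way to keep the prefix convention consistent across indexing and querying is a pair of small helpers. This is a sketch only: the names `with_prefix`, `format_queries`, and `format_passages` are hypothetical and not part of any library.

```python
QUERY_PREFIX = "query: "
PASSAGE_PREFIX = "passage: "

def with_prefix(text: str, prefix: str) -> str:
    """Prepend the prefix unless the text already starts with it."""
    text = text.strip()
    return text if text.startswith(prefix) else prefix + text

def format_queries(texts):
    """Apply the query prefix to each text."""
    return [with_prefix(t, QUERY_PREFIX) for t in texts]

def format_passages(texts):
    """Apply the passage prefix to each text."""
    return [with_prefix(t, PASSAGE_PREFIX) for t in texts]

formatted = format_queries([
    "What is the role of AMF in 5G?",
    "query: Explain handover failure causes",  # already prefixed, left unchanged
])
print(formatted)
```

The formatted lists can then be passed to model.encode in place of the raw strings.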
Training Details
Base Model
- Base model: Snowflake/snowflake-arctic-embed-s
- Base embedding dimension: 384
- Base model size: ~33M parameters
- Base model family: Arctic Embed / E5-small style retrieval model
Training Objective
The model was fine-tuned using contrastive retrieval training with query-positive pairs:
- Primary loss: CachedMultipleNegativesRankingLoss
- Optional nested loss: MatryoshkaLoss, if enabled during training
- Training format: (query, positive_passage)
- Negative strategy: in-batch negatives from other positives in the batch
- Input formatting: query: ... and passage: ...
Example training pair:
query: What is handover success rate?
passage: Handover success rate is calculated as the ratio of successful handovers to attempted handovers.
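To illustrate the in-batch negatives objective, the NumPy sketch below computes a cross-entropy over the query-passage similarity matrix, where the matching passage on the diagonal is the target and every other passage in the batch acts as a negative. It is an illustration of the idea rather than the library implementation; the function name and the `scale=20.0` temperature are assumptions, and the cached variant's gradient caching is not shown.

```python
import numpy as np

def in_batch_negatives_loss(query_emb, passage_emb, scale=20.0):
    """Mean cross-entropy where passage i is the positive for query i."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    p = passage_emb / np.linalg.norm(passage_emb, axis=1, keepdims=True)
    logits = scale * (q @ p.T)                           # (batch, batch) scaled cosine scores
    logits = logits - logits.max(axis=1, keepdims=True)  # shift for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Toy batch: each query matches exactly one passage.
aligned = np.eye(4)
loss_aligned = in_batch_negatives_loss(aligned, aligned)
# Rolling the passages pairs every query with a wrong "positive".
loss_shuffled = in_batch_negatives_loss(aligned, np.roll(aligned, 1, axis=0))
print(loss_aligned, loss_shuffled)
```

Larger batches supply more negatives per query, which is one reason the batch size in the hyperparameter table ranges so widely.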
Domain Data
The telecom training mix can include:
- Telco-DPR query-document relevance pairs
- 3GPP / 5G NR QA pairs
- ORANBench question-answer pairs
- GSMA TeleQnA question-answer pairs
- Curated telecom sentence pairs / triplets
General-Domain Replay
To reduce catastrophic forgetting, the training pipeline can mix telecom-domain data with a general-domain retrieval replay set, such as MS-MARCO query-positive pairs.
Typical ratio:
- 80% telecom-domain retrieval data
- 20% general-domain retrieval data
This helps the model adapt to telecom terminology while preserving broader semantic retrieval behavior.
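The replay mix can be sketched in plain Python as follows. The function name `mix_with_replay`, the fixed seed, and the synthetic pairs are assumptions for illustration; the actual pipeline may sample differently.

```python
import random

def mix_with_replay(domain_pairs, general_pairs, replay_ratio=0.20, seed=42):
    """Return a shuffled list in which about replay_ratio of examples are general-domain."""
    if not 0.0 <= replay_ratio < 1.0:
        raise ValueError("replay_ratio must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_general / (n_domain + n_general) = replay_ratio for n_general.
    n_general = round(len(domain_pairs) * replay_ratio / (1.0 - replay_ratio))
    n_general = min(n_general, len(general_pairs))
    mixed = list(domain_pairs) + rng.sample(list(general_pairs), n_general)
    rng.shuffle(mixed)
    return mixed

# Synthetic (query, passage) pairs standing in for real datasets.
domain = [(f"query: telecom question {i}", f"passage: telecom answer {i}") for i in range(80)]
general = [(f"query: general question {i}", f"passage: general answer {i}") for i in range(100)]

mixed = mix_with_replay(domain, general)
print(len(mixed))
```

With 80 telecom pairs and a 0.20 ratio, the sketch draws 20 general-domain pairs, giving the 80/20 split described above.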
Hyperparameters
The exact values depend on the run. A typical training configuration is:
| Hyperparameter | Value |
|---|---|
| Epochs | 2 |
| Batch size | 32–512 depending on GPU |
| Learning rate | 5e-6 to 1e-5 |
| Max sequence length | 512 |
| Warmup ratio | 0.10 |
| Weight decay | 0.01 |
| AMP / mixed precision | Enabled when supported |
| Frozen lower layers | Optional, commonly 0–2 layers |
| General replay ratio | 0.20 |
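As a rough worked example of how the epoch count, batch size, and warmup ratio interact, the sketch below derives step counts. The dataset size of 100,000 pairs and the batch size of 256 are hypothetical, and the calculation assumes no gradient accumulation.

```python
import math

def schedule_steps(num_examples, batch_size, epochs=2, warmup_ratio=0.10):
    """Return (total optimizer steps, warmup steps) for a simple schedule."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return total_steps, warmup_steps

# Hypothetical run: 100,000 training pairs at batch size 256.
total, warmup = schedule_steps(num_examples=100_000, batch_size=256)
print(total, warmup)
```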
Evaluation
This model should be evaluated on both telecom-domain and general-domain retrieval tasks.
Recommended telecom metrics:
- Recall@1
- Recall@3
- Recall@5
- MRR@10
- nDCG@10
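The metrics above can be computed per query with a few lines of plain Python. This is a sketch with binary relevance; the function names and the document IDs are illustrative, and per-query values would be averaged over the evaluation set.

```python
import math

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k."""
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

def mrr_at_k(ranked_ids, relevant_ids, k):
    """Reciprocal rank of the first relevant document in the top k, else 0."""
    relevant = set(relevant_ids)
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked_ids, relevant_ids, k):
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    relevant = set(relevant_ids)
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0

# One query whose single relevant document is ranked second.
ranked = ["d3", "d1", "d7"]
relevant = ["d1"]
print(recall_at_k(ranked, relevant, 1), recall_at_k(ranked, relevant, 3))
print(mrr_at_k(ranked, relevant, 10), ndcg_at_k(ranked, relevant, 10))
```

When each query has exactly one relevant passage, Recall@k reduces to a hit rate, which is the common case for QA-style pairs.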
Recommended evaluation sets:
- Held-out Telco-DPR relevance mappings
- Held-out telecom QA pairs
- Hard-negative telecom triplets
- General-domain retrieval validation set to monitor forgetting
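For the hard-negative triplets in the list above, a simple score is the share of triplets in which the query is closer to its positive than to its hard negative. The sketch below assumes embeddings are already computed; the function name and the two-dimensional toy vectors are illustrative only.

```python
import numpy as np

def triplet_accuracy(anchor_emb, positive_emb, negative_emb):
    """Share of triplets where cos(anchor, positive) > cos(anchor, negative)."""
    def unit(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    a, p, n = unit(anchor_emb), unit(positive_emb), unit(negative_emb)
    pos_sim = np.sum(a * p, axis=1)  # row-wise cosine with the positive
    neg_sim = np.sum(a * n, axis=1)  # row-wise cosine with the hard negative
    return float(np.mean(pos_sim > neg_sim))

# Toy 2-d vectors: the first triplet is ranked correctly, the second is not.
anchors = np.array([[1.0, 0.0], [0.0, 1.0]])
positives = np.array([[0.9, 0.1], [0.1, 0.9]])
negatives = np.array([[0.0, 1.0], [0.0, 1.0]])

accuracy = triplet_accuracy(anchors, positives, negatives)
print(accuracy)
```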
Example Evaluation Table
Replace the values below with metrics from your training_report.json.
| Model | Dimension | Telecom Recall@1 | Telecom Recall@3 | MRR@10 | nDCG@10 | Notes |
|---|---|---|---|---|---|---|
| sentence-transformers/all-MiniLM-L6-v2 | 384 | TODO | TODO | TODO | TODO | Generic baseline |
| Snowflake/snowflake-arctic-embed-s | 384 | TODO | TODO | TODO | TODO | Base model |
| agraharr/telecom-snowflake-arctic-embed-s | 384 | TODO | TODO | TODO | TODO | Telecom fine-tuned |
| agraharr/telecom-gte-modernbert-matryoshka | 768 / truncated | TODO | TODO | TODO | TODO | Larger comparison model |
Matryoshka / Truncated Embeddings
If this model was trained with MatryoshkaLoss, it can be evaluated at smaller embedding dimensions, for example:
- 384
- 256
- 128
- 64
Example:
from sentence_transformers import SentenceTransformer
import numpy as np
model = SentenceTransformer("agraharr/telecom-snowflake-arctic-embed-s")
# Full 384d embedding
emb_384 = model.encode(["query: What is gNodeB?"], normalize_embeddings=True)
# If Matryoshka training was used, evaluate truncation in your retrieval stack
emb_128 = emb_384[:, :128]
# Re-normalize after truncation so dot products remain cosine similarities
emb_128 = emb_128 / np.linalg.norm(emb_128, axis=1, keepdims=True)
Note: truncation should be used only after validating retrieval quality at the target dimension.
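A quick first check before adopting a smaller dimension is to measure how often the top-ranked passage changes after truncation. The sketch below uses synthetic vectors in place of model outputs, and the helper names are hypothetical. Agreement with the full-dimension ranking is only a proxy; labeled metrics such as Recall@k at the target dimension remain the stronger test.

```python
import numpy as np

def truncate_and_normalize(emb, dim):
    """Keep the first dim components and rescale each row to unit length."""
    cut = emb[:, :dim]
    return cut / np.linalg.norm(cut, axis=1, keepdims=True)

def top1_agreement(query_emb, passage_emb, dim):
    """Fraction of queries whose top-1 passage is unchanged after truncation to dim."""
    full_top1 = np.argmax(query_emb @ passage_emb.T, axis=1)
    q_small = truncate_and_normalize(query_emb, dim)
    p_small = truncate_and_normalize(passage_emb, dim)
    small_top1 = np.argmax(q_small @ p_small.T, axis=1)
    return float(np.mean(full_top1 == small_top1))

# Synthetic data: each query is a noisy copy of one passage (not real embeddings).
rng = np.random.default_rng(0)
raw_passages = rng.normal(size=(50, 384))
raw_queries = raw_passages[:10] + 0.1 * rng.normal(size=(10, 384))
passages = truncate_and_normalize(raw_passages, 384)
queries = truncate_and_normalize(raw_queries, 384)

agreement = {dim: top1_agreement(queries, passages, dim) for dim in (384, 256, 128, 64)}
print(agreement)
```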
Limitations
- The model is specialized for telecom-domain semantic retrieval and may not improve every general-domain task.
- It should be evaluated on your own corpus before production deployment.
- It may encode telecom-specific terminology better than generic embeddings, but retrieval results still require downstream ranking, filtering, or answer verification for high-stakes use cases.
- If trained primarily on QA-style pairs, it may perform best on query-answer retrieval and less strongly on long-document retrieval unless trained/evaluated for that use case.
- The model is English-focused unless additional multilingual data was used and evaluated.
Bias, Risks, and Responsible Use
This model inherits limitations from both the base embedding model and the telecom datasets used for fine-tuning. Retrieved passages may be incomplete, outdated, or semantically similar but operationally incorrect.
For telecom operations, recommendations, root-cause analysis, or configuration changes, retrieved content should be validated against authoritative documentation, engineering procedures, or human expert review.
Example RAG Usage
from sentence_transformers import SentenceTransformer
import numpy as np
model = SentenceTransformer("agraharr/telecom-snowflake-arctic-embed-s")
query = "query: What can cause handover failure in LTE?"
docs = [
    "passage: Handover failure can be caused by poor radio conditions, missing neighbor relations, PCI conflicts, or incorrect mobility parameters.",
    "passage: CPU utilization measures how much processing capacity is used on a server.",
    "passage: The AMF handles registration and mobility management in the 5G core.",
]
q_emb = model.encode([query], normalize_embeddings=True)
d_emb = model.encode(docs, normalize_embeddings=True)
scores = np.dot(q_emb, d_emb.T)[0]
# Sort by score only, highest first
ranked = sorted(zip(scores, docs), key=lambda pair: pair[0], reverse=True)
for score, doc in ranked:
    print(round(float(score), 4), doc)
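For corpora larger than a handful of passages, sorting every score is unnecessary when only the best few are passed to a downstream LLM or reranker. The sketch below selects the top k with NumPy's argpartition; the function name `top_k` and the toy two-dimensional vectors are assumptions for illustration, and it expects unit-length embeddings such as those produced with normalize_embeddings=True.

```python
import numpy as np

def top_k(query_emb, doc_emb, k):
    """Return (indices, scores) of the k highest-scoring documents for one query vector."""
    if k < 1:
        raise ValueError("k must be at least 1")
    scores = doc_emb @ query_emb                         # (n_docs,) dot-product scores
    k = min(k, scores.shape[0])
    candidate_idx = np.argpartition(-scores, k - 1)[:k]  # top k, in no particular order
    order = np.argsort(-scores[candidate_idx])           # sort only those k candidates
    top_idx = candidate_idx[order]
    return top_idx, scores[top_idx]

# Toy unit vectors standing in for passage and query embeddings.
doc_vectors = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]])
query_vector = np.array([0.0, 1.0])

idx, top_scores = top_k(query_vector, doc_vectors, k=2)
print(idx, top_scores)
```

A dedicated vector store such as those listed under Intended Use would replace this brute-force scan once the corpus no longer fits comfortably in memory.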
Citation
If you use this model, please cite the base model and relevant embedding-training work:
@misc{merrick2024arcticembed,
title={Arctic-Embed: Scalable, Efficient, and Accurate Text Embedding Models},
author={Luke Merrick and Danmei Xu and Gaurav Nuti and Daniel Campos},
year={2024},
eprint={2405.05374},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@inproceedings{kusupati2022matryoshka,
title={Matryoshka Representation Learning},
author={Kusupati, Aditya and Bhatt, Gantavya and Rege, Aniket and Wallingford, Matthew and Sinha, Aditya and Ramanujan, Vivek and Howard-Snyder, Will and Chen, Kaifeng and Kakade, Sham and Jain, Prateek and Farhadi, Ali},
booktitle={Advances in Neural Information Processing Systems},
year={2022}
}
Author & Contact
- Trained/curated by: agraharr
- Hugging Face profile: https://huggingface.co/agraharr
- Model repository: https://huggingface.co/agraharr/telecom-snowflake-arctic-embed-s