---
language: en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- embedding
- knowledge-distillation
datasets:
- sentence-transformers/all-nli
metrics:
- cosine_similarity
pipeline_tag: sentence-similarity
---

# PawanEmbd-68M

A 68M-parameter embedding model distilled from Granite-278M.

## Model Details

- **Model Type**: Sentence Embedding Model
- **Architecture**: Transformer-based encoder with projection layer
- **Parameters**: ~68 million
- **Teacher Model**: IBM Granite-278M Multilingual Embedding
- **Training Method**: Knowledge Distillation
- **Output Dimensions**: 768
- **Max Sequence Length**: 512 tokens

## Training Details

This model was trained via knowledge distillation from the [IBM Granite-278M](https://huggingface.co/ibm-granite/granite-embedding-278m-multilingual) teacher model on the All-NLI dataset (SNLI + MultiNLI).

### Training Hyperparameters

- **Dataset**: sentence-transformers/all-nli (100K samples)
- **Epochs**: 20
- **Batch Size**: 32
- **Learning Rate**: 5e-4 with OneCycleLR scheduler
- **Loss Function**: Combined MSE + cosine similarity (α = 0.5, β = 0.5)
- **Mixed Precision**: FP16 (AMP)
- **Hardware**: NVIDIA T4 GPU

## Usage

### Using Transformers

```python
from transformers import AutoModel, AutoTokenizer
import torch
import torch.nn.functional as F

# Load model and tokenizer
model = AutoModel.from_pretrained("dmedhi/PawanEmbd-68M")
tokenizer = AutoTokenizer.from_pretrained("dmedhi/PawanEmbd-68M")

# Encode sentences
sentences = ["This is an example sentence", "Each sentence is converted to a vector"]
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

# Get embeddings
with torch.no_grad():
    outputs = model(**encoded)
    embeddings = outputs.pooler_output  # Already normalized

# Compute similarity
similarity = F.cosine_similarity(embeddings[0:1], embeddings[1:2])
print(f"Similarity: {similarity.item():.4f}")
```

### Using Sentence-Transformers

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# Load the model
model = SentenceTransformer("dmedhi/PawanEmbd-68M")

# Encode sentences
sentences = ["This is an example sentence", "Each sentence is converted to a vector"]
embeddings = model.encode(sentences)
print(f"Embeddings shape: {embeddings.shape}")

# Compute similarity
similarity = cos_sim(embeddings[0], embeddings[1])
print(f"Similarity: {similarity.item():.4f}")
```

## Performance

### Comparison with Teacher Model

| Metric | Teacher (Granite-278M) | Student (PawanEmbd-68M) |
|--------|------------------------|-------------------------|
| Parameters | 278M | 68M (4.1x smaller) |
| Model Size | ~1.1 GB | ~258.7 MB |
| Inference Speed (CPU) | 269.57 ms | 11.57 ms (23.3x faster) |
| Inference Speed (GPU) | 17.94 ms | 2.75 ms (6.5x faster) |
| Cosine Similarity | 1.000 | 0.943 |

## Intended Uses

This model is suitable for:

- ✅ **Semantic Search**: Find similar documents or passages
- ✅ **Clustering**: Group similar texts together
- ✅ **Duplicate Detection**: Identify near-duplicate content
- ✅ **Recommendation Systems**: Find similar items
- ✅ **Question Answering**: Retrieve relevant passages
- ✅ **Sentence Similarity**: Measure semantic similarity between texts

## Training Code

The model was trained using PyTorch with knowledge distillation; a minimal sketch of the distillation objective is shown below.
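As a rough illustration of the loss described under Training Hyperparameters, the sketch below combines an MSE term and a cosine-similarity term over teacher and student embeddings with α = β = 0.5. This is a reconstruction under stated assumptions, not the exact training script; the class name `DistillationLoss` and the random tensors are hypothetical placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DistillationLoss(nn.Module):
    """Illustrative combined distillation loss (hypothetical reconstruction).

    loss = alpha * MSE(student, teacher) + beta * (1 - mean cosine similarity)
    """

    def __init__(self, alpha: float = 0.5, beta: float = 0.5):
        super().__init__()
        self.alpha = alpha
        self.beta = beta

    def forward(self, student_emb: torch.Tensor, teacher_emb: torch.Tensor) -> torch.Tensor:
        # MSE pulls student embeddings toward the teacher's coordinates
        mse = F.mse_loss(student_emb, teacher_emb)
        # Cosine term aligns embedding directions, which matters for similarity search
        cos = 1.0 - F.cosine_similarity(student_emb, teacher_emb, dim=-1).mean()
        return self.alpha * mse + self.beta * cos

# Example with random stand-ins for one batch of 768-dim embeddings
loss_fn = DistillationLoss(alpha=0.5, beta=0.5)
student_emb = torch.randn(32, 768)  # student output (batch, dim)
teacher_emb = torch.randn(32, 768)  # frozen teacher output
loss = loss_fn(student_emb, teacher_emb)
print(f"Distillation loss: {loss.item():.4f}")
```

Using both terms hedges against either failure mode: MSE alone can satisfy magnitudes while drifting in direction, while the cosine term alone ignores scale.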
Training code available at: TODO

## Citation

```
@misc{pawanembdmodel2025,
  author = {Dipankar Medhi},
  title = {PawanEmbd: A Lightweight Embedding Model via Knowledge Distillation},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/dmedhi/PawanEmbd-68M}}
}
```

## Acknowledgments

- Teacher model: [IBM Granite-278M](https://huggingface.co/ibm-granite/granite-embedding-278m-multilingual)
- Training data: [Sentence-Transformers All-NLI](https://huggingface.co/datasets/sentence-transformers/all-nli)
- Framework: Hugging Face Transformers & PyTorch

## License

Apache 2.0

## Contact

For questions or feedback, please open an issue on GitHub.