dmedhi committed · verified
Commit 0244931 · 1 Parent(s): 988445b

Update README.md

Files changed (1):
  1. README.md +4 -22
README.md CHANGED
@@ -21,7 +21,7 @@ A 68M parameter embedding model distilled from Granite-278M
 
 - **Model Type**: Sentence Embedding Model
 - **Architecture**: Transformer-based encoder with projection layer
-- **Parameters**: ~5 million
+- **Parameters**: ~68 million
 - **Teacher Model**: IBM Granite-278M Multilingual Embedding
 - **Training Method**: Knowledge Distillation
 - **Output Dimensions**: 768
@@ -46,7 +46,7 @@ This model was trained using knowledge distillation from the [IBM Granite-278M](
 
 ### Using Transformers
 
-```
+```Python
 from transformers import AutoModel, AutoTokenizer
 import torch
 import torch.nn.functional as F
@@ -72,7 +72,7 @@ print(f"Similarity: {similarity.item():.4f}")
 
 ### Using Sentence-Transformers
 
-```
+```Python
 from sentence_transformers import SentenceTransformer
 from sentence_transformers.util import cos_sim
 
@@ -90,24 +90,6 @@ similarity = cos_sim(embeddings[0], embeddings[1])
 print(f"✅ Similarity: {similarity.item():.4f}")
 ```
 
-======================================================================
-COMPARING INFERENCE SPEED (Student vs Teacher)
-======================================================================
-Average inference time over 100 runs with 10 sentences (max_length=128):
-Teacher Model: 17.944 ms
-Student Model: 2.759 ms
-Student is 6.5x faster than Teacher.
-
-CPU speed comparision
-
-======================================================================
-COMPARING INFERENCE SPEED (Student vs Teacher)
-======================================================================
-Average inference time over 100 runs with 10 sentences (max_length=128):
-Teacher Model: 269.578 ms
-Student Model: 11.577 ms
-Student is 23.3x faster than Teacher.
-
 ## Performance
 
 ### Comparison with Teacher Model
@@ -145,7 +127,7 @@ The model was trained using PyTorch with knowledge distillation. Training code a
 title = {PawanEmbd: A Lightweight Embedding Model via Knowledge Distillation},
 year = {2025},
 publisher = {Hugging Face},
-howpublished = { \url{https://huggingface.co/dmedhi/pawanembd-68m} }
+howpublished = { \url{https://huggingface.co/dmedhi/PawanEmbd-68M} }
 }
 ```
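The diff context truncates both README snippets to their imports, ending with a cosine-similarity printout. As a hedged illustration only (not the README's actual code, and using made-up placeholder vectors rather than real 768-dimensional model outputs), the similarity step those snippets end with reduces to:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Placeholder vectors for illustration; the model's real embeddings
# are 768-dimensional (per the "Output Dimensions" entry above).
emb1 = [0.1, 0.3, 0.5]
emb2 = [0.2, 0.2, 0.6]
print(f"Similarity: {cosine_similarity(emb1, emb2):.4f}")
```

In the README's snippets this computation is handled by `torch.nn.functional` and `sentence_transformers.util.cos_sim` respectively.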
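The speed-comparison block deleted in this commit reports averages over 100 runs. A minimal sketch of that timing methodology, assuming stand-in workloads in place of the actual teacher/student models (which would require downloading both checkpoints):

```python
import time

def average_inference_ms(fn, runs=100, warmup=5):
    """Average wall-clock time of fn() in milliseconds over `runs` calls."""
    for _ in range(warmup):  # warm up caches / lazy initialization first
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) * 1000.0 / runs

# Stand-in workloads for illustration; in the real comparison these
# would be teacher.encode(sentences) and student.encode(sentences).
teacher_ms = average_inference_ms(lambda: sum(i * i for i in range(20000)))
student_ms = average_inference_ms(lambda: sum(i * i for i in range(2000)))
print(f"Teacher: {teacher_ms:.3f} ms, Student: {student_ms:.3f} ms, "
      f"speedup: {teacher_ms / student_ms:.1f}x")
```

Warmup runs matter here: the first calls to a freshly loaded model are typically dominated by one-time costs and would skew a cold average.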