jdpruett committed
Commit 43ad951 · verified · 1 Parent(s): a2cd670

Upload SentenceTransformer export

Files changed (2)
  1. 1_Pooling/config.json +2 -2
  2. README.md +28 -40
1_Pooling/config.json CHANGED
@@ -1,7 +1,7 @@
 {
   "word_embedding_dimension": 384,
-  "pooling_mode_cls_token": true,
-  "pooling_mode_mean_tokens": false,
+  "pooling_mode_cls_token": false,
+  "pooling_mode_mean_tokens": true,
   "pooling_mode_max_tokens": false,
   "pooling_mode_mean_sqrt_len_tokens": false,
   "pooling_mode_weightedmean_tokens": false,
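
The config change above switches the export from CLS-token pooling to mean pooling. As a rough sketch of what those two flags control (the function names, toy vectors, and mask below are illustrative, not taken from this repository), mean pooling averages the token embeddings while ignoring padding positions, whereas CLS pooling keeps only the first token's vector:

```python
import numpy as np

def cls_pool(token_embeddings: np.ndarray) -> np.ndarray:
    """CLS pooling: the sentence embedding is the first token's vector."""
    return token_embeddings[0]

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Mean pooling: average the token vectors, skipping padding positions."""
    mask = attention_mask[:, None].astype(token_embeddings.dtype)  # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)
    n_tokens = max(mask.sum(), 1e-9)  # guard against an all-zero mask
    return summed / n_tokens

# Toy sequence: three 2-d token vectors, the last one is padding.
tokens = np.array([[1.0, 0.0],
                   [3.0, 2.0],
                   [9.0, 9.0]])
mask = np.array([1, 1, 0])

print(cls_pool(tokens))         # [1. 0.] -- first token only
print(mean_pool(tokens, mask))  # [2. 1.] -- mean of the two real tokens
```

Which mode is correct depends on how the checkpoint was trained; E5-style backbones are normally used with mean pooling, which is presumably what this commit corrects.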
README.md CHANGED
@@ -6,16 +6,11 @@ tags:
 - dense
 pipeline_tag: sentence-similarity
 library_name: sentence-transformers
-license: apache-2.0
-language:
-- en
-base_model:
-- intfloat/e5-small-unsupervised
 ---
 
 # SentenceTransformer
 
-This is a [sentence-transformers](https://www.SBERT.net) model that maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
 ## Model Details
 
@@ -40,7 +35,7 @@ This is a [sentence-transformers](https://www.SBERT.net) model that maps sentenc
 ```
 SentenceTransformer(
   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
-  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
   (2): Dense({'in_features': 384, 'out_features': 1024, 'bias': True, 'activation_function': 'torch.nn.modules.linear.Identity'})
 )
 ```
@@ -74,9 +69,9 @@ print(embeddings.shape)
 # Get the similarity scores for the embeddings
 similarities = model.similarity(embeddings, embeddings)
 print(similarities)
-# tensor([[1.0000, 0.8807, 0.8350],
-#         [0.8807, 1.0000, 0.8025],
-#         [0.8350, 0.8025, 1.0000]])
+# tensor([[1.0000, 0.8101, 0.4448],
+#         [0.8101, 1.0000, 0.4627],
+#         [0.4448, 0.4627, 1.0000]])
 ```
 
 <!--
@@ -116,35 +111,6 @@ You can finetune this model on your own dataset.
 -->
 
 ## Training Details
-This model was trained via a re-implementation of the **LEAF** distillation framework (Vujanic & Rueckstiess, 2025). As no official training code was released, the procedure was independently reproduced based on the paper specification.
-
-The student backbone is `intfloat/e5-small-unsupervised` (Wang et al., 2022). Training aligns student embeddings to a stronger teacher model through representation-level distillation, rather than contrastive loss with hard negatives. The objective encourages geometric alignment in embedding space using large-scale unlabeled text.
-
-A projection layer expands the 384-dimensional backbone representations to 1024 dimensions.
-
-## Citation
-
-If you use this model, please cite:
-
-```bibtex
-@misc{mdbr_leaf,
-  title={LEAF: Knowledge Distillation of Text Embedding Models with Teacher-Aligned Representations},
-  author={Robin Vujanic and Thomas Rueckstiess},
-  year={2025},
-  eprint={2509.12539},
-  archivePrefix={arXiv},
-  primaryClass={cs.IR},
-  url={https://arxiv.org/abs/2509.12539}
-}
-
-@article{wang2022e5,
-  title={Text Embeddings by Weakly-Supervised Contrastive Pre-training},
-  author={Wang, Liang and Yang, Nan and Huang, Xiaolong and Jiao, Binxing and Yang, Linjun and Jiang, Daxin and Majumder, Rangan and Wei, Furu},
-  journal={arXiv preprint arXiv:2212.03533},
-  year={2022},
-  url={https://arxiv.org/abs/2212.03533}
-}
-```
 
 ### Framework Versions
 - Python: 3.11.5
@@ -153,4 +119,26 @@ If you use this model, please cite:
 - PyTorch: 2.10.0+cu128
 - Accelerate:
 - Datasets:
-- Tokenizers: 0.22.2
+- Tokenizers: 0.22.2
+
+## Citation
+
+### BibTeX
+
+<!--
+## Glossary
+
+*Clearly define terms in order to be accessible across audiences.*
+-->
+
+<!--
+## Model Card Authors
+
+*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+-->
+
+<!--
+## Model Card Contact
+
+*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+-->
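
The Training Details prose removed by this commit describes representation-level distillation: 384-d student embeddings are projected to 1024 dims (the Dense module in the architecture listing) and aligned to a teacher's embeddings rather than trained contrastively. As a minimal sketch of that kind of alignment objective, assuming a plain MSE loss and a single linear projection (the actual LEAF procedure may differ; all names and shapes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative shapes: 384-d student backbone, 1024-d teacher space.
student = rng.normal(size=(8, 384))      # student sentence embeddings (batch, 384)
teacher = rng.normal(size=(8, 1024))     # teacher embeddings to align to (batch, 1024)
W = rng.normal(size=(384, 1024)) * 0.05  # projection, cf. Dense(in=384, out=1024)

def alignment_loss(W: np.ndarray) -> float:
    projected = student @ W              # (batch, 1024), same space as the teacher
    return float(np.mean((projected - teacher) ** 2))

# One gradient-descent step on the projection shrinks the alignment loss.
grad = 2.0 * student.T @ (student @ W - teacher) / (8 * 1024)
before, after = alignment_loss(W), alignment_loss(W - 0.1 * grad)
print(before > after)  # True
```

In the exported model the whole student backbone, not just the projection, would be optimized against this kind of objective; the sketch only isolates the geometric-alignment idea.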