Commit 6001257 by alanjoshua2005 · verified · 1 Parent(s): ba3a242

Upload folder using huggingface_hub

README.md CHANGED
@@ -1,3 +1,57 @@
- ---
- license: mit
- ---
+ ---
+ language: en
+ tags:
+ - sentence-transformers
+ - embeddings
+ - semantic-search
+ - retrieval
+ license: mit
+ ---
+
+ # BiEncoder RoPE: Sentence Embedding Model
+
+ A 34M-parameter sentence embedding model trained from scratch using PyTorch.
+
+ ## Architecture
+ - 6-layer Transformer encoder with RoPE positional embeddings
+ - Mean pooling + L2 normalization
+ - 256-dim output vectors
+
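The pooling step listed under Architecture can be illustrated without any framework. The sketch below is not the model's actual code: the function name `mean_pool_l2` and the toy 2-dimensional inputs are hypothetical, and it only shows the arithmetic of masked mean pooling followed by L2 normalization.

```python
import math

def mean_pool_l2(token_vectors, attention_mask):
    """Average the unmasked token vectors, then scale the result to unit length."""
    dim = len(token_vectors[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:  # padding positions have mask 0 and are skipped
            count += 1
            for i, value in enumerate(vec):
                summed[i] += value
    mean = [s / max(count, 1) for s in summed]
    norm = math.sqrt(sum(m * m for m in mean)) or 1.0  # avoid dividing by zero
    return [m / norm for m in mean]

# Toy input: three token positions with 2-dim states; the last one is padding.
tokens = [[3.0, 0.0], [3.0, 8.0], [100.0, 100.0]]
mask = [1, 1, 0]
embedding = mean_pool_l2(tokens, mask)
```

Because the padded position is masked out, the mean is `[3.0, 4.0]`, and dividing by its length of 5 gives a unit vector. In the real model the same operation would run over 256-dimensional hidden states.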
+ ## Training (Curriculum)
+ | Phase | Dataset | Loss |
+ |---|---|---|
+ | 1 | all-nli | MNRLoss |
+ | 2 | squad | MNRLoss |
+ | 3 | msmarco-bm25 | HardNegativeLoss |
+ | 4 | natural-questions | MNRLoss |
+
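The README does not define `MNRLoss`. Assuming it denotes a multiple negatives ranking loss in the style of sentence-transformers (an assumption, not confirmed by this commit), the idea can be sketched as cross-entropy over in-batch similarities. The function name `mnr_loss` and the `scale=20.0` value are illustrative choices, not taken from the training code.

```python
import math

def mnr_loss(queries, positives, scale=20.0):
    """In-batch negatives: positives[j] for j != i act as negatives for queries[i].

    Inputs are lists of L2-normalized vectors, so a dot product is a cosine similarity.
    """
    total = 0.0
    for i, q in enumerate(queries):
        logits = [scale * sum(a * b for a, b in zip(q, p)) for p in positives]
        # Cross-entropy with the matching positive (index i) as the target class,
        # computed with the max-subtraction trick for numerical stability.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(queries)

# Correctly paired batch versus a batch whose positives are swapped.
aligned = mnr_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = mnr_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The aligned batch yields a loss close to zero, while the swapped batch is penalized heavily. A hard-negative variant such as the one named for phase 3 would presumably add explicitly mined negatives to the candidate list, but its exact form is not given here.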
+ ## Files
+ - `tokenizer/`: HuggingFace tokenizer (bert-base-uncased)
+ - `pytorch/checkpoint_phase4_nq.pt`: PyTorch weights
+ - `onnx/biencoder_rope.onnx`: ONNX FP32
+ - `onnx/biencoder_rope_int8.onnx`: ONNX INT8 (recommended for CPU)
+
+ ## Usage
+ ```python
+ import torch
+ from transformers import AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("your-username/your-model-name", subfolder="tokenizer")
+ model = BiEncoderRoPE().to("cuda")  # BiEncoderRoPE must be defined or imported first; this commit ships no model code
+ model.load_state_dict(
+     torch.load("pytorch/checkpoint_phase4_nq.pt", map_location="cuda")["model_state"]
+ )
+ model.eval()
+
+ @torch.no_grad()
+ def encode(texts):
+     if isinstance(texts, str): texts = [texts]
+     enc = tokenizer(texts, padding=True, truncation=True,
+                     max_length=256, return_tensors="pt")
+     return model.encode(enc["input_ids"].cuda(), enc["attention_mask"].cuda()).cpu()
+ ```
+
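Since the embeddings are L2-normalized, cosine similarity reduces to a dot product, which is what makes them convenient for the semantic-search use named in the tags. The following is a minimal sketch of that ranking step. The helper `rank` is hypothetical, and the hand-made unit vectors stand in for real `encode()` output.

```python
def rank(query_vec, doc_vecs, top_k=3):
    """Return (index, score) pairs sorted by descending dot product."""
    scores = [sum(q * d for q, d in zip(query_vec, doc)) for doc in doc_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:top_k]]

# Hand-made unit vectors standing in for the output of encode().
docs = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
query = [0.8, 0.6]
hits = rank(query, docs, top_k=2)
```

Here document 1 scores highest (0.96), followed by document 0 (0.8). With real embeddings the same ranking would usually be done as a single matrix product over the tensors that `encode()` returns.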
+ ## Performance
+ - FP32 ONNX size: 134.3 MiB
+ - INT8 ONNX size: 34.6 MiB
+ - Throughput: ~247 sentences/sec on CPU
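The two size figures can be cross-checked against the byte counts in the Git LFS pointers added by this commit. The short calculation below shows that 134.3 and 34.6 correspond to mebibytes (1024 × 1024 bytes) and gives the resulting FP32-to-INT8 size ratio. The variable names are illustrative.

```python
MIB = 1024 * 1024

# Byte counts taken from the Git LFS pointer files in this commit.
fp32_bytes = 140_864_188   # onnx/biencoder_rope.onnx
int8_bytes = 36_265_371    # onnx/biencoder_rope_int8.onnx

fp32_mib = round(fp32_bytes / MIB, 1)            # size of the FP32 export in MiB
int8_mib = round(int8_bytes / MIB, 1)            # size of the INT8 export in MiB
ratio = round(fp32_bytes / int8_bytes, 2)        # how much smaller INT8 is
```

The INT8 file comes out a little under four times smaller than the FP32 one, in line with storing most weights in one byte instead of four.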
onnx/biencoder_rope.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2b4f3959a339f71bb506da4595d23bde7358a70f4a188286ece4b9f4dcf2d004
+ size 140864188
onnx/biencoder_rope_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3585cf0eb03c22ce6005097068c534b6c15accef55150cc71824051524af2061
+ size 36265371
pytorch/checkpoint_phase4_nq.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a94028c66ce14e6802c17667a469af03e37fd0ca0f63118fe194dde150f9c18b
+ size 425475351
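The three binary files above are stored as Git LFS pointers: a spec version, a SHA-256 object ID, and a byte size. As a rough sketch of how a downloaded file could be checked against its pointer, the helpers below parse the pointer text and compare size and digest. The names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any Git LFS tooling.

```python
import hashlib

def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its key/value fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}

def matches_pointer(data, pointer):
    """Check downloaded bytes against the pointer's size and SHA-256 digest."""
    return (len(data) == pointer["size"]
            and hashlib.sha256(data).hexdigest() == pointer["digest"])

# Pointer for pytorch/checkpoint_phase4_nq.pt, copied from this commit.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:a94028c66ce14e6802c17667a469af03e37fd0ca0f63118fe194dde150f9c18b\n"
    "size 425475351\n"
)
```

For a checkpoint of this size (about 425 MB) it would be more practical to hash the file in chunks with `hashlib.sha256().update(...)` than to load it into memory at once. The in-memory version is kept here for brevity.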
tokenizer/tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer/tokenizer_config.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "backend": "tokenizers",
+   "cls_token": "[CLS]",
+   "do_lower_case": true,
+   "is_local": false,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }