Update README.md
README.md
CHANGED
@@ -8,9 +8,16 @@ A set of embedding model trained for study embedding quality vs model architectu
 
 - **cat-emb-2-128**: 2 layers/hidden size 128/4.4m
 - **cat-emb-4-128**: 4 layers/H 128/4.8m
-- **cat-emb-6-128**: 6 layers/H 128/5.2m
 - **cat-emb-8-128**: 8 layers/H 128/5.6m
-- **cat-emb-10-128**: 10 layers/H 128/6.0m
 - **cat-emb-12-128**: 12 layers/H 128/6.4m
 - **cat-emb-2-256**: 2 layers/H 256/9.7m
-- **cat-emb-4-256**: 4 layers/H 256/11.3m
+- **cat-emb-4-256**: 4 layers/H 256/11.3m
+
+### Training
+
+- stage 1: seq 192, batch size 2048, 50k steps, sentence pairs.
+- stage 2: seq 512, batch size 64, 5k steps, sentence triplets.
+
+### Perf
+
+
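For a sense of scale, the two training stages listed in the diff can be written down as a small config and multiplied out. This is only a sketch: the `Stage` class, its field names, and `summarize` are hypothetical and not part of the repository. It also assumes that "batch size" means examples (pairs or triplets) per optimizer step with no gradient accumulation, which the README does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One training stage as listed in the README (field names are hypothetical)."""
    name: str
    max_seq_len: int   # "seq" in the README
    batch_size: int    # assumed: examples per optimizer step
    steps: int
    data: str          # kind of training example

    @property
    def examples_seen(self) -> int:
        # Upper bound on examples consumed, assuming no gradient accumulation.
        return self.batch_size * self.steps


# Values copied from the "Training" section added in this commit.
SCHEDULE = [
    Stage("stage 1", max_seq_len=192, batch_size=2048, steps=50_000, data="sentence pairs"),
    Stage("stage 2", max_seq_len=512, batch_size=64, steps=5_000, data="sentence triplets"),
]


def summarize(schedule):
    """Map each stage name to the number of examples it would consume."""
    return {stage.name: stage.examples_seen for stage in schedule}


if __name__ == "__main__":
    for stage in SCHEDULE:
        print(f"{stage.name}: {stage.examples_seen:,} {stage.data} "
              f"(seq {stage.max_seq_len}, batch {stage.batch_size})")
```

Under these assumptions stage 1 consumes about 102.4 million sentence pairs and stage 2 about 320 thousand triplets, so the second stage reads as a short, long-sequence pass after the bulk of training.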