Update README.md
README.md
CHANGED
@@ -76,9 +76,19 @@ model-index:
 license: mit
 ---
 
-# SentenceTransformer
+# SentenceTransformer (Legacy)
 
-This is a [sentence-transformers](https://www.SBERT.net) model based on
+This is a [sentence-transformers](https://www.SBERT.net) model based on an initial custom ModernBERT-Small architecture, trained from scratch using a multi-stage pipeline including MLM pre-training and semantic fine-tuning. It maps sentences & paragraphs to a **384-dimensional dense vector space**.
+
+## Warning
+
+This model was an early exploration into creating a Wide model.
+
+**⚠️ Legacy Status: NOT RECOMMENDED.**
+
+This initial implementation suffered from suboptimal architectural scaling decisions made during the initialization phase, particularly concerning the feed-forward network capacity relative to the depth.
+
+**👉 Recommended Successor:** For superior performance, speed, and architectural coherence, please use the improved version: [**`johnnyboycurtis/ModernBERT-small-v2`**](https://huggingface.co/johnnyboycurtis/ModernBERT-small-v2). The successor model addresses these limitations via a more sophisticated Guided Weight Initialization (GUIDE) technique and specialized Knowledge Distillation tuning.
 
 ## Model Details
 