# CodeModernBERT-Finch

This model is a code-specific pretrained model created solely using the CodeSearchNet dataset. It supports the six languages included in CodeSearchNet.
For a version fine-tuned specifically for code search tasks, please refer to [Shuu12121/CodeSearch-ModernBERT-Finch](https://huggingface.co/Shuu12121/CodeSearch-ModernBERT-Finch).

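Since the model is pretrained with masked language modeling, it can be queried directly for masked-token prediction. A minimal sketch, assuming the checkpoint id `Shuu12121/CodeModernBERT-Finch` (adjust to the actual repository name):

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Assumed checkpoint id -- adjust if the repository is named differently.
model_id = "Shuu12121/CodeModernBERT-Finch"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Mask the operator in a tiny code snippet
code = f"def add(a, b):\n    return a {tokenizer.mask_token} b"
inputs = tokenizer(code, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Decode the top prediction at the masked position
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted = tokenizer.decode(logits[0, mask_pos].argmax(dim=-1))
print(predicted)
```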
## Architecture

* Base: ModernBERT-style encoder
* Hidden size: 512
* Layers: 6
* Attention heads: 6
* Parameters: ~50M
* Pretraining: Masked Language Modeling (MLM)
* Fine-tuning: Domain-specific code tasks

The results below were obtained by randomly sampling 10,000 examples per language from the CodeSearchNet dataset, fine-tuning on them in a Sentence-BERT fashion, and evaluating on the MTEB CodeSearchNetRetrieval benchmark.
All models listed in the table below were fine-tuned using the same approach. Models marked with 200, as well as the Finch models, were trained with Multiple Negatives Ranking Loss using a batch size of 200; the others used a batch size of 40, since larger batches did not fit into memory.

| Model                              | go    | java  | javascript | php   | python | ruby  |
| ---------------------------------- | ----- | ----- | ---------- | ----- | ------ | ----- |
| Finch(40M)                         | 0.934 | 0.784 | 0.728      | 0.835 | 0.865  | 0.756 |
| Finch-Pre(40M)                     | 0.937 | 0.705 | 0.685      | 0.828 | 0.843  | 0.725 |
| base-Finetuned(149M)               | 0.933 | 0.779 | 0.748      | 0.839 | 0.885  | 0.794 |
| Owl-4.1-Small-Fine-tuned(151M)     | 0.942 | 0.780 | 0.729      | 0.843 | 0.893  | 0.772 |
| Owl-4.1-Small-Fine-tuned-200(151M) | 0.943 | 0.850 | 0.747      | 0.858 | 0.894  | 0.802 |
| CodeBERT-Fine-tuned(125M)          | 0.932 | 0.708 | 0.709      | 0.828 | 0.870  | 0.772 |
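Multiple Negatives Ranking Loss treats every other pair in the batch as a negative, which is why the batch size (200 vs. 40) matters. A minimal PyTorch sketch of the loss, as an illustration rather than the actual training code:

```python
import torch
import torch.nn.functional as F

def multiple_negatives_ranking_loss(query_emb, doc_emb, scale=20.0):
    """In-batch negatives: for query i, document i is the positive
    and all other documents in the batch act as negatives."""
    query_emb = F.normalize(query_emb, dim=-1)
    doc_emb = F.normalize(doc_emb, dim=-1)
    scores = query_emb @ doc_emb.T * scale               # (batch, batch) scaled cosine similarities
    labels = torch.arange(scores.size(0), device=scores.device)  # positives on the diagonal
    return F.cross_entropy(scores, labels)

# Toy batch: 4 (docstring, code) pairs represented by random embeddings,
# with each positive close to its query.
torch.manual_seed(0)
q = torch.randn(4, 512)
d = q + 0.1 * torch.randn(4, 512)
loss = multiple_negatives_ranking_loss(q, d)
print(loss.item())
```

With a batch size of 200, each positive is contrasted against 199 in-batch negatives instead of 39, which is consistent with the gains shown by the 200 rows in the table above.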

---