# CodeModernBERT-Finch

This model is a code-specific pretrained model created solely using the CodeSearchNet dataset. It supports the six languages included in CodeSearchNet.
For a version fine-tuned specifically for code search tasks, please refer to [Shuu12121/CodeSearch-ModernBERT-Finch](https://huggingface.co/Shuu12121/CodeSearch-ModernBERT-Finch).

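Since the model is pretrained with masked language modeling, it can be queried directly for masked-token prediction. A minimal sketch, assuming the checkpoint id `Shuu12121/CodeModernBERT-Finch` (adjust to the actual repository name):

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Assumed checkpoint id -- adjust if the repository is named differently.
model_id = "Shuu12121/CodeModernBERT-Finch"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Mask the operator in a tiny code snippet
code = f"def add(a, b):\n    return a {tokenizer.mask_token} b"
inputs = tokenizer(code, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Decode the top prediction at the masked position
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted = tokenizer.decode(logits[0, mask_pos].argmax(dim=-1))
print(predicted)
```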
## Architecture

* Base: ModernBERT-style encoder
* Hidden size: 512
* Layers: 6
* Attention heads: 6
* Parameters: ~50M
* Pretraining: Masked Language Modeling (MLM)
* Fine-tuning: Domain-specific code tasks

The results below were obtained by randomly sampling 10,000 examples per language from the CodeSearchNet dataset, fine-tuning on them in a Sentence-BERT fashion, and evaluating on the MTEB CodeSearchNetRetrieval benchmark.
All models listed in the table below were fine-tuned using the same approach. Models marked with 200, as well as the Finch models, were trained with Multiple Negatives Ranking Loss using a batch size of 200; the others used a batch size of 40, since larger batches did not fit into memory.

| Model                              | go    | java  | javascript | php   | python | ruby  |
| ---------------------------------- | ----- | ----- | ---------- | ----- | ------ | ----- |
| Finch(40M)                         | 0.934 | 0.784 | 0.728      | 0.835 | 0.865  | 0.756 |
| Finch-Pre(40M)                     | 0.937 | 0.705 | 0.685      | 0.828 | 0.843  | 0.725 |
| base-Finetuned(149M)               | 0.933 | 0.779 | 0.748      | 0.839 | 0.885  | 0.794 |
| Owl-4.1-Small-Fine-tuned(151M)     | 0.942 | 0.780 | 0.729      | 0.843 | 0.893  | 0.772 |
| Owl-4.1-Small-Fine-tuned-200(151M) | 0.943 | 0.850 | 0.747      | 0.858 | 0.894  | 0.802 |
| CodeBERT-Fine-tuned(125M)          | 0.932 | 0.708 | 0.709      | 0.828 | 0.870  | 0.772 |
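Multiple Negatives Ranking Loss treats every other pair in the batch as a negative, which is why the batch size (200 vs. 40) matters. A minimal PyTorch sketch of the loss, as an illustration rather than the actual training code:

```python
import torch
import torch.nn.functional as F

def multiple_negatives_ranking_loss(query_emb, doc_emb, scale=20.0):
    """In-batch negatives: for query i, document i is the positive
    and all other documents in the batch act as negatives."""
    query_emb = F.normalize(query_emb, dim=-1)
    doc_emb = F.normalize(doc_emb, dim=-1)
    scores = query_emb @ doc_emb.T * scale               # (batch, batch) scaled cosine similarities
    labels = torch.arange(scores.size(0), device=scores.device)  # positives on the diagonal
    return F.cross_entropy(scores, labels)

# Toy batch: 4 (docstring, code) pairs represented by random embeddings,
# with each positive close to its query.
torch.manual_seed(0)
q = torch.randn(4, 512)
d = q + 0.1 * torch.randn(4, 512)
loss = multiple_negatives_ranking_loss(q, d)
print(loss.item())
```

With a batch size of 200, each positive is contrasted against 199 in-batch negatives instead of 39, which is consistent with the gains shown by the 200 rows in the table above.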

---