Feature Extraction
sentence-transformers
Safetensors
modernbert
code-search
code-embedding
retrieval
dense
text-embeddings-inference
Shuu12121 committed on
Commit
c6d9d28
·
verified ·
1 Parent(s): c30b665

Update README.md

Files changed (1)
  1. README.md +16 -4
README.md CHANGED
@@ -10,9 +10,10 @@ tags:
  base_model: Shuu12121/NightOwl
  pipeline_tag: feature-extraction
  library_name: sentence-transformers
+ license: apache-2.0
  ---
  
- # NightOwl-CodeEmbedding 🦉
+ # NightOwl CodeEmbedding🦉
  
  `NightOwl-CodeEmbedding` is a 768-dimensional dense embedding model specialized
  for code retrieval, code-edit retrieval, and technical question answering. It
@@ -55,13 +56,13 @@ print(scores)
  | Query/document prefixes | None |
  | Weight dtype | FP32 |
  | Weight memory | 575 MiB |
+ | License | Apache-2.0 |
  
  ## MTEB Results
  
  The model was evaluated using:
  
  - MTEB version: `2.14.5`
- - Model revision: `437f5d1c1aeaf8275507179913d9a3f66c2b0af9`
  - Metric: NDCG@10
  - Hardware: NVIDIA GeForce RTX 5090
  
@@ -91,8 +92,19 @@ bidirectional query-to-document and document-to-query objectives. The generated
  training metadata reports 2,534,400 training samples with one positive and
  fifteen negatives per anchor.
  
- The exact public source dataset names and model license are not currently
- declared in the repository metadata.
+ The training data covers the following MTEB task families:
+
+ - `AppsRetrieval`
+ - `COIRCodeSearchNetRetrieval`
+ - `CodeFeedbackMT`
+ - `CodeFeedbackST`
+ - `CodeSearchNetCCRetrieval`
+ - `CodeSearchNetRetrieval`
+ - `CodeTransOceanContest`
+ - `CodeTransOceanDL`
+ - `CosQA`
+ - `StackOverflowQA`
+ - `SyntheticText2SQL`
  
  ## Limitations
  