Feature Extraction
sentence-transformers
Safetensors
modernbert
code-search
code-embedding
retrieval
dense
text-embeddings-inference
Instructions to use Shuu12121/NightOwl-CodeEmbedding with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use Shuu12121/NightOwl-CodeEmbedding with sentence-transformers:

    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("Shuu12121/NightOwl-CodeEmbedding")

    sentences = [
        "The weather is lovely today.",
        "It's so sunny outside!",
        "He drove to the stadium.",
    ]
    embeddings = model.encode(sentences)

    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)
    # [3, 3]

- Notebooks
- Google Colab
- Kaggle
Update README.md
README.md
CHANGED
@@ -10,9 +10,10 @@ tags:
 base_model: Shuu12121/NightOwl
 pipeline_tag: feature-extraction
 library_name: sentence-transformers
+license: apache-2.0
 ---
 
-# NightOwl
+# NightOwl CodeEmbedding🦉
 
 `NightOwl-CodeEmbedding` is a 768-dimensional dense embedding model specialized
 for code retrieval, code-edit retrieval, and technical question answering. It
@@ -55,13 +56,13 @@ print(scores)
 | Query/document prefixes | None |
 | Weight dtype | FP32 |
 | Weight memory | 575 MiB |
+| License | Apache-2.0 |
 
 ## MTEB Results
 
 The model was evaluated using:
 
 - MTEB version: `2.14.5`
-- Model revision: `437f5d1c1aeaf8275507179913d9a3f66c2b0af9`
 - Metric: NDCG@10
 - Hardware: NVIDIA GeForce RTX 5090
 
@@ -91,8 +92,19 @@ bidirectional query-to-document and document-to-query objectives. The generated
 training metadata reports 2,534,400 training samples with one positive and
 fifteen negatives per anchor.
 
-The
-
+The training data covers the following MTEB task families:
+
+- `AppsRetrieval`
+- `COIRCodeSearchNetRetrieval`
+- `CodeFeedbackMT`
+- `CodeFeedbackST`
+- `CodeSearchNetCCRetrieval`
+- `CodeSearchNetRetrieval`
+- `CodeTransOceanContest`
+- `CodeTransOceanDL`
+- `CosQA`
+- `StackOverflowQA`
+- `SyntheticText2SQL`
 
 ## Limitations
 
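
The MTEB Results section touched by this diff reports NDCG@10 as its metric. For readers unfamiliar with it, here is a minimal sketch of how NDCG@k can be computed for a single query. It assumes the linear-gain form (relevance divided by a log2 rank discount); some tools use an exponential gain instead, and the helper names `dcg_at_k` and `ndcg_at_k` are illustrative, not part of MTEB.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k relevance grades, in ranked order."""
    # rank is 0-based, so the discount for the top result is log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, all_relevances, k=10):
    """NDCG@k: DCG of the system ranking divided by DCG of the ideal ranking.

    ranked_relevances: relevance grade of each retrieved document, best-ranked first.
    all_relevances: relevance grades of every judged document for the query.
    """
    ideal_dcg = dcg_at_k(sorted(all_relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents exist for this query
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


# One relevant document, retrieved at rank 3: DCG = 1 / log2(4) = 0.5, ideal DCG = 1.
score = ndcg_at_k([0, 0, 1], [1])
print(score)
```

A benchmark score is then the mean of this value over all queries in a task, so a single relevant document slipping from rank 1 to rank 3 halves that query's contribution.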
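
The third hunk mentions bidirectional query-to-document and document-to-query objectives, with one positive and fifteen negatives per anchor. The diff does not show which loss function was used, so the following is only a sketch of one common choice for that setup: an InfoNCE-style cross-entropy over 16 candidates, averaged across the two directions. The `temperature` value and the function names are assumptions for illustration.

```python
import math


def info_nce(pos_score, neg_scores, temperature=0.05):
    """Cross-entropy of the positive against 1 + len(neg_scores) candidates.

    Scores are similarities (e.g. cosine); temperature is an assumed hyperparameter.
    """
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[0]


def bidirectional_loss(q2d_pos, q2d_negs, d2q_pos, d2q_negs, temperature=0.05):
    """Average of the query-to-document and document-to-query losses (assumed weighting)."""
    q2d = info_nce(q2d_pos, q2d_negs, temperature)
    d2q = info_nce(d2q_pos, d2q_negs, temperature)
    return 0.5 * (q2d + d2q)


# An uninformative model scores all 16 candidates equally, so the loss is ln(16).
uniform = bidirectional_loss(0.5, [0.5] * 15, 0.5, [0.5] * 15)
print(uniform)
```

With fifteen negatives, the loss of a model that cannot tell candidates apart is ln 16 ≈ 2.77, which gives a rough baseline for reading training curves under this assumed objective.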