Tags: Sentence Similarity · Safetensors · sentence-transformers · English · PyLate · modernbert · ColBERT · feature-extraction · code-search · knowledge-distillation · apple-silicon · mps · text-embeddings-inference
Instructions to use ctrltokyo/ColBERT-Zero-6L-CodeSearch with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use ctrltokyo/ColBERT-Zero-6L-CodeSearch with sentence-transformers:
```python
from pylate import models

queries = [
    "Which planet is known as the Red Planet?",
    "What is the largest planet in our solar system?",
]
documents = [
    ["Mars is the Red Planet.", "Venus is Earth's twin."],
    ["Jupiter is the largest planet.", "Saturn has rings."],
]

model = models.ColBERT(model_name_or_path="ctrltokyo/ColBERT-Zero-6L-CodeSearch")
queries_emb = model.encode(queries, is_query=True)
docs_emb = model.encode(documents, is_query=False)
```

- Notebooks
- Google Colab
- Kaggle
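The query and document token embeddings produced above are compared with ColBERT's late-interaction MaxSim operator: each query token is matched against its best-scoring document token, and the per-token maxima are summed. A minimal NumPy sketch of that scoring (illustrative only, not the model's or PyLate's internal code):

```python
import numpy as np

def maxsim(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """ColBERT MaxSim: for each query token, take the best-matching
    document token (by dot product), then sum over query tokens."""
    sim = query_emb @ doc_emb.T          # (n_query_tokens, n_doc_tokens)
    return float(sim.max(axis=1).sum())  # best document token per query token

# Toy 2-dimensional token embeddings, purely illustrative.
q = np.array([[1.0, 0.0], [0.0, 1.0]])
d = np.array([[1.0, 0.0], [0.7, 0.7]])
score = maxsim(q, d)  # 1.0 + 0.7 = 1.7
```

In practice you would feed `queries_emb` and `docs_emb` from the snippet above into this kind of scoring (PyLate ships its own ranking utilities for that).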
Update README.md
README.md CHANGED

````diff
@@ -39,7 +39,7 @@ A **6-layer ColBERT model** distilled from [ColBERT-Zero](https://huggingface.co
 
 ## Benchmark Results
 
-Evaluated on 3 code search corpora (150 questions total) via [litembeddings](https://github.com/
+Evaluated on 3 code search corpora (150 questions total) via [litembeddings](https://github.com/alexandernicholson/litembeddings):
 
 | Corpus | Teacher MRR | Student MRR | % of Teacher | Student Query Speed |
 |--------|------------|-------------|--------------|---------------------|
@@ -145,7 +145,7 @@ reranked = rank.rerank(
 
 ## GGUF / litembeddings
 
-This model can be converted to GGUF format for use with [litembeddings](https://github.com/
+This model can be converted to GGUF format for use with [litembeddings](https://github.com/alexandernicholson/litembeddings) (SQLite-based embedding engine with SIMD-accelerated MaxSim):
 
 ```bash
 # Convert to GGUF
@@ -191,5 +191,5 @@ SELECT lembed_maxsim(
 
 - [ColBERT-Zero](https://huggingface.co/lightonai/ColBERT-Zero) by LightOn AI — the teacher model
 - [PyLate](https://github.com/lightonai/pylate) — ColBERT training framework
-- [litembeddings](https://github.com/
+- [litembeddings](https://github.com/alexandernicholson/litembeddings) — SQLite embedding engine used for benchmarking
 - Training and experimentation performed entirely on Apple Silicon (M4 Max) using PyTorch MPS backend
````
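The `lembed_maxsim` SQL call in the diff above belongs to litembeddings itself. As a rough illustration of the underlying idea only (plain `sqlite3` plus NumPy, not litembeddings' actual extension API; the table schema and function names here are hypothetical), per-token embeddings can be stored as blobs and scored with brute-force MaxSim:

```python
import sqlite3

import numpy as np

def maxsim(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    # Sum, over query tokens, of the best-matching document token.
    return float((query_emb @ doc_emb.T).max(axis=1).sum())

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, shape TEXT, emb BLOB)")

def add_doc(doc_id: int, emb: np.ndarray) -> None:
    # Store per-token embeddings as a float32 blob plus their shape.
    emb = emb.astype(np.float32)
    con.execute("INSERT INTO docs VALUES (?, ?, ?)",
                (doc_id, f"{emb.shape[0]},{emb.shape[1]}", emb.tobytes()))

def search(query_emb: np.ndarray, k: int = 5):
    # Brute-force MaxSim over every stored document, best first.
    scored = []
    for doc_id, shape, blob in con.execute("SELECT id, shape, emb FROM docs"):
        rows, cols = map(int, shape.split(","))
        doc = np.frombuffer(blob, dtype=np.float32).reshape(rows, cols)
        scored.append((doc_id, maxsim(query_emb, doc)))
    return sorted(scored, key=lambda t: t[1], reverse=True)[:k]

# Toy token embeddings, purely illustrative.
add_doc(1, np.array([[1.0, 0.0], [0.0, 1.0]]))
add_doc(2, np.array([[0.5, 0.5]]))
query = np.array([[1.0, 0.0]], dtype=np.float32)
results = search(query)
```

litembeddings pushes this scoring into SQLite itself with SIMD acceleration; the sketch only shows the shape of the data and the MaxSim ranking loop.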