---
license: cc-by-nc-4.0
---


# CPRetriever-Prob

**CPRetriever-Prob** is a sentence embedding model trained for competitive programming problem retrieval.

This model can be used directly through the `sentence-transformers` library.

To try **CPRet** on competitive programming problem retrieval, visit https://cpret.online/, which is powered by the **CPRetriever-Prob** model.

## 🔧 Usage

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("coldchair16/CPRetriever-Prob")
embeddings = model.encode([
    "Given a sequence of n numbers, answer m range mex queries.",
    "求一个长度为 n 的数列的区间 mex。"
])
```
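
Once two statements are embedded, a natural next step is to compare them. The sketch below computes pairwise cosine similarity with NumPy. The helper name `cosine_similarity_matrix` is hypothetical, and cosine similarity is an assumption here: this card does not state which similarity measure the model was trained with.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Return the matrix of cosine similarities between rows of a and rows of b.

    Hypothetical helper for illustration; it is not part of CPRet.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Normalise each row to unit length, guarding against zero vectors.
    a = a / np.clip(np.linalg.norm(a, axis=1, keepdims=True), 1e-12, None)
    b = b / np.clip(np.linalg.norm(b, axis=1, keepdims=True), 1e-12, None)
    # The dot product of unit vectors is their cosine similarity.
    return a @ b.T
```

With the `embeddings` array from the snippet above, `cosine_similarity_matrix(embeddings, embeddings)[0, 1]` would give the similarity between the English and Chinese statements. `sentence_transformers.util.cos_sim` offers an equivalent computation.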



## 💡 Applications

This model powers the retrieval demo in [CPRet](https://github.com/coldchair/CPRet), supporting several practical use cases:

* It can assist in **duplicate problem detection** by retrieving potentially similar problems; final identification still requires manual verification.
* It also supports **similar problem retrieval** to help broaden your problem-solving perspective.
* You can input either a **full problem description** or a **simplified version**, and the system will return the most relevant existing problems.
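
The retrieval step behind these use cases can be sketched as a nearest-neighbour search over precomputed problem embeddings. This is a minimal illustration, not the CPRet implementation: the function name `top_k_similar`, the use of cosine similarity, and the brute-force search are all assumptions.

```python
import numpy as np


def top_k_similar(query_embedding, corpus_embeddings, k=3):
    """Return up to k (index, score) pairs for the corpus rows closest to the query.

    Hypothetical helper: ranks by cosine similarity with a brute-force scan.
    """
    q = np.asarray(query_embedding, dtype=np.float64)
    c = np.asarray(corpus_embeddings, dtype=np.float64)
    # Normalise so that dot products equal cosine similarities.
    q = q / max(np.linalg.norm(q), 1e-12)
    c = c / np.clip(np.linalg.norm(c, axis=1, keepdims=True), 1e-12, None)
    scores = c @ q
    # Sort by descending score and keep the first k indices.
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]
```

In practice, `corpus_embeddings` would come from `model.encode(...)` over the existing problem statements and `query_embedding` from encoding the new statement. For duplicate detection, the returned candidates are only suggestions that still need manual review.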

## 📚 Training and Evaluation

For the training pipeline, evaluation benchmark, and retrieval demo, see the full codebase:
👉 [CPRet on GitHub](https://github.com/coldchair/CPRet?tab=readme-ov-file)

## 📦 Model Card

* Architecture: `Salesforce/SFR-Embedding-Code-2B_R` (encoder backbone)
* Tasks: Contrastive pretraining + task-specific fine-tuning on programming problem pairs
* Format: Compatible with `sentence-transformers`