---
license: cc-by-nc-4.0
---

# CPRetriever-Prob

**CPRetriever-Prob** is a sentence embedding model trained for competitive programming problem retrieval.

The model can be used directly through the `sentence-transformers` library.

Visit https://cpret.online/ to try **CPRet** for competitive programming problem retrieval, powered by the **CPRetriever-Prob** model.

## 🔧 Usage

```python
from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub.
model = SentenceTransformer("coldchair16/CPRetriever-Prob")

# Encode a batch of problem statements (here one English, one Chinese).
embeddings = model.encode([
    "Given a sequence of n numbers, answer m range mex queries.",
    "求一个长度为 n 的数列的区间 mex。",  # "Find the range mex of a sequence of length n."
])
```

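To compare the embeddings returned by `model.encode`, one common choice is cosine similarity. The sketch below is a dependency-free illustration, not the library's own routine: the helper name `cosine_similarity` is hypothetical, and it assumes cosine similarity is a suitable score for this model. In practice, `sentence_transformers.util.cos_sim` performs the same computation on tensors.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)
```

With the `embeddings` array from the usage snippet, `cosine_similarity(embeddings[0], embeddings[1])` scores the English and Chinese statements against each other.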
## 💡 Applications

This model powers the retrieval demo in [CPRet](https://github.com/coldchair/CPRet), supporting several practical use cases:

* It can assist in **duplicate problem detection** by retrieving potentially similar problems; final identification still requires manual verification.
* It also supports **similar problem retrieval** to help broaden your problem-solving perspective.
* You can input either a **full problem description** or a **simplified version**, and the system will return the most relevant existing problems.

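The retrieval use cases above can be sketched as a nearest-neighbour search over precomputed problem embeddings. This is a minimal illustration under stated assumptions, not the CPRet implementation: `top_k_similar` and its parameters are hypothetical names, cosine similarity is assumed as the ranking score, and the vectors are expected to come from `model.encode(...)`.

```python
import math
from typing import List, Sequence, Tuple


def _normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k_similar(
    query_vec: Sequence[float],
    corpus_vecs: Sequence[Sequence[float]],
    k: int = 5,
) -> List[Tuple[int, float]]:
    """Return (corpus index, cosine score) for the k best matches, best first.

    Hypothetical helper: the name and signature are illustrative only.
    """
    q = _normalize(query_vec)
    scored = []
    for idx, vec in enumerate(corpus_vecs):
        c = _normalize(vec)
        scored.append((idx, sum(a * b for a, b in zip(q, c))))
    # Highest score first; the sort is stable, so ties keep corpus order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A call such as `top_k_similar(model.encode(query), model.encode(problem_statements), k=10)` would return indices into `problem_statements`, a hypothetical list of existing problem texts. For duplicate detection, the top hits are only candidates and still need a manual check.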
## 📚 Training and Evaluation

For the training pipeline, evaluation benchmark, and retrieval demo, see the full codebase:

👉 [CPRet on GitHub](https://github.com/coldchair/CPRet?tab=readme-ov-file)

## 📦 Model Card

* Architecture: `Salesforce/SFR-Embedding-Code-2B_R` (encoder backbone)
* Tasks: Contrastive pretraining + task-specific fine-tuning on programming problem pairs
* Format: Compatible with `sentence-transformers`