---
license: cc-by-nc-4.0
---

# CPRetriever-Prob

**CPRetriever-Prob** is a sentence embedding model trained for competitive programming problem retrieval. The model can be used directly through the `sentence-transformers` library.

Visit https://cpret.online/ to try **CPRet**, a competitive programming problem retrieval demo powered by the **CPRetriever-Prob** model.

## 🔧 Usage

```python
from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub.
model = SentenceTransformer("coldchair16/CPRetriever-Prob")

# Encode problem statements; English and Chinese inputs are both accepted.
embeddings = model.encode([
    "Given a sequence of n numbers, answer m range mex queries.",
    "求一个长度为 n 的数列的区间 mex。"
])
```

## 💡 Applications

This model powers the retrieval demo in [CPRet](https://github.com/coldchair/CPRet), supporting several practical use cases:

* It can assist in **duplicate problem detection** by retrieving potentially similar problems; final identification still requires manual verification.
* It also supports **similar problem retrieval** to help broaden your problem-solving perspective.
* You can input either a **full problem description** or a **simplified version**, and the system will return the most relevant existing problems.

## 📚 Training and Evaluation

For the training pipeline, evaluation benchmark, and retrieval demo, please refer to the full codebase:

👉 [CPRet on GitHub](https://github.com/coldchair/CPRet?tab=readme-ov-file)

## 📦 Model Card

* Architecture: `Salesforce/SFR-Embedding-Code-2B_R` (encoder backbone)
* Tasks: Contrastive pretraining + task-specific fine-tuning on programming problem pairs
* Format: Compatible with `sentence-transformers`
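
## 🔍 Retrieval Sketch

The similar-problem and duplicate-detection use cases listed under Applications come down to embedding a query together with candidate problems and ranking the candidates by similarity. The snippet below is a minimal sketch of that idea, not the CPRet demo's actual implementation. It assumes cosine similarity as the ranking score, and the helper names `cosine` and `retrieve` are hypothetical. The `model` argument can be any object with an `encode(list_of_texts)` method that returns one vector per text, such as the `SentenceTransformer` instance from the Usage section.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(float(a) * float(b) for a, b in zip(u, v))
    norm_u = math.sqrt(sum(float(a) ** 2 for a in u))
    norm_v = math.sqrt(sum(float(b) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(model, query, problems, top_k=3):
    """Rank `problems` by similarity to `query`; returns (problem, score) pairs.

    Assumption: cosine similarity is used as the score. The model card does
    not state which similarity measure the CPRet demo uses.
    """
    problems = list(problems)
    # Encode the query and all candidates in a single call.
    vectors = model.encode([query] + problems)
    query_vec, problem_vecs = vectors[0], vectors[1:]
    scored = [(p, cosine(query_vec, v)) for p, v in zip(problems, problem_vecs)]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

To try it with the real model, pass `SentenceTransformer("coldchair16/CPRetriever-Prob")` as `model`, a short problem summary as `query`, and a list of problem statements as `problems`. The pure-Python loop keeps the sketch free of extra dependencies; for a large problem set, a vectorized similarity computation would be a more practical choice. As noted under Applications, highly ranked results are only candidates, and confirming a duplicate still requires manual verification.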