| --- |
| base_model: |
| - Qwen/Qwen2.5-1.5B-Instruct |
| language: |
| - en |
| license: mit |
| library_name: transformers |
| pipeline_tag: text-generation |
| --- |
| |
| # DECS NRP Detector |
|
|
| This repository contains the NRP (Necessary Reasoning Prefix) detector model used in the DECS algorithm, as presented in the paper [Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling](https://huggingface.co/papers/2509.25827). |
|
|
| The NRP detector is designed to determine whether a given reasoning chunk contains the ground truth signal, enabling surgically precise token-level rewards to reduce "overthinking" in reasoning models. |
|
|
| - **Project Page:** [https://pixas.github.io/decs-iclr26-site/](https://pixas.github.io/decs-iclr26-site/) |
| - **Repository:** [https://github.com/pixas/DECS](https://github.com/pixas/DECS) |
| - **Paper:** [arXiv:2509.25827](https://huggingface.co/papers/2509.25827) |
|
|
| ## Usage |
|
|
| According to the official repository, you can deploy the NRP detector using `vLLM`: |
|
|
| ```bash |
| vllm serve --model pixas/DECS_NRP_DETECTOR --port 10041 |
| ``` |
|
|
| ## Citation |
|
|
| If you use this model, please cite the following work: |
|
|
| ```bibtex |
| @inproceedings{jiang2026decs, |
| title = {Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling}, |
| author = {Jiang, Shuyang and Tao, Xiaofeng and Zhang, Kui and Xiao, Yanghua}, |
| booktitle = {International Conference on Learning Representations (ICLR)}, |
| year = {2026}, |
| note = {Oral}, |
| url = {https://arxiv.org/abs/2509.25827} |
| } |
| ``` |