nielsr HF Staff commited on
Commit
f93be80
·
verified ·
1 Parent(s): a8918bb

Improve model card: add metadata, links, and usage info

Browse files

Hi, I'm Niels from the community science team at Hugging Face.

This PR improves the model card for the DECS NRP Detector by:
- Adding `library_name: transformers` and `pipeline_tag: text-generation` to the metadata.
- Adding links to the associated paper, GitHub repository, and project page.
- Including a sample usage section based on the official GitHub repository instructions.
- Updating the citation block to reflect the latest information.

These changes help users discover, understand, and run the model more easily.

Files changed (1) hide show
  1. README.md +29 -12
README.md CHANGED
@@ -1,25 +1,42 @@
1
  ---
2
- license: mit
3
- language:
4
- - en
5
  base_model:
6
  - Qwen/Qwen2.5-1.5B-Instruct
 
 
 
 
 
7
  ---
 
8
  # DECS NRP Detector
9
 
10
- This repository contains the NRP detector model used in the DECS algorithm. It is designed to determine
11
- whether a given reasoning chunk contains the ground truth signal.
 
 
 
 
 
 
 
 
 
 
 
 
 
12
 
13
  ## Citation
14
 
15
- If you use this model, please cite:
16
 
17
  ```bibtex
18
- @inproceedings{jiang2026overthinking,
19
- title={Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling},
20
- author={Shuyang Jiang and Yusheng Liao and Ya Zhang and Yanfeng Wang and Yu Wang},
21
- booktitle={The Fourteenth International Conference on Learning Representations},
22
- year={2026},
23
- url={https://openreview.net/forum?id=kdeiRledV6}
 
24
  }
25
  ```
 
1
  ---
 
 
 
2
  base_model:
3
  - Qwen/Qwen2.5-1.5B-Instruct
4
+ language:
5
+ - en
6
+ license: mit
7
+ library_name: transformers
8
+ pipeline_tag: text-generation
9
  ---
10
+
11
  # DECS NRP Detector
12
 
13
+ This repository contains the NRP (Next Reasoning Point) detector model used in the DECS algorithm, as presented in the paper [Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling](https://huggingface.co/papers/2509.25827).
14
+
15
+ The NRP detector is designed to determine whether a given reasoning chunk contains the ground truth signal, enabling surgically precise token-level rewards to reduce "overthinking" in reasoning models.
16
+
17
+ - **Project Page:** [https://pixas.github.io/decs-iclr26-site/](https://pixas.github.io/decs-iclr26-site/)
18
+ - **Repository:** [https://github.com/pixas/DECS](https://github.com/pixas/DECS)
19
+ - **Paper:** [arXiv:2509.25827](https://huggingface.co/papers/2509.25827)
20
+
21
+ ## Usage
22
+
23
+ According to the official repository, you can deploy the NRP detector using `vLLM`:
24
+
25
+ ```bash
26
+ vllm serve --model pixas/DECS_NRP_DETECTOR --port 10041
27
+ ```
28
 
29
  ## Citation
30
 
31
+ If you use this model, please cite the following work:
32
 
33
  ```bibtex
34
+ @inproceedings{jiang2026decs,
35
+ title = {Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling},
36
+ author = {Jiang, Shuyang and Tao, Xiaofeng and Zhang, Kui and Xiao, Yanghua},
37
+ booktitle = {International Conference on Learning Representations (ICLR)},
38
+ year = {2026},
39
+ note = {Oral},
40
+ url = {https://arxiv.org/abs/2509.25827}
41
  }
42
  ```