zihanliu committed · Commit 976a60e · verified · 1 Parent(s): 8a06be1

Update README.md

Files changed (1): README.md (+19 −0)
@@ -77,6 +77,25 @@ Notably, RLHF for alignment, when used as a pre-step, boosts the model’s compl
 
 | ***Tool Calling*** |
 | BFCL V3 | 70.4 | 67.9 | 68.6 | 67.5 |
 
+
+## Usage Recommendations
+
+We recommend using RoPE scaling with the [YaRN](https://arxiv.org/abs/2309.00071) method to better support long-context inputs. This can be enabled by updating the model’s `config.json` as shown below:
+```json
+{
+  ...,
+  "rope_scaling": {
+    "rope_type": "yarn",
+    "factor": 2.0,
+    "original_max_position_embeddings": 32768
+  }
+}
+```
+
+- **Nemotron-Cascade-14B-Thinking**: use `factor: 3.0` to extend the context length to 90K tokens for SWE-bench Verified (Agentless), and `factor: 2.0` to extend it to 64K tokens for all other benchmarks.
+- **Nemotron-Cascade-8B** and **Nemotron-Cascade-8B-Thinking**: use `factor: 2.0` across all benchmarks.
+
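The `config.json` edit described above can also be applied programmatically before loading the model. Below is a minimal sketch; the `enable_yarn` helper and the toy config dict are illustrative, not part of the model repository:

```python
import json

def enable_yarn(config: dict, factor: float, original_max: int = 32768) -> dict:
    """Return a copy of `config` with the YaRN rope_scaling block added.

    `factor` follows the recommendations above (e.g. 2.0 for a 64K
    effective context when the original window is 32768 tokens).
    """
    updated = dict(config)  # shallow copy; leaves the input untouched
    updated["rope_scaling"] = {
        "rope_type": "yarn",
        "factor": factor,
        "original_max_position_embeddings": original_max,
    }
    return updated

# Toy stand-in for the model's real config.json contents.
config = {"max_position_embeddings": 32768}
patched = enable_yarn(config, factor=2.0)
print(json.dumps(patched["rope_scaling"], indent=2))
```

In practice you would read the model's actual `config.json`, apply the patch, and write it back (or pass the updated dict to your loading code) before running long-context inference.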
 ## Evaluation Toolkit
 
 To reproduce our results, please check the evaluation code, scripts, and cached prediction files at https://huggingface.co/nvidia/Nemotron-Cascade-14B-Thinking/blob/main/evaluation/README.md