zihanliu committed · verified · Commit 3d414df · Parent: be323ab

Update README.md

Files changed (1): README.md (+3 −3)

README.md CHANGED
@@ -80,7 +80,7 @@ Notably, RLHF for alignment, when used as a pre-step, boosts the model’s compl
 
 ## Usage Recommendations
 
-We recommend using RoPE scaling with the [YaRN](https://arxiv.org/abs/2309.00071) method to better support long-context inputs. This can be enabled by updating the model’s config.json as shown below:
+We recommend using RoPE scaling with the [YaRN](https://arxiv.org/abs/2309.00071) method to better support long-context inputs. This can be enabled by updating the model’s `config.json` as shown below:
 ```json
 {
 ...,
@@ -92,8 +92,8 @@ We recommend using RoPE scaling with the [YaRN](https://arxiv.org/abs/2309.00071
 }
 ```
 
-- **Nemotron-Cascade-14B-Thinking**: use ```factor: 3.0``` to extend the context length to 90K tokens for SWE Verified (Agentless), and ```factor: 2.0``` to extend the context length to 64K tokens for other benchmarks.
-- **Nemotron-Cascade-8B** and **Nemotron-Cascade-8B-Thinking**: use ```factor: 2.0``` across all benchmarks.
+- **Nemotron-Cascade-14B-Thinking**: use `factor: 3.0` to extend the context length to 90K tokens for SWE Verified (Agentless), and `factor: 2.0` to extend the context length to 64K tokens for other benchmarks.
+- **Nemotron-Cascade-8B** and **Nemotron-Cascade-8B-Thinking**: use `factor: 2.0` across all benchmarks.
 
 
 ## Evaluation Toolkit
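The `config.json` snippet in the diff above is elided (`...,`), so as a hedged illustration only, the helper below sketches how the recommended YaRN setting might be applied programmatically. The `rope_scaling` key layout (`rope_type`, `factor`) is an assumption following the Hugging Face `transformers` convention, not taken from this README; verify the exact fields against the full model card before use.

```python
import json

def enable_yarn(config_path: str, factor: float) -> dict:
    """Write a YaRN rope_scaling entry into a model's config.json.

    Assumption: the key layout ("rope_scaling" -> "rope_type"/"factor")
    follows the Hugging Face transformers convention; check the model's
    full README for the exact fields it expects.
    """
    with open(config_path) as f:
        cfg = json.load(f)
    cfg["rope_scaling"] = {
        "rope_type": "yarn",  # select the YaRN scaling method
        "factor": factor,     # e.g. 2.0 or 3.0, per the recommendations above
    }
    with open(config_path, "w") as f:
        json.dump(cfg, f, indent=2)
    return cfg
```

For example, `enable_yarn(path, 2.0)` would apply the factor recommended for the 8B variants across all benchmarks.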