Update README.md
## Usage Recommendations

We recommend using RoPE scaling with the [YaRN](https://arxiv.org/abs/2309.00071) method to better support long-context inputs. This can be enabled by updating the model’s `config.json` as shown below:

```json
{
  ...,
}
```

- **Nemotron-Cascade-14B-Thinking**: use `factor: 3.0` to extend the context length to 90K tokens for SWE Verified (Agentless), and `factor: 2.0` to extend the context length to 64K tokens for other benchmarks.
- **Nemotron-Cascade-8B** and **Nemotron-Cascade-8B-Thinking**: use `factor: 2.0` across all benchmarks.
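For orientation, a YaRN entry in `config.json` typically takes the shape sketched below, assuming the Hugging Face Transformers `rope_scaling` schema (`rope_type`, `factor`, `original_max_position_embeddings`). The base-context value here is an illustrative placeholder, not taken from this model's actual config — consult the model card for the exact fields and values:

```json
{
  ...,
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 2.0,
    "original_max_position_embeddings": 32768
  }
}
```

Set `factor` per the per-model recommendations above; the extended context length is roughly `factor` × the original maximum position embeddings.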
## Evaluation Toolkit