Quantizations of https://huggingface.co/nvidia/OpenCodeReasoning-Nemotron-1.1-7B

Open source inference clients/UIs

Closed source inference clients/UIs


From the original README


```yaml
base_model:
  - Qwen/Qwen2.5-7B-Instruct
datasets:
  - nvidia/OpenCodeReasoning
language:
  - en
library_name: transformers
tags:
  - nvidia
  - code
pipeline_tag: text-generation
```

OpenCodeReasoning-Nemotron-1.1-7B Overview

Description:

OpenCodeReasoning-Nemotron-1.1-7B is a large language model (LLM) derived from Qwen2.5-7B-Instruct (the reference model). It is a reasoning model post-trained for code generation. The model supports a context length of 64k tokens.

This model is ready for commercial/non-commercial use.

Results

The results below are the average of 64 evaluations on LiveCodeBench (v5) [2408-2501]; a sketch of this averaging follows the table.

| Model                              | Pass@1 |
|------------------------------------|--------|
| DeepSeek-R1-0528                   | 73.4   |
| DeepSeek-R1                        | 65.6   |
| QwQ-32B                            | 61.3   |
| **Distilled 7B+ Models**           |        |
| Bespoke-Stratos-7B                 | 14.7   |
| OpenThinker-7B                     | 25.5   |
| R1-Distill-Qwen-7B                 | 38.0   |
| OlympicCoder-7B                    | 40.9   |
| OpenCodeReasoning-Nemotron-7B      | 51.3   |
| OpenCodeReasoning-Nemotron-1.1-7B  | 55.5   |
| **Distilled 14B+ Models**          |        |
| R1-Distill-Qwen-14B                | 51.3   |
| OpenCodeReasoning-Nemotron-14B     | 59.4   |
| OpenCodeReasoning-Nemotron-1.1-14B | 65.9   |
| **Distilled 32B+ Models**          |        |
| Bespoke-Stratos-32B                | 30.1   |
| OpenThinker-32B                    | 54.1   |
| R1-Distill-Qwen-32B                | 58.1   |
| OlympicCoder-32B                   | 57.4   |
| OpenCodeReasoning-Nemotron-32B     | 61.7   |
| OpenCodeReasoning-Nemotron-1.1-32B | 69.9   |
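For context on how such a number is produced, here is a minimal, hypothetical sketch of averaging pass@1 over repeated benchmark runs. The `average_pass_at_1` function and the toy data are illustrative only; they are not NVIDIA's evaluation harness.

```python
def average_pass_at_1(runs: list[list[bool]]) -> float:
    """runs[i][j] is True if problem j was solved in evaluation run i.

    Pass@1 for a single run is the fraction of problems solved; the
    reported figure is the mean over all runs (64 in the table above).
    """
    per_run = [100.0 * sum(run) / len(run) for run in runs]
    return sum(per_run) / len(per_run)

# Toy example with 3 runs over 4 problems: per-run pass@1 is 75, 50, 75,
# so the averaged score is ~66.7.
print(average_pass_at_1([
    [True, True, False, True],
    [True, False, False, True],
    [True, True, True, False],
]))
```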

How to use the models?

To run inference on coding problems:

````python
import transformers
import torch

model_id = "nvidia/OpenCodeReasoning-Nemotron-1.1-7B"

# Load the model in bfloat16 and spread it across available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Prompt template; the coding problem is substituted for {user}.
prompt = """You are a helpful and harmless assistant. You should think step-by-step before responding to the instruction below.

Please use python programming language only.

You must use ```python for just the final solution code block with the following format:
```python
# Your code here
```

{user}
"""

messages = [
    {
        "role": "user",
        "content": prompt.format(user="Write a program to calculate the sum of the first $N$ fibonacci numbers"),
    },
]

# Generous token budget: the model emits a long reasoning trace before
# the final answer.
outputs = pipeline(
    messages,
    max_new_tokens=49152,
)
print(outputs[0]["generated_text"][-1]["content"])
````
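Since the prompt instructs the model to place only the final solution in a ```python fence, a downstream harness typically pulls the last such block out of the generated text. A minimal sketch of that step (the `extract_final_code` helper is hypothetical, not part of the model card):

```python
import re

def extract_final_code(text: str) -> str | None:
    # Find every ```python fenced block and keep the last one, which the
    # prompt designates as the final solution.
    blocks = re.findall(r"```python\s*(.*?)```", text, flags=re.DOTALL)
    return blocks[-1].strip() if blocks else None

# Usage: code = extract_final_code(outputs[0]["generated_text"][-1]["content"])
```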
GGUF
Model size: 8B params
Architecture: qwen2

Available quantizations: 1-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
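A minimal sketch of loading one of these GGUF quantizations with llama-cpp-python, one of the open-source clients that supports this format. The filename and quantization level are assumptions; substitute whichever file you downloaded.

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Hypothetical filename; pick the quant that fits your hardware.
llm = Llama(
    model_path="OpenCodeReasoning-Nemotron-1.1-7B.Q4_K_M.gguf",
    n_ctx=32768,  # the model supports up to 64k; lower this to fit in RAM
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a Python function that reverses a string."}],
    max_tokens=1024,
)
print(out["choices"][0]["message"]["content"])
```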
