Super-squash history to reclaim storage
- .gitattributes +73 -0
- OpenCodeReasoning-Nemotron-32B-bf16_q8_0.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-f16_q8_0.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-iq1_m.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-iq1_s.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-iq2_xs.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-iq2_xxs.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-q2_k_m.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-q2_k_s.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-q3_k_m.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-q3_k_s.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-q4_0.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-q4_1.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-q4_k_m.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-q4_k_s.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-q5_0.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-q5_1.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-q5_k_m.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-q5_k_s.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-q6_k_m.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B-q8_0.gguf +3 -0
- OpenCodeReasoning-Nemotron-32B.imatrix +3 -0
- README.md +202 -0
- bf16/OpenCodeReasoning-Nemotron-32B-bf16-00001-of-00002.gguf +3 -0
- bf16/OpenCodeReasoning-Nemotron-32B-bf16-00002-of-00002.gguf +3 -0
- f16/OpenCodeReasoning-Nemotron-32B-f16-00001-of-00002.gguf +3 -0
- f16/OpenCodeReasoning-Nemotron-32B-f16-00002-of-00002.gguf +3 -0
@@ -0,0 +1,73 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+f16/OpenCodeReasoning-Nemotron-32B-f16-00001-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+f16/OpenCodeReasoning-Nemotron-32B-f16-00002-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-f16_q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-bf16_q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-f16_q6_k.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-bf16_q6_k.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-f16_q4_k.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-bf16_q4_k.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q2_k_l.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q3_k_l.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q4_k_l.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q5_k_l.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q6_k_l.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q2_k_s.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q3_k_m.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q3_k_s.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q4_k_m.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q4_k_s.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q5_k_m.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q5_k_s.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q6_k_m.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q4_0.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q4_1.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q4_0_l.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q4_1_l.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q5_0.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q5_1.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q5_0_l.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q5_1_l.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-iq1_s.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-iq1_m.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-iq2_xs.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-iq2_xxs.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B.imatrix filter=lfs diff=lfs merge=lfs -text
+bf16/OpenCodeReasoning-Nemotron-32B-bf16-00001-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+bf16/OpenCodeReasoning-Nemotron-32B-bf16-00002-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+OpenCodeReasoning-Nemotron-32B-q2_k_m.gguf filter=lfs diff=lfs merge=lfs -text
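Each line in the `.gitattributes` hunk above pairs a path pattern with `filter=lfs diff=lfs merge=lfs -text`, which routes matching files through Git LFS. The sketch below approximates that lookup with Python's `fnmatch`. It is a simplification: real gitattributes matching follows gitignore-style rules that `fnmatch` does not fully reproduce (for example `**`), and the helper names `lfs_patterns` and `is_lfs_tracked` are illustrative, not part of any Git tooling.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A few lines copied from the hunk above, used as sample input.
ATTRS = """\
*.safetensors filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
OpenCodeReasoning-Nemotron-32B-q4_k_m.gguf filter=lfs diff=lfs merge=lfs -text
bf16/OpenCodeReasoning-Nemotron-32B-bf16-00001-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
"""

def lfs_patterns(text):
    """Return the patterns whose attribute list contains filter=lfs."""
    patterns = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and not parts[0].startswith("#") and "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns

def is_lfs_tracked(path, patterns):
    """Approximate match: patterns without '/' are tested against the file name only."""
    name = PurePosixPath(path).name
    for pattern in patterns:
        target = path if "/" in pattern else name
        if fnmatchcase(target, pattern):
            return True
    return False

patterns = lfs_patterns(ATTRS)
print(is_lfs_tracked("OpenCodeReasoning-Nemotron-32B-q4_k_m.gguf", patterns))
print(is_lfs_tracked("README.md", patterns))
```

Note that the hunk lists every `.gguf` file by name and contains no `*.gguf` wildcard, so under these patterns a newly added quantization would only be LFS-tracked once its own line is added.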
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f432cc2ef44f3bb1b81cc7bd6f688a272fa6226cf2651f9dfcc512b6f6655e20
+size 46661602304
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bb9f2cde469f8dc33c50ca9d05b93c3d256868c7c8c43d65fc1911893ccae0d5
+size 36280699904
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e996d801e0037195bea4623b9bcc43b1b0f1c3161dcef3dbe644d16e9d4033bf
+size 9742306592
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a921ec5f24446ba4d39d97b8b4952eac93bc1912f966e75c7ca1da7c7de836ef
+size 8992492832
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fa663d4683f1a1ec18cadda2925f82d0b2e3c9a7357d5fefb9200e96150c4a34
+size 11016326432
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d236194cde966eb4b88f71035dad3ed4acc578abb279995f0b8c952577f179ab
+size 10124954912
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:65534d66507efcf0905478eb98f2778c3a4a49a97a26d21ef6f3cd2bd920a349
+size 12573011232
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7842c97c1639f8a320fbe24ded3c2c63e577c51b3edc8cfa2eeb0c0041168e9a
+size 11547413792
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4ca41bdfd0ef7f43c3adf17dd177563062163e7c64c0841913772a4d38e0d5c0
+size 16091393312
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9d5a266830d98030ef046fc9306df92d49b9347bd7d84a7bea758e3dbdd4269c
+size 14735658272
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:031d643e792d91dc3bd57e5d828624e4c94252b794d30db37bd2209a3565baeb
+size 18439507232
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:43c290d9981ca1cc0c0b418ef838fccc6e7c29b1f40eaf6c49d46ff62eac0291
+size 20487179552
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0424f84d4c64113eb27e14dd899a69cff903624f9dcd1afba94978ecc10d2cdd
+size 19824979232
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a4cdc67db6e4d9e52d9ab9c956c581dc54a646d3c2a526914e15c0c5fdb81635
+size 19161427232
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a0efcd01db274e9e45972b47196711a7895d3a5d7a717fd4901be66953a8f13e
+size 22534851872
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e581a16c7f0a836dbed52090703ac2ac24c1fb27e020dff6df7d054d2e16c114
+size 24582524192
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4a7c51904a4c6b978654e7b0688b5c8c23b2f08eb16ec842b69cb35b3597404a
+size 23414304032
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a1157919e054d63d5a2ab406cfa4ba2bf42445f89645f6b16b239872977f208e
+size 22741658912
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b6da985756d6449568025ed8f694af9ba16b848b60f56e61ae251b1f2e463f5c
+size 26886155552
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d2ac9e023e6c54f4fc6890678bacbe826dabc358925902208957e02b3459cc28
+size 34820885504
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:81195c1a872922ad45359cfbaa038d9af32e74c78bad6b8e544a6ccd37a2cad0
+size 14957098
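Each three-line hunk above is a Git LFS pointer rather than the model weights themselves: `version` names the pointer spec, `oid` gives the SHA-256 of the real file, and `size` is its length in bytes. A minimal sketch of reading one, using the first pointer in this diff as sample input; `parse_lfs_pointer` is a hypothetical helper, not part of the Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a small dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }

# First pointer from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f432cc2ef44f3bb1b81cc7bd6f688a272fa6226cf2651f9dfcc512b6f6655e20
size 46661602304
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], len(info["digest"]))
print(f"{info['size_bytes'] / 1e9:.1f} GB")
```

Comparing `size_bytes` against available disk space or memory before downloading is one practical use of these values.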
@@ -0,0 +1,202 @@
+---
+base_model:
+- Qwen/Qwen2.5-32B-Instruct
+datasets:
+- nvidia/OpenCodeReasoning
+language:
+- en
+library_name: transformers
+license: apache-2.0
+tags:
+- nvidia
+- code
+pipeline_tag: text-generation
+---
+
+# OpenCodeReasoning-Nemotron-32B Overview
+
+## Description: <br>
+OpenCodeReasoning-Nemotron-32B is a large language model (LLM) derived from Qwen2.5-32B-Instruct (AKA the reference model). It is a reasoning model post-trained for code generation. The model supports a context length of 32K tokens. <br>
+
+This model is ready for commercial/non-commercial use. <br>
+
+
+
+
+## Results from [OpenCodeReasoning](https://arxiv.org/abs/2504.01943)
+
+The results below are averaged over **64 evaluations** on each benchmark.
+
+| Model | LiveCodeBench Avg. | CodeContest All |
+|------------------------|--------------------|-----------------|
+| DeepSeek-R1 | 65.6 | 26.2 |
+| QwQ-32B | 61.3 | 20.2 |
+| | | |
+| **Distilled 7B+ Models** | | |
+| | | |
+| Bespoke-Stratos-7B | 14.7 | 2.0 |
+| OpenThinker-7B | 25.5 | 5.0 |
+| R1-Distill-Qwen-7B | 38.0 | 11.1 |
+| OlympicCoder-7B | 40.9 | 10.6 |
+| **OCR-Qwen-7B** | **48.5** | **16.3** |
+| **OCR-Qwen-7B-Instruct** | **51.3** | **18.1** |
+| | | |
+| **Distilled 14B+ Models**| | |
+| | | |
+| R1-Distill-Qwen-14B | 51.3 | 17.6 |
+| **OCR-Qwen-14B** | **57.7** | **22.6** |
+| **OCR-Qwen-14B-Instruct**| **59.4** | **23.6** |
+| | | |
+| **Distilled 32B+ Models**| | |
+| | | |
+| Bespoke-Stratos-32B | 30.1 | 6.3 |
+| OpenThinker-32B | 54.1 | 16.4 |
+| R1-Distill-Qwen-32B | 58.1 | 18.3 |
+| OlympicCoder-32B | 57.4 | 18.0 |
+| **OCR-Qwen-32B** | **61.8** | **24.6** |
+| **OCR-Qwen-32B-Instruct**| **61.7** | **24.4** |
+
+## Reproducing our results
+
+* [Models](https://huggingface.co/collections/nvidia/opencodereasoning-2-68168f37cd7c6beb1e3f92e7)
+* [Dataset](https://huggingface.co/datasets/nvidia/OpenCodeReasoning)
+* [Paper](https://arxiv.org/abs/2504.01943)
+
+
+## How to use the models?
+
+To run inference on coding problems:
+
+````python
+import transformers
+import torch
+
+model_id = "nvidia/OpenCodeReasoning-Nemotron-32B"
+
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model_id,
+    model_kwargs={"torch_dtype": torch.bfloat16},
+    device_map="auto",
+)
+
+prompt = """You are a helpful and harmless assistant. You should think step-by-step before responding to the instruction below.
+
+Please use python programming language only.
+
+You must use ```python for just the final solution code block with the following format:
+```python
+# Your code here
+```
+
+{user}
+"""
+
+messages = [
+    {
+        "role": "user",
+        "content": prompt.format(user="Write a program to calculate the sum of the first $N$ fibonacci numbers")},
+]
+
+outputs = pipeline(
+    messages,
+    max_new_tokens=32768,
+)
+print(outputs[0]["generated_text"][-1]['content'])
+
+````
+
+
+
+## Citation
+
+If you find the data useful, please cite:
+```
+@article{ahmad2025opencodereasoning,
+title={OpenCodeReasoning: Advancing Data Distillation for Competitive Coding},
+author={Wasi Uddin Ahmad, Sean Narenthiran, Somshubra Majumdar, Aleksander Ficek, Siddhartha Jain, Jocelyn Huang, Vahid Noroozi, Boris Ginsburg},
+year={2025},
+eprint={2504.01943},
+archivePrefix={arXiv},
+primaryClass={cs.CL},
+url={https://arxiv.org/abs/2504.01943},
+}
+```
+
+## Additional Information
+
+## Model Architecture: <br>
+Architecture Type: Dense decoder-only Transformer model
+Network Architecture: Qwen-32B-Instruct
+<br>
+**This model was developed based on Qwen2.5-32B-Instruct and has 32B model parameters. <br>**
+**OpenCodeReasoning-Nemotron-32B was developed based on Qwen2.5-32B-Instruct and has 32B model parameters. <br>**
+
+## Input: <br>
+**Input Type(s):** Text <br>
+**Input Format(s):** String <br>
+**Input Parameters:** One-Dimensional (1D) <br>
+**Other Properties Related to Input:** Context length up to 32,768 tokens <br>
+
+## Output: <br>
+**Output Type(s):** Text <br>
+**Output Format:** String <br>
+**Output Parameters:** One-Dimensional (1D) <br>
+**Other Properties Related to Output:** Context length up to 32,768 tokens <br>
+
+Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA’s hardware (e.g., GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions. <br>
+
+## Software Integration: <br>
+* Runtime Engine: NeMo 2.3.0 <br>
+* Recommended Hardware Microarchitecture Compatibility: <br>
+NVIDIA Ampere <br>
+NVIDIA Hopper <br>
+* Preferred/Supported Operating System(s): Linux <br>
+
+## Model Version(s):
+1.0 (4/25/2025) <br>
+OpenCodeReasoning-Nemotron-7B<br>
+OpenCodeReasoning-Nemotron-14B<br>
+OpenCodeReasoning-Nemotron-32B<br>
+OpenCodeReasoning-Nemotron-32B-IOI<br>
+
+
+# Training and Evaluation Datasets: <br>
+
+## Training Dataset:
+
+The training corpus for OpenCodeReasoning-Nemotron-32B is the [OpenCodeReasoning](https://huggingface.co/datasets/nvidia/OpenCodeReasoning) dataset, which is composed of competitive programming questions and DeepSeek-R1 generated responses.
+
+Data Collection Method: Hybrid: Automated, Human, Synthetic <br>
+Labeling Method: Hybrid: Automated, Human, Synthetic <br>
+Properties: 736k samples from OpenCodeReasoning (https://huggingface.co/datasets/nvidia/OpenCodeReasoning)
+
+## Evaluation Dataset:
+We used the datasets listed in the next section to evaluate OpenCodeReasoning-Nemotron-32B. <br>
+**Data Collection Method: Hybrid: Automated, Human, Synthetic <br>**
+**Labeling Method: Hybrid: Automated, Human, Synthetic <br>**
+
+### License/Terms of Use: <br>
+GOVERNING TERMS: Use of this model is governed by [Apache 2.0](https://huggingface.co/nvidia/OpenCode-Nemotron-2-7B/blob/main/LICENSE).
+
+### Deployment Geography:
+Global<br>
+
+### Use Case: <br>
+This model is intended for developers and researchers building LLMs. <br>
+
+### Release Date: <br>
+Hugging Face [04/25/2025] via https://huggingface.co/nvidia/OpenCodeReasoning-Nemotron-32B/ <br>
+
+## Reference(s):
+[2504.01943] OpenCodeReasoning: Advancing Data Distillation for Competitive Coding
+<br>
+
+## Inference:
+**Engine:** vLLM <br>
+**Test Hardware:** NVIDIA H100-80GB <br>
+
+## Ethical Considerations:
+NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
+
+Please report security vulnerabilities or NVIDIA AI Concerns here.
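The usage snippet in the README above instructs the model to think step by step and to put only its final solution in a `python`-tagged fenced block. A minimal sketch of pulling that block out of a response follows. It assumes the response may contain more than one fenced block and that the last one is the final answer, which the README does not guarantee; `extract_final_solution` is a hypothetical helper name.

```python
import re

FENCE = "`" * 3  # three backticks, built this way so the example nests inside Markdown

# Non-greedy match of a python-tagged fenced block; DOTALL lets '.' span newlines.
BLOCK_RE = re.compile(FENCE + r"python[ \t]*\n(.*?)" + FENCE, re.DOTALL)

def extract_final_solution(response):
    """Return the body of the last python-fenced block, or None if there is none."""
    blocks = BLOCK_RE.findall(response)
    return blocks[-1].strip() if blocks else None

# Hand-written stand-in for a model response (not real model output).
sample = (
    "Let me think step by step...\n"
    + FENCE + "python\nprint('draft')\n" + FENCE + "\n"
    + "Final answer:\n"
    + FENCE + "python\nn = int(input())\nprint(n)\n" + FENCE + "\n"
)
print(extract_final_solution(sample))
```

With the README's pipeline call, the string to pass in would be `outputs[0]["generated_text"][-1]['content']`.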
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:05e5f6c1413c33f0c42aad1b5321d99478281085d2b5638e53c11d9554c6a11b
+size 45902462976
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:76c9405f5682d2eb9ecda68aa149d3c92d57f7162d200f2c922d70b90e313cdc
+size 19633507328
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:72eaa4877d3015fcce0c0e384b336727656fa4d5b8983f924ed15df89f437247
+size 45902462976
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bd2577f50faa451c7ee1f826d6ae8cd0e9652acdabb90fa55c0de7b8df9d4933
+size 19633507328
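The last four pointers come in two pairs with identical sizes, which is consistent with the two-part `bf16/` and `f16/` files in the file list. That mapping is an inference from ordering and size, not something the diff states. Under that assumption, a quick sketch of the total download for one split set:

```python
# Sizes in bytes copied from the last pointer hunks; that they are the two
# shards of one split GGUF is an assumption based on the file list order.
SHARD_SIZES = [45_902_462_976, 19_633_507_328]

def total_size(sizes):
    """Sum shard sizes and return bytes, decimal GB, and binary GiB."""
    total = sum(sizes)
    return total, total / 1e9, total / 2**30

total_bytes, gb, gib = total_size(SHARD_SIZES)
print(f"{total_bytes} bytes = {gb:.1f} GB = {gib:.1f} GiB")
```

The decimal and binary figures differ by several percent at this scale, so it helps to be explicit about which unit a disk or memory budget is quoted in.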