How to use from the Transformers library
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="nvidia/Nemotron-Terminal-32B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("nvidia/Nemotron-Terminal-32B")
model = AutoModelForCausalLM.from_pretrained("nvidia/Nemotron-Terminal-32B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Render the chat template and tokenize into model-ready tensors
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Generate a reply and decode only the newly generated tokens
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Nemotron-Terminal Model Family

Nemotron-Terminal is a family of models specialized for autonomous terminal interaction, fine-tuned from Qwen3 (8B, 14B, and 32B). Developed by NVIDIA, these models are trained on Nemotron-Terminal-Corpus, a large-scale open-source dataset for terminal tasks, and achieve performance that rivals frontier models many times their size.

Model Variants

We release the following variants of the Nemotron-Terminal family:

  • Nemotron-Terminal-8B
  • Nemotron-Terminal-14B
  • Nemotron-Terminal-32B

Performance on Terminal-Bench 2.0

The Nemotron-Terminal family improves substantially over its Qwen3 baselines at every model size, with gains across multiple specialized task categories.

Model                 | Size | Base Accuracy (Qwen3) | Nemotron-Terminal Accuracy
Nemotron-Terminal-8B  | 8B   | 2.47%                 | 13.0%
Nemotron-Terminal-14B | 14B  | 4.04%                 | 20.2%
Nemotron-Terminal-32B | 32B  | 3.37%                 | 27.4%
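
To make the size of these gains concrete, the short sketch below computes the absolute gain (in percentage points) and the ratio to the base accuracy for each model. The accuracy values are copied from the table above; the names `results` and `gains` are illustrative and not part of any released tooling.

```python
# Accuracy values (percent) copied from the Terminal-Bench 2.0 table:
# (Qwen3 base accuracy, Nemotron-Terminal accuracy)
results = {
    "Nemotron-Terminal-8B": (2.47, 13.0),
    "Nemotron-Terminal-14B": (4.04, 20.2),
    "Nemotron-Terminal-32B": (3.37, 27.4),
}


def gains(base: float, tuned: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, tuned / base ratio)."""
    return tuned - base, tuned / base


for name, (base, tuned) in results.items():
    absolute, ratio = gains(base, tuned)
    print(f"{name}: +{absolute:.2f} points, {ratio:.1f}x the base accuracy")
```

The ratio is only a rough summary: the base accuracies are small, so small absolute differences in the baseline move the ratio considerably.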

Usage

The models are trained with the Terminus 2 scaffolding and emit structured JSON. For evaluation on Terminal-Bench 2.0, we recommend using the same Terminus 2 scaffolding so that inference stays consistent with training.

Expected Output Format

{
  "analysis": "Analysis of the current terminal state...",
  "plan": "Step-by-step plan for the next command...",
  "commands": [
    {
      "keystrokes": "ls -la\n",
      "duration": 0.1
    }
  ],
  "task_complete": false
}
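
When the models are used outside Terminus 2, a caller has to parse and validate this JSON before acting on it. The following is a minimal sketch, not the Terminus 2 implementation: `Command`, `AgentStep`, and `parse_agent_step` are hypothetical names, the response is assumed to be a bare JSON object with no surrounding text, and `duration` is assumed to be a wait time in seconds.

```python
import json
from dataclasses import dataclass


@dataclass
class Command:
    keystrokes: str  # raw text to send to the terminal, e.g. "ls -la\n"
    duration: float  # assumed: seconds to wait after sending the keystrokes


@dataclass
class AgentStep:
    analysis: str
    plan: str
    commands: list[Command]
    task_complete: bool


def parse_agent_step(raw: str) -> AgentStep:
    """Parse one model response; raise ValueError if it does not match the format."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")

    missing = {"analysis", "plan", "commands", "task_complete"} - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if not isinstance(data["commands"], list):
        raise ValueError("'commands' must be a list")
    if not isinstance(data["task_complete"], bool):
        raise ValueError("'task_complete' must be a boolean")

    commands = []
    for item in data["commands"]:
        if not isinstance(item, dict) or not isinstance(item.get("keystrokes"), str):
            raise ValueError("each command needs a string 'keystrokes' field")
        duration = item.get("duration")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(duration, bool) or not isinstance(duration, (int, float)):
            raise ValueError("each command needs a numeric 'duration' field")
        commands.append(Command(item["keystrokes"], float(duration)))

    return AgentStep(str(data["analysis"]), str(data["plan"]), commands, data["task_complete"])
```

A caller would pass the decoded model output to `parse_agent_step` and treat a `ValueError` as a signal to re-prompt rather than execute anything.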

📜 Citation

If you use these models in your research, please cite the following work:

@misc{pi2026dataengineeringscalingllm,
      title={On Data Engineering for Scaling LLM Terminal Capabilities}, 
      author={Renjie Pi and Grace Lam and Mohammad Shoeybi and Pooya Jannaty and Bryan Catanzaro and Wei Ping},
      year={2026},
      eprint={2602.21193},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2602.21193}, 
}

Safetensors: model size 33B params, tensor type BF16