---
title: "IniClaw Inference Profiles — NVIDIA Cloud"
nav: "Inference Profiles"
description: "Configuration reference for NVIDIA cloud inference profiles."
keywords: ["iniclaw inference profiles", "iniclaw nvidia cloud provider"]
topics: ["generative_ai", "ai_agents"]
tags: ["openclaw", "openshell", "inference_routing", "llms"]
content:
  type: reference
  difficulty: intermediate
  audience: ["developer", "engineer"]
status: published
---
<!--
SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0
-->
# Inference Profiles
IniClaw ships with an inference profile defined in `blueprint.yaml`.
The profile configures an OpenShell inference provider and model route; the agent inside the sandbox uses whichever model is active, and inference requests are routed transparently through the OpenShell gateway.
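As a rough illustration, the profile stanza in `blueprint.yaml` might look like the sketch below. The field names (`inference`, `profile`, `provider`, `model`, `endpoint`, `credential_env`) are assumptions for illustration, not taken from the shipped file; the values match the `default` profile documented on this page:

```yaml
# Hypothetical excerpt from blueprint.yaml — field names are illustrative.
inference:
  profile: default
  provider: nvidia-nim
  model: nvidia/nemotron-3-super-120b-a12b
  endpoint: https://integrate.api.nvidia.com/v1
  credential_env: NVIDIA_API_KEY
```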
## Profile Summary
| Profile | Provider | Model | Endpoint | Use Case |
|---|---|---|---|---|
| `default` | NVIDIA cloud | `nvidia/nemotron-3-super-120b-a12b` | `integrate.api.nvidia.com` | Production. Requires an NVIDIA API key. |
## Available Models
The `nvidia-nim` provider registers the following models from [build.nvidia.com](https://build.nvidia.com):
| Model ID | Label | Context Window | Max Output |
|---|---|---|---|
| `nvidia/nemotron-3-super-120b-a12b` | Nemotron 3 Super 120B | 131,072 | 8,192 |
| `nvidia/llama-3.1-nemotron-ultra-253b-v1` | Nemotron Ultra 253B | 131,072 | 4,096 |
| `nvidia/llama-3.3-nemotron-super-49b-v1.5` | Nemotron Super 49B v1.5 | 131,072 | 4,096 |
| `nvidia/nemotron-3-nano-30b-a3b` | Nemotron 3 Nano 30B | 131,072 | 4,096 |
The default profile uses Nemotron 3 Super 120B.
You can switch to any model in the catalog at runtime.
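The context-window and max-output figures in the catalog above imply a per-model prompt budget. A small helper to compute it (the limits are copied from the table; the function and dictionary names are our own, not part of any IniClaw API):

```python
# Context window and max output tokens per model, from the catalog above.
MODEL_LIMITS = {
    "nvidia/nemotron-3-super-120b-a12b": (131_072, 8_192),
    "nvidia/llama-3.1-nemotron-ultra-253b-v1": (131_072, 4_096),
    "nvidia/llama-3.3-nemotron-super-49b-v1.5": (131_072, 4_096),
    "nvidia/nemotron-3-nano-30b-a3b": (131_072, 4_096),
}

def max_prompt_tokens(model_id: str) -> int:
    """Largest prompt that still leaves room for a full-length completion."""
    context_window, max_output = MODEL_LIMITS[model_id]
    return context_window - max_output
```

For the default model this leaves 131,072 − 8,192 = 122,880 tokens for the prompt.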
## `default` — NVIDIA Cloud
The default profile routes inference to NVIDIA's hosted API through [build.nvidia.com](https://build.nvidia.com).
- **Provider type:** `nvidia-nim`
- **Endpoint:** `https://integrate.api.nvidia.com/v1`
- **Model:** `nvidia/nemotron-3-super-120b-a12b`
- **Credential:** `NVIDIA_API_KEY` environment variable
Get an API key from [build.nvidia.com](https://build.nvidia.com).
The `iniclaw onboard` command prompts for this key and stores it in `~/.iniclaw/credentials.json`.
To pin the sandbox to this profile's model explicitly:

```console
$ openshell inference set --provider nvidia-nim --model nvidia/nemotron-3-super-120b-a12b
```
## Switching Models at Runtime
After the sandbox is running, switch models with the OpenShell CLI:
```console
$ openshell inference set --provider nvidia-nim --model <model-name>
```
The change takes effect immediately.
No sandbox restart is needed.