---
title: "IniClaw Inference Profiles — NVIDIA Cloud"
nav: "Inference Profiles"
description: "Configuration reference for NVIDIA cloud inference profiles."
keywords: ["iniclaw inference profiles", "iniclaw nvidia cloud provider"]
topics: ["generative_ai", "ai_agents"]
tags: ["openclaw", "openshell", "inference_routing", "llms"]
content:
  type: reference
  difficulty: intermediate
  audience: ["developer", "engineer"]
status: published
---
<!--
SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0
-->
# Inference Profiles
IniClaw ships with an inference profile defined in `blueprint.yaml`.
The profile configures an OpenShell inference provider and model route; the agent inside the sandbox uses whichever model is active, and inference requests are routed transparently through the OpenShell gateway.
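As a rough illustration, the profile stanza in `blueprint.yaml` might look like the sketch below. The field names (`inference`, `profile`, `provider`, `model`, `endpoint`, `credential_env`) are assumptions for illustration, not taken from the shipped file; the values match the `default` profile documented on this page:

```yaml
# Hypothetical excerpt from blueprint.yaml — field names are illustrative.
inference:
  profile: default
  provider: nvidia-nim
  model: nvidia/nemotron-3-super-120b-a12b
  endpoint: https://integrate.api.nvidia.com/v1
  credential_env: NVIDIA_API_KEY
```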
## Profile Summary
| Profile | Provider | Model | Endpoint | Use Case |
|---|---|---|---|---|
| `default` | NVIDIA cloud | `nvidia/nemotron-3-super-120b-a12b` | `integrate.api.nvidia.com` | Production. Requires an NVIDIA API key. |
## Available Models
The `nvidia-nim` provider registers the following models from [build.nvidia.com](https://build.nvidia.com):
| Model ID | Label | Context Window | Max Output |
|---|---|---|---|
| `nvidia/nemotron-3-super-120b-a12b` | Nemotron 3 Super 120B | 131,072 | 8,192 |
| `nvidia/llama-3.1-nemotron-ultra-253b-v1` | Nemotron Ultra 253B | 131,072 | 4,096 |
| `nvidia/llama-3.3-nemotron-super-49b-v1.5` | Nemotron Super 49B v1.5 | 131,072 | 4,096 |
| `nvidia/nemotron-3-nano-30b-a3b` | Nemotron 3 Nano 30B | 131,072 | 4,096 |
The default profile uses Nemotron 3 Super 120B.
You can switch to any model in the catalog at runtime.
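The context-window and max-output figures in the catalog above imply a per-model prompt budget. A small helper to compute it (the limits are copied from the table; the function and dictionary names are our own, not part of any IniClaw API):

```python
# Context window and max output tokens per model, from the catalog above.
MODEL_LIMITS = {
    "nvidia/nemotron-3-super-120b-a12b": (131_072, 8_192),
    "nvidia/llama-3.1-nemotron-ultra-253b-v1": (131_072, 4_096),
    "nvidia/llama-3.3-nemotron-super-49b-v1.5": (131_072, 4_096),
    "nvidia/nemotron-3-nano-30b-a3b": (131_072, 4_096),
}

def max_prompt_tokens(model_id: str) -> int:
    """Largest prompt that still leaves room for a full-length completion."""
    context_window, max_output = MODEL_LIMITS[model_id]
    return context_window - max_output
```

For the default model this leaves 131,072 − 8,192 = 122,880 tokens for the prompt.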
## `default` — NVIDIA Cloud
The default profile routes inference to NVIDIA's hosted API through [build.nvidia.com](https://build.nvidia.com).
- **Provider type:** `nvidia-nim`
- **Endpoint:** `https://integrate.api.nvidia.com/v1`
- **Model:** `nvidia/nemotron-3-super-120b-a12b`
- **Credential:** `NVIDIA_API_KEY` environment variable
Get an API key from [build.nvidia.com](https://build.nvidia.com).
The `iniclaw onboard` command prompts for this key and stores it in `~/.iniclaw/credentials.json`.
To pin the sandbox to this profile's model explicitly:

```console
$ openshell inference set --provider nvidia-nim --model nvidia/nemotron-3-super-120b-a12b
```
## Switching Models at Runtime
After the sandbox is running, switch models with the OpenShell CLI:
```console
$ openshell inference set --provider nvidia-nim --model <model-name>
```
The change takes effect immediately.
No sandbox restart is needed.