---
license: mit
language:
- en
base_model:
- openai/gpt-oss-20b
tags:
- gguf
- code
- coding-agent
- conversational
- terminal-agent
- tool-use
- function-calling
- long-context
- context-memory
- agentic
- rust
- llama-cpp
- kcode
---

# kcode-oss-20b-mxfp4

> kcode-oss-20b-mxfp4 is a GGUF MXFP4 coding-agent model built on top of GPT-OSS 20B and optimized for terminal-native software engineering workflows, structured tool use, retrieval-grounded reasoning, and long-session coding tasks.

### The model is designed primarily for:

repository navigation
code editing and patch generation
shell-oriented workflows
structured tool calling
retrieval-backed context restoration
long-running agent sessions
Architecture

## Base architecture:

GPT-OSS 20B
Mixture-of-Experts (MoE)
MXFP4 quantization
131k context length
GGUF runtime format

## Model metadata:

24 transformer blocks
32 experts
4 active experts per token
GPT-4o tokenizer format
YaRN rope scaling
Intended Usage

## This model is intended to be paired with the Kcode runtime and orchestration layer:

exact-context replay
context vault references
dynamic tool schema expansion
persistent memory systems
multi-tool agent execution

### It performs best in iterative:

edit → test → repair

coding workflows.

Prompting

Example system prompt:

You are Kcode, a terminal-native coding agent.

Repository state:
<ctx ref="build_logs_14" />

## Task:
Fix the websocket reconnect logic without breaking auth refresh behavior.
Runtime Compatibility

## Optimized for:

llama.cpp
Ollama
OpenAI-compatible local servers
terminal coding agents
structured tool runtimes
Notes

## kcode-oss-20b-mxfp4 is optimized more heavily for:

coding workflows
orchestration stability
structured reasoning
retrieval-grounded operation
long-session memory behavior

than for:

roleplay
creative writing
unrestricted conversational chat
Runtime

# GitHub:
https://github.com/icedmoca/kcode