kcode-oss-20b-mxfp4 / README.md
icedmoca's picture
Create README.md
fdd64bb verified
---
license: mit
language:
- en
base_model:
- openai/gpt-oss-20b
tags:
- gguf
- code
- coding-agent
- conversational
- terminal-agent
- tool-use
- function-calling
- long-context
- context-memory
- agentic
- rust
- llama-cpp
- kcode
---
# kcode-oss-20b-mxfp4
> kcode-oss-20b-mxfp4 is a GGUF MXFP4 coding-agent model built on top of GPT-OSS 20B and optimized for terminal-native software engineering workflows, structured tool use, retrieval-grounded reasoning, and long-session coding tasks.
### The model is designed primarily for:
repository navigation
code editing and patch generation
shell-oriented workflows
structured tool calling
retrieval-backed context restoration
long-running agent sessions
Architecture
## Base architecture:
GPT-OSS 20B
Mixture-of-Experts (MoE)
MXFP4 quantization
131k context length
GGUF runtime format
## Model metadata:
24 transformer blocks
32 experts
4 active experts per token
GPT-4o tokenizer format
YaRN rope scaling
Intended Usage
## This model is intended to be paired with the Kcode runtime and orchestration layer:
exact-context replay
context vault references
dynamic tool schema expansion
persistent memory systems
multi-tool agent execution
### It performs best in iterative:
edit β†’ test β†’ repair
coding workflows.
Prompting
Example system prompt:
You are Kcode, a terminal-native coding agent.
Repository state:
<ctx ref="build_logs_14" />
## Task:
Fix the websocket reconnect logic without breaking auth refresh behavior.
Runtime Compatibility
## Optimized for:
llama.cpp
Ollama
OpenAI-compatible local servers
terminal coding agents
structured tool runtimes
Notes
## kcode-oss-20b-mxfp4 is optimized more heavily for:
coding workflows
orchestration stability
structured reasoning
retrieval-grounded operation
long-session memory behavior
than for:
roleplay
creative writing
unrestricted conversational chat
Runtime
# GitHub:
https://github.com/icedmoca/kcode