XCT-Qwen3-4B

Model Summary

XCT-Qwen3-4B is an execution-oriented, quantized language model designed to operate under the XCT Protocol β€” an architectural approach that enforces sovereignty inversion, determinism, and explicit authorization.

This model is not intended for conversational, creative, or exploratory use.
Its purpose is correct behavior under explicit instruction, and non-action otherwise.

It is designed as a component within a controlled system, not as an autonomous actor.


Context: What Is XCT?

XCT is not a prompt style.
It is not an agent framework layered on top of a general-purpose model.

XCT is a protocol for integrating language models into real systems without granting them executive authority.

In an XCT system:

  • The system owns state
  • The system owns execution
  • The system owns tools
  • The model only proposes

The model participates under constraint.
Authority remains external by design.

Execution Example (Kubernetes)

XCT has been tested in real Kubernetes environments, demonstrating the complete flow: Model Proposal β†’ System Validation β†’ Tool Execution β†’ State Persistence

See the XCT repository for execution demos and examples.
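The validation gate between proposal and execution can be sketched as a system-side allowlist check. The function name and command set below are illustrative, not part of the XCT specification:

```python
# Hypothetical system-side gate: the model proposes a command string,
# the system decides whether it may run. Names here are invented.
ALLOWED_VERBS = {"get", "describe", "logs"}  # read-only kubectl verbs

def validate_proposal(proposal: str) -> bool:
    """Accept only single, explicitly allowed kubectl commands."""
    parts = proposal.split()
    if len(parts) < 2 or parts[0] != "kubectl":
        return False
    return parts[1] in ALLOWED_VERBS

# The system, not the model, executes accepted proposals.
print(validate_proposal("kubectl get pods -n default"))   # True
print(validate_proposal("kubectl delete pod web-0"))      # False
```

The model's output never reaches the cluster directly; only proposals that pass this gate are handed to the tool layer.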


Model Details

  • Developed by: Tech Tweakers
  • Model name: XCT-Qwen3-4B
  • Base model: Qwen3-4B
  • Architecture: Decoder-only Transformer
  • Parameter count: ~4B
  • Quantization: Q2, Q5
  • Precision: Quantized inference
  • License: Apache 2.0

This is not a stylistic fine-tune.
It is a behavioral specialization aligned with a strict execution protocol.

Intended Use

In Scope

  • Deterministic execution agents
  • Infrastructure orchestration
  • CI/CD and deployment automation
  • Tool-driven pipelines
  • Compliance-sensitive environments
  • Sovereignty-inverted AI systems

Out of Scope

  • Conversational assistants
  • Creative or generative writing
  • Roleplay or improvisation
  • Emotional or social interaction
  • Autonomous decision-making

In XCT systems, absence of instruction implies absence of permission.

Protocol Alignment

This model adheres to the following XCT principles:

  • Determinism takes precedence over creativity
  • One step per iteration
  • No tool invocation without explicit instruction
  • Tool outputs are authoritative
  • Ambiguity resolves to minimal action
  • Errors are treated as control signals
  • The system may veto any proposal

The model does not self-authorize.
The model does not infer intent.
The model does not speculate beyond instruction.
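Two of these rules (one step per iteration, no tool invocation without explicit instruction) can be sketched as a system-side veto check. The proposal shape and function names are hypothetical, chosen for illustration:

```python
# Hypothetical proposal check enforcing two of the rules above:
# one step per iteration, and no unauthorized tool invocation.
def system_veto(proposal: dict, authorized_tools: set) -> bool:
    """Return True if the system vetoes (rejects) the proposal."""
    steps = proposal.get("steps", [])
    if len(steps) != 1:                      # one step per iteration
        return True
    tool = steps[0].get("tool")
    if tool is not None and tool not in authorized_tools:
        return True                          # tool not explicitly authorized
    return False

print(system_veto({"steps": [{"tool": "kubectl"}]}, {"kubectl"}))  # False
print(system_veto({"steps": [{"tool": "rm"}]}, {"kubectl"}))       # True
print(system_veto({"steps": [{}, {}]}, {"kubectl"}))               # True
```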

Execution Model Overview

The XCT execution loop is intentionally simple:

  1. The system provides explicit context and instruction
  2. The model proposes a response or action
  3. The system validates the proposal
  4. The system executes or rejects
  5. System state remains external to the model

The model never mutates external state directly.
It operates strictly as a constrained proposer.
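The five steps above can be sketched as a single loop iteration. `propose`, `validate`, and `run_tool` are placeholders standing in for the model call, the validation gate, and the tool layer; they are not a real XCT API:

```python
# Minimal sketch of one XCT iteration under the stated assumptions:
# the model only proposes; the system validates, executes, and owns state.
def xct_step(state: dict, instruction: str, propose, validate, run_tool) -> dict:
    proposal = propose(state, instruction)      # 2. model proposes one step
    if not validate(proposal):                  # 3. system validates
        return state                            # 4. rejected: nothing happens
    result = run_tool(proposal)                 # 4. system executes the tool
    return {**state, "last_result": result}     # 5. state stays external

# Toy run: a proposer that echoes the instruction, a validator that
# only accepts "noop", and a tool stub that records execution.
new_state = xct_step(
    {}, "noop",
    propose=lambda s, i: i,
    validate=lambda p: p == "noop",
    run_tool=lambda p: f"ran {p}",
)
print(new_state)  # {'last_result': 'ran noop'}
```

Note that the model never receives a reference to `state` it can mutate; it only reads context and returns a proposal.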


Training & Adaptation

  • Base weights: Qwen3-4B
  • Adaptation focus:
    • Instruction parsing discipline
    • Rule adherence
    • Correct refusal behavior
    • Non-speculative output
    • Tool invocation restraint

No effort was made to optimize for:

  • Creativity
  • Verbosity
  • Conversational helpfulness
  • Social alignment

These characteristics are intentionally deprioritized in XCT systems.

Evaluation Philosophy

This model is not evaluated using traditional language benchmarks such as MMLU, BLEU, or preference-based metrics.

Evaluation is operational rather than aesthetic:

  • Stability of instruction adherence
  • Output determinism under identical inputs
  • Correct refusal under ambiguity
  • Tool discipline
  • Protocol compliance

Low performance on creativity-oriented benchmarks is expected and acceptable.
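One of these checks, output determinism under identical inputs, can be sketched operationally: run the same prompt several times and require byte-identical outputs (this assumes greedy decoding; `generate_fn` is a stand-in for any model call, stubbed here so the sketch is self-contained):

```python
# Operational determinism check: identical inputs must yield identical outputs.
def is_deterministic(generate_fn, prompt: str, runs: int = 3) -> bool:
    outputs = [generate_fn(prompt) for _ in range(runs)]
    return len(set(outputs)) == 1

stub = lambda p: p.upper()  # deterministic stand-in for a model call
print(is_deterministic(stub, "list pods"))  # True
```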

Limitations

  • Reduced creative reasoning by design
  • Conservative behavior under incomplete instruction
  • Not optimized for long-form prose or dialogue
  • Quantization may slightly affect deep reasoning capacity

The model prioritizes inaction over unsupported inference.

Ethical & Safety Considerations

XCT-Qwen3-4B restricts model autonomy as a structural safety measure, reducing:

  • Unauthorized execution
  • Implicit decision-making
  • Speculative behavior
  • Tool misuse

Responsibility for system outcomes lies with the system architect, not the model.

This approach emphasizes safety through architecture rather than post-hoc alignment.


Usage Example

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "tech-tweakers/XCT-Qwen3-4B"
)

model = AutoModelForCausalLM.from_pretrained(
    "tech-tweakers/XCT-Qwen3-4B",
    device_map="auto"
)

# System prompt establishing the XCT execution contract
prompt = """
You are Polaris XCT Executor.

Environment is trusted and stable.
Do not act without explicit instruction.
One step per iteration.
"""

# Move inputs to the model's device before generating
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=False  # greedy decoding for reproducible output
)

print(tokenizer.decode(outputs[0], skip_special_tokens=False))

Relationship to Other Agent Paradigms

Compared to traditional autonomous or MCP-style agent frameworks:

  • XCT does not embed execution authority in the model
  • XCT treats errors as control signals
  • XCT enforces explicit system veto
  • XCT separates reasoning from execution

This model reflects that architectural philosophy in its behavior.


Learn More About XCT

For the complete XCT protocol specification, philosophy, and reference implementations:

πŸ‘‰ XCT Protocol on GitHub

Polaris-Core: Production XCT Engine

XCT-Qwen3-4B is designed to run with Polaris-Core, an ultra-optimized C++ binding for llama.cpp that implements deterministic execution.

Why Polaris-Core?

  • 55% token savings through essentialized chat templates
  • Deterministic execution with JSON early-stop
  • Intelligent batch backoff with automatic retry
  • GIL-aware threading for Python integration
  • Streaming callbacks for real-time output
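The JSON early-stop idea, halting generation as soon as one complete top-level JSON object has been streamed, can be illustrated with a brace-depth scanner. This is a sketch of the concept, not Polaris-Core's actual implementation:

```python
# Detect when a token stream contains one balanced top-level JSON object,
# ignoring braces that appear inside string literals.
def json_complete(text: str) -> bool:
    depth, started, in_str, esc = 0, False, False, False
    for ch in text:
        if esc:
            esc = False
            continue
        if ch == "\\" and in_str:
            esc = True
        elif ch == '"':
            in_str = not in_str
        elif not in_str:
            if ch == "{":
                depth += 1
                started = True
            elif ch == "}":
                depth -= 1
                if started and depth == 0:
                    return True
    return False

print(json_complete('{"tool": "kubectl", "args": ["get", "pods"]}'))  # True
print(json_complete('{"tool": "kubectl"'))                            # False
```

An engine can run a check like this after each decoded token and stop generation early, saving the tokens a model would otherwise spend after the closing brace.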

Quick Start

import polaris_core as pc

# Create engine
eng = pc.Engine(
    model_path="XCT-Qwen3-4B-Q5.gguf",
    n_ctx=4096,
    n_gpu_layers=-1
)

# Execute with deterministic output
result = eng.generate(
    prompt="List all available tools",
    system_prompt="You are XCT executor",
    n_predict=256,
    temperature=0.2,
    top_p=0.9,
    repeat_penalty=1.1
)

print(result)

Full Documentation

πŸ‘‰ Polaris-Core Repository

  • Complete build instructions
  • Deployment guides
  • Performance benchmarks
  • Reference implementations

πŸ“‹ Changelog

[v0.1.2] β€” 2026-02-24

  • Dataset curated from 501 β†’ 238 examples (βˆ’52%) β€” focus on reasoning over tool catalog
  • Removed all 218 xct.tools examples β€” tool knowledge now injected via system prompt at runtime
  • Protocol trimmed from 81 β†’ 63 examples (βˆ’22%), workflows from 66 β†’ 39 (βˆ’41%)
  • Philosophy preserved intact (136 examples, now 57% of dataset)
  • Training: LoRA r=8, 5 epochs, 150 steps, final loss 2.81
  • Artifacts: Q5_K (2.7GB)

[v0.1.1] β€” 2026-01-12

  • +81 XCT Protocol examples β€” teaches the model to correctly follow the next_step/done loop
  • +20 complete workflows β€” full end-to-end iterations with error recovery
  • Malformed JSON fixed (line 310 of the base dataset)
  • Total: 501 examples / 168KB (previously 400 / 120KB)

[v0.1.0] β€” Initial Release - 2025-08-01

  • Base dataset with 218 tool examples and 136 XCT philosophy examples
  • JSONL format with input / output / _topic schema
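A record in the input / output / _topic schema looks like the sketch below; the field values are invented for illustration, only the three keys come from the changelog:

```python
import json

# Illustrative JSONL training record; values are hypothetical.
record = {
    "input": "List the pods in namespace default.",
    "output": '{"tool": "kubectl", "args": ["get", "pods", "-n", "default"]}',
    "_topic": "xct.protocol",
}
line = json.dumps(record)          # one record per line in the .jsonl file
parsed = json.loads(line)
print(sorted(parsed.keys()))       # ['_topic', 'input', 'output']
```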

πŸ“„ Full history in CHANGELOG.md


Final Note

This model is designed to operate quietly within constrained systems.

It acts only under explicit instruction. When instruction is absent or ambiguous, it waits.

This behavior is intentional and consistent with the design goals of XCT.
