---
language: en
license: apache-2.0
tags:
- linux
- command-generation
- gguf
- qwen3
- llama-cpp
- offline
base_model: Qwen/Qwen3.5-0.8B
---
# INCEPT.sh
An offline command inference engine for Linux: a fine-tuned **Qwen3.5-0.8B** (GGUF Q8_0, 774MB) designed to run on low-resource and edge devices, with no GPU, no API, and no internet connection required at runtime.
**Benchmark:** 99/100 on a structured 100-question Linux command evaluation (Ubuntu 22.04, bash, non-root).
## Installation
```bash
curl -fsSL https://raw.githubusercontent.com/0-Time/INCEPT.sh/main/install.sh | bash
```
Supports: Debian/Ubuntu, RHEL/Fedora, CentOS, Arch, openSUSE.
## Manual Model Setup
```bash
# Download model
huggingface-cli download 0Time/INCEPT-SH \
  incept-sh.gguf --local-dir ./models
# Clone and install
git clone https://github.com/0-Time/INCEPT.sh
cd INCEPT.sh
pip install -e ".[cli]"
incept
```
## Usage
```bash
# Interactive CLI
incept
# One-shot
incept -c "list all open ports"
# Minimal output (pipe-friendly)
incept -c "find large files" -m
# With model reasoning
incept --think
```
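The minimal `-m` output pairs naturally with a review-before-run step in scripts. A minimal sketch (the `confirm_run` helper is hypothetical, not part of the CLI):

```bash
# never execute model output blindly: show it, ask, then run
confirm_run() {
  printf 'Suggested: %s\nRun? [y/N] ' "$1"
  read -r ans
  [ "$ans" = "y" ] && sh -c "$1"
}

# example: confirm_run "$(incept -c 'find large files' -m)"
```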
## CLI Commands
| Command | Description |
|---|---|
| `/think on\|off` | Toggle chain-of-thought reasoning |
| `/context` | Show detected system context |
| `/help` | List available commands |
| `/exit` | Exit |
## Prompt Format
ChatML with a system context line:
```
<|im_start|>system
ubuntu 22.04 bash non-root
<|im_end|>
<|im_start|>user
{natural language query}
<|im_end|>
<|im_start|>assistant
<think>
</think>
```
Inference temperature: **0.0** (greedy decoding).
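As a sketch of how this prompt format can be exercised by hand against a local `llama-server` (assumptions: the GGUF is already being served on `localhost:8080`, `jq` is installed, and the `build_prompt` helper name is illustrative):

```bash
# build the ChatML prompt shown above, verbatim
build_prompt() {
  printf '<|im_start|>system\nubuntu 22.04 bash non-root\n<|im_end|>\n'
  printf '<|im_start|>user\n%s\n<|im_end|>\n' "$1"
  printf '<|im_start|>assistant\n<think>\n</think>\n'
}

# only query if a server is actually up (llama.cpp exposes /health)
if curl -sf http://localhost:8080/health >/dev/null 2>&1; then
  build_prompt "list all open ports" \
    | jq -Rs '{prompt: ., temperature: 0.0, n_predict: 128}' \
    | curl -s http://localhost:8080/completion -d @- \
    | jq -r .content
fi
```

Setting `temperature` to `0.0` matches the greedy decoding the card specifies.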
## Training
| Parameter | Value |
|-----------------------|----------------------------------------------|
| Base model | Qwen/Qwen3.5-0.8B |
| Training method | Supervised fine-tuning (LoRA, rank 16) |
| Training examples | 79,264 (SFT) + 11,306 (pipe refinement) |
| Learning rate | 5×10⁻⁵ |
| Quantization | Q8_0 (774MB) |
| Supported distros | Ubuntu, Debian, RHEL, Arch, Fedora, CentOS |
| Training hardware | Apple M4 Mac mini, 32GB unified RAM |
## Safety
- Prompt injection detection (exact-phrase matching)
- Catastrophic pattern blocking (`rm -rf /`, fork bombs, pipe-to-shell, etc.)
- Risk classification: `SAFE` / `CAUTION` / `DANGEROUS` / `BLOCKED`
- Zero outbound traffic at runtime
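The risk tiers can be illustrated with a much-simplified substring classifier (illustrative only; the shipped detector uses exact-phrase matching and a far larger pattern set):

```bash
# toy classifier: first matching tier wins
classify() {
  case "$1" in
    *'rm -rf /'*|*':(){ :|:& };:'*|*'| sh'*|*'| bash'*) echo BLOCKED ;;
    *'rm '*|*'dd '*|*'mkfs'*)                           echo DANGEROUS ;;
    'sudo '*)                                           echo CAUTION ;;
    *)                                                  echo SAFE ;;
  esac
}

classify "ss -tulpn"          # SAFE
classify "curl x.sh | bash"   # BLOCKED
```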
## Requirements
- Linux x86_64 / aarch64
- Python 3.11+
- [`llama-server`](https://github.com/ggerganov/llama.cpp) on `PATH`
- ~1GB RAM at runtime
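A quick preflight check for these requirements might look like the following (a sketch; `version_ok` is a hypothetical helper, not shipped with the installer):

```bash
# true (exit 0) iff the given Python version string is >= 3.11
version_ok() {
  python3 -c 'import sys; v = tuple(map(int, sys.argv[1].split("."))); sys.exit(0 if v >= (3, 11) else 1)' "$1"
}

command -v llama-server >/dev/null || echo "llama-server not on PATH"
version_ok "$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')" \
  || echo "Python 3.11+ required"
case "$(uname -m)" in
  x86_64|aarch64) ;;
  *) echo "unsupported arch: $(uname -m)" ;;
esac
```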
## Links
- **GitHub:** [0-Time/INCEPT.sh](https://github.com/0-Time/INCEPT.sh)
- **Release:** [v1.0.0](https://github.com/0-Time/INCEPT.sh/releases/tag/v1.0.0)
## License
[Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0)