---
language: en
license: apache-2.0
tags:
- linux
- command-generation
- gguf
- qwen3
- llama-cpp
- offline
base_model: Qwen/Qwen3.5-0.8B
---
# INCEPT.sh
An offline command inference engine for Linux: a fine-tuned **Qwen3.5-0.8B** (GGUF Q8_0, 774MB) designed to run on low-resource and edge devices, with no GPU, no API, and no internet connection required at runtime.
**Benchmark:** 99/100 on a structured 100-question Linux command evaluation (Ubuntu 22.04, bash, non-root).
## Installation
```bash
curl -fsSL https://raw.githubusercontent.com/0-Time/INCEPT.sh/main/install.sh | bash
```
Supports: Debian/Ubuntu, RHEL/Fedora, CentOS, Arch, openSUSE.
## Manual Model Setup
```bash
# Download model
huggingface-cli download 0Time/INCEPT-SH \
  incept-sh.gguf --local-dir ./models
# Clone and install
git clone https://github.com/0-Time/INCEPT.sh
cd INCEPT.sh
pip install -e ".[cli]"
incept
```
## Usage
```bash
# Interactive CLI
incept
# One-shot
incept -c "list all open ports"
# Minimal output (pipe-friendly)
incept -c "find large files" -m
# With model reasoning
incept --think
```
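The minimal `-m` output pairs naturally with a review-before-run step in scripts. A minimal sketch (the `confirm_run` helper is hypothetical, not part of the CLI):

```bash
# never execute model output blindly: show it, ask, then run
confirm_run() {
  printf 'Suggested: %s\nRun? [y/N] ' "$1"
  read -r ans
  [ "$ans" = "y" ] && sh -c "$1"
}

# example: confirm_run "$(incept -c 'find large files' -m)"
```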
## CLI Commands
| Command | Description |
|---|---|
| `/think on\|off` | Toggle chain-of-thought reasoning |
| `/context` | Show detected system context |
| `/help` | List available commands |
| `/exit` | Exit |
## Prompt Format
ChatML with a system context line:
```
<|im_start|>system
ubuntu 22.04 bash non-root
<|im_end|>
<|im_start|>user
{natural language query}
<|im_end|>
<|im_start|>assistant
<think>
</think>
```
Inference temperature: **0.0** (greedy decoding).
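As a sketch of how this prompt format can be exercised by hand against a local `llama-server` (assumptions: the GGUF is already being served on `localhost:8080`, `jq` is installed, and the `build_prompt` helper name is illustrative):

```bash
# build the ChatML prompt shown above, verbatim
build_prompt() {
  printf '<|im_start|>system\nubuntu 22.04 bash non-root\n<|im_end|>\n'
  printf '<|im_start|>user\n%s\n<|im_end|>\n' "$1"
  printf '<|im_start|>assistant\n<think>\n</think>\n'
}

# only query if a server is actually up (llama.cpp exposes /health)
if curl -sf http://localhost:8080/health >/dev/null 2>&1; then
  build_prompt "list all open ports" \
    | jq -Rs '{prompt: ., temperature: 0.0, n_predict: 128}' \
    | curl -s http://localhost:8080/completion -d @- \
    | jq -r .content
fi
```

Setting `temperature` to `0.0` matches the greedy decoding the card specifies.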
## Training
| Parameter | Value |
|-----------------------|----------------------------------------------|
| Base model | Qwen/Qwen3.5-0.8B |
| Training method | Supervised fine-tuning (LoRA, rank 16) |
| Training examples | 79,264 (SFT) + 11,306 (pipe refinement) |
| Learning rate | 5×10⁻⁵ |
| Quantization | Q8_0 (774MB) |
| Supported distros | Ubuntu, Debian, RHEL, Arch, Fedora, CentOS |
| Training hardware | Apple M4 Mac mini, 32GB unified RAM |
## Safety
- Prompt injection detection (exact-phrase matching)
- Catastrophic pattern blocking (`rm -rf /`, fork bombs, pipe-to-shell, etc.)
- Risk classification: `SAFE` / `CAUTION` / `DANGEROUS` / `BLOCKED`
- Zero outbound traffic at runtime
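The risk tiers can be illustrated with a much-simplified substring classifier (illustrative only; the shipped detector uses exact-phrase matching and a far larger pattern set):

```bash
# toy classifier: first matching tier wins
classify() {
  case "$1" in
    *'rm -rf /'*|*':(){ :|:& };:'*|*'| sh'*|*'| bash'*) echo BLOCKED ;;
    *'rm '*|*'dd '*|*'mkfs'*)                           echo DANGEROUS ;;
    'sudo '*)                                           echo CAUTION ;;
    *)                                                  echo SAFE ;;
  esac
}

classify "ss -tulpn"          # SAFE
classify "curl x.sh | bash"   # BLOCKED
```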
## Requirements
- Linux x86_64 / aarch64
- Python 3.11+
- [`llama-server`](https://github.com/ggerganov/llama.cpp) on `PATH`
- ~1GB RAM at runtime
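A quick preflight check for these requirements might look like the following (a sketch; `version_ok` is a hypothetical helper, not shipped with the installer):

```bash
# true (exit 0) iff the given Python version string is >= 3.11
version_ok() {
  python3 -c 'import sys; v = tuple(map(int, sys.argv[1].split("."))); sys.exit(0 if v >= (3, 11) else 1)' "$1"
}

command -v llama-server >/dev/null || echo "llama-server not on PATH"
version_ok "$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')" \
  || echo "Python 3.11+ required"
case "$(uname -m)" in
  x86_64|aarch64) ;;
  *) echo "unsupported arch: $(uname -m)" ;;
esac
```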
## Links
- **GitHub:** [0-Time/INCEPT.sh](https://github.com/0-Time/INCEPT.sh)
- **Release:** [v1.0.0](https://github.com/0-Time/INCEPT.sh/releases/tag/v1.0.0)
## License
[Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0)