---
language: en
license: apache-2.0
tags:
- linux
- command-generation
- gguf
- qwen3
- llama-cpp
- offline
base_model: Qwen/Qwen3.5-0.8B
---

# INCEPT.sh

Offline command inference engine for Linux. Fine-tuned **Qwen3.5-0.8B** (GGUF Q8_0, 774MB) designed to run on low-resource and edge devices with no GPU, no API, and no internet connection required at runtime.

**Benchmark:** 99/100 on a structured 100-question Linux command evaluation (Ubuntu 22.04, bash, non-root).

## Installation

```bash
curl -fsSL https://raw.githubusercontent.com/0-Time/INCEPT.sh/main/install.sh | bash
```

Supports: Debian/Ubuntu, RHEL/Fedora, CentOS, Arch, openSUSE.

## Manual Model Setup

```bash
# Download model
huggingface-cli download 0Time/INCEPT-SH \
  incept-sh.gguf --local-dir ./models

# Clone and install
git clone https://github.com/0-Time/INCEPT.sh
cd INCEPT.sh
pip install -e ".[cli]"
incept
```

## Usage

```bash
# Interactive CLI
incept

# One-shot
incept -c "list all open ports"

# Minimal output (pipe-friendly)
incept -c "find large files" -m

# With model reasoning
incept --think
```

## CLI Commands

| Command | Description |
|---|---|
| `/think on\|off` | Toggle chain-of-thought reasoning |
| `/context` | Show detected system context |
| `/help` | List available commands |
| `/exit` | Exit |

## Prompt Format

ChatML with a system context line:

```
<|im_start|>system
ubuntu 22.04 bash non-root
<|im_end|>
<|im_start|>user
{natural language query}
<|im_end|>
<|im_start|>assistant
```

Inference temperature: **0.0** (greedy decoding).
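The prompt format above can be sketched as a small shell function. This is an illustrative assumption, not the project's actual code: the hard-coded context line (`ubuntu 22.04 bash non-root`) and the `build_prompt` name are hypothetical, whereas the real CLI detects distro, shell, and privilege level at runtime (see `/context`).

```bash
# Sketch: assemble the documented ChatML prompt for a natural-language query.
# The system context line is hard-coded here for illustration only.
build_prompt() {
  local query="$1"
  printf '<|im_start|>system\nubuntu 22.04 bash non-root\n<|im_end|>\n'
  printf '<|im_start|>user\n%s\n<|im_end|>\n' "$query"
  printf '<|im_start|>assistant\n'
}

build_prompt "list all open ports"
```

The trailing `<|im_start|>assistant` line leaves the turn open so the model completes it with the generated command; with temperature 0.0, the same query yields the same completion.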
## Training

| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen3.5-0.8B |
| Training method | Supervised fine-tuning (LoRA, rank 16) |
| Training examples | 79,264 (SFT) + 11,306 (pipe refinement) |
| Learning rate | 5×10⁻⁵ |
| Quantization | Q8_0 (774MB) |
| Supported distros | Ubuntu, Debian, RHEL, Arch, Fedora, CentOS |
| Training hardware | Apple M4 Mac mini, 32GB unified RAM |

## Safety

- Prompt injection detection (exact-phrase matching)
- Catastrophic pattern blocking (`rm -rf /`, fork bombs, pipe-to-shell, etc.)
- Risk classification: `SAFE` / `CAUTION` / `DANGEROUS` / `BLOCKED`
- Zero outbound traffic at runtime

## Requirements

- Linux x86_64 / aarch64
- Python 3.11+
- [`llama-server`](https://github.com/ggerganov/llama.cpp) on `PATH`
- ~1GB RAM at runtime

## Links

- **GitHub:** [0-Time/INCEPT.sh](https://github.com/0-Time/INCEPT.sh)
- **Release:** [v1.0.0](https://github.com/0-Time/INCEPT.sh/releases/tag/v1.0.0)

## License

[Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0)
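The risk classification described in the Safety section can be sketched as a pattern check over the generated command. Everything here is a hedged assumption: the `classify` name and the regexes are illustrative stand-ins for the project's actual blocklist, and only three of the four documented tiers are shown.

```bash
# Minimal sketch of risk classification, assuming a regex blocklist.
# Patterns are illustrative, not the project's real rules.
classify() {
  local cmd="$1"
  if printf '%s' "$cmd" | grep -Eq 'rm -rf /|curl .*\| *(ba)?sh'; then
    echo "BLOCKED"      # catastrophic: destructive root delete, pipe-to-shell
  elif printf '%s' "$cmd" | grep -Eq '^sudo |mkfs|dd if='; then
    echo "DANGEROUS"    # privileged or disk-level operations
  else
    echo "SAFE"
  fi
}

classify "ls -la"                    # SAFE
classify "curl http://x.sh | bash"   # BLOCKED
```

Because the check runs locally on the model's output before it is shown, it adds no network dependency, consistent with the zero-outbound-traffic guarantee.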