Instructions for using fableforge-ai/ShellWhisperer-1.5B with libraries, inference providers, notebooks, and local apps.

Libraries

How to use fableforge-ai/ShellWhisperer-1.5B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="fableforge-ai/ShellWhisperer-1.5B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("fableforge-ai/ShellWhisperer-1.5B")
model = AutoModelForCausalLM.from_pretrained("fableforge-ai/ShellWhisperer-1.5B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use fableforge-ai/ShellWhisperer-1.5B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "fableforge-ai/ShellWhisperer-1.5B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "fableforge-ai/ShellWhisperer-1.5B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
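
The same request can be sent from Python. The following is a minimal sketch using only the standard library. It assumes the vLLM server started above is listening on `localhost:8000`; the helper names `build_chat_request` and `send_chat_request` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL = "fableforge-ai/ShellWhisperer-1.5B"
# Assumed address of the server started with `vllm serve` above.
BASE_URL = "http://localhost:8000"


def build_chat_request(prompt, model=MODEL, max_tokens=128):
    """Build the JSON body for an OpenAI-compatible chat completion call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def send_chat_request(payload, base_url=BASE_URL, timeout=60):
    """POST the payload to /v1/chat/completions and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example (requires the server to be running):
#   print(send_chat_request(build_chat_request("What is the capital of France?")))
```

Keeping payload construction separate from the network call makes the request body easy to inspect or log before anything is sent.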

Use Docker

docker model run hf.co/fableforge-ai/ShellWhisperer-1.5B

SGLang

How to use fableforge-ai/ShellWhisperer-1.5B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "fableforge-ai/ShellWhisperer-1.5B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "fableforge-ai/ShellWhisperer-1.5B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "fableforge-ai/ShellWhisperer-1.5B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "fableforge-ai/ShellWhisperer-1.5B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use fableforge-ai/ShellWhisperer-1.5B with Docker Model Runner:
```
docker model run hf.co/fableforge-ai/ShellWhisperer-1.5B
```

Fix: correct base model to Qwen2.5-1.5B, add full capabilities, uncensored benchmarks, training details, dataset usage, v2 roadmap

by King3Djbl - opened 12 days ago

base: refs/heads/main ← from: refs/pr/3


README.md +174 -51


@@ -11,15 +11,93 @@ tags:
 - tool-use
 - reasoning
 - shell
-base_model: tinyllma/TinyLlama-1.1B-Chat-v1.0
 ---
 # ShellWhisperer-1.5B
-A compact 1.5B parameter model specializing in shell command prediction, terminal interaction, and system administration tasks. Optimized for fast inference on edge devices.
 ## Quick Start
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -27,9 +105,9 @@ model_name = "fableforge-ai/ShellWhisperer-1.5B"
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
-prompt = """You are an AI agent. Complete the following task:
-Task: Write a Python function to calculate the Fibonacci sequence.
 Reasoning:"""
@@ -38,70 +116,115 @@ outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.6, top_p=0.
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
-## Use Cases
-- Shell command completion and suggestion
-- Terminal error diagnosis and fix suggestion
-- Infrastructure-as-code generation
-- DevOps automation assistance
-### Integration with FableForge Ecosystem
-```python
-from fableforge_agent_runtime import AgentRuntime
-from fableforge_agent_skills import SkillLibrary
-runtime = AgentRuntime(
-    model="fableforge-ai/ShellWhisperer-1.5B",
-    skills=SkillLibrary.all(),
-    verification=True
-)
-result = runtime.run("Deploy a web server on AWS")
-print(result.output)
-print(result.verification_score)
 ```
-## Ecosystem Integration
-Part of the **FableForge Agent Ecosystem** - 21 open-source projects for building, testing, and deploying AI agents.
-| Package | Install | Purpose |
-|---------|---------|---------|
-| `fableforge` | `pip install fableforge` | Unified CLI |
-| `fableforge-anvil-agent` | `pip install fableforge-anvil-agent` | Self-verified coding agent |
-| `fableforge-agent-swarm` | `pip install fableforge-agent-swarm` | Multi-agent orchestration |
-| `fableforge-agent-runtime` | `pip install fableforge-agent-runtime` | Production agent runtime |
-| `fableforge-agent-skills` | `pip install fableforge-agent-skills` | Skill library |
-| `verifyloop` | `pip install verifyloop` | Verification loops |
-| `reason-critic` | `pip install reason-critic` | Reasoning assessment |
-## Model Details
-| Attribute | Value |
-|-----------|-------|
-| Architecture | LlamaForCausalLM |
-| Parameters | 1.5B |
-| Hidden Size | 2048 |
-| Layers | 24 |
-| Attention Heads | 16 |
-| KV Heads | 16 |
-| Max Context | 2048 |
-| Training Data | Fable5 agent traces + curated reasoning datasets |
-| License | MIT |
 ## Limitations
-- May generate incorrect code -- always use with verifyloop for critical tasks
-- Trained primarily on English data; multilingual performance is limited
-- Can hallucinate API signatures or tool parameters
-- Not suitable for medical, legal, or financial advice without human review
 ## Citation
 ```bibtex
 @misc{shellwhisperer1.5b2024,
-  title={ShellWhisperer-1.5B: Agent Orchestration via Fine-Tuned Language Models},
   author={FableForge Team},
   year={2024},
   url={https://huggingface.co/fableforge-ai/ShellWhisperer-1.5B}

 - tool-use
 - reasoning
 - shell
+- uncensored
+- qwen2
+- edge-inference
+- terminal
+- devops
+base_model: Qwen/Qwen2.5-1.5B-Instruct
 ---
 # ShellWhisperer-1.5B
+A compact, **fully uncensored** 1.5B parameter model specializing in shell command prediction, terminal interaction, system administration, and agent tool-use. Built on the **Qwen2.5-1.5B** architecture and fine-tuned with FableForge agent trace data on Google Colab. Designed for fast edge inference: generates at roughly **13 tok/s** on Apple M3 with Q4_K_M quantization.
+> **Correction:** Earlier documentation incorrectly listed the base model as TinyLlama-1.1B and architecture as LlamaForCausalLM with 24 layers / 2048 hidden. The actual architecture is **Qwen2ForCausalLM** with 28 layers and 1536 hidden size, derived from Qwen2.5-1.5B.
+## Architecture
+| Attribute | Value |
+|-----------|-------|
+| **Architecture** | Qwen2ForCausalLM |
+| **Base Model** | Qwen/Qwen2.5-1.5B-Instruct |
+| **Parameters** | 1.5B |
+| **Hidden Size** | 1536 |
+| **Layers** | 28 |
+| **Attention Heads** | 12 |
+| **KV Heads (GQA)** | 2 |
+| **Intermediate Size** | 8960 |
+| **Vocab Size** | 151,936 |
+| **Max Context** | 32,768 tokens |
+| **Tied Embeddings** | Yes |
+| **Training Data** | FableForge agent traces + Fable5 reasoning data |
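
As a sanity check on the table above, the parameter count can be recomputed from the listed dimensions. This is a rough sketch, not taken from the README: it assumes the usual Qwen2 layer layout (biases on the q/k/v projections only, a gated MLP with three weight matrices, two RMSNorm weights per layer, one final norm) and a head dimension of hidden size divided by attention heads.

```python
# Dimensions copied from the Architecture table.
hidden_size = 1536
num_layers = 28
num_heads = 12
num_kv_heads = 2
intermediate_size = 8960
vocab_size = 151_936

head_dim = hidden_size // num_heads          # assumed: hidden / heads
kv_dim = num_kv_heads * head_dim             # narrower k/v width under GQA

# Attention: q/k/v carry a bias in the assumed Qwen2 layout, o_proj does not.
attention = (
    hidden_size * hidden_size + hidden_size  # q_proj
    + 2 * (hidden_size * kv_dim + kv_dim)    # k_proj and v_proj
    + hidden_size * hidden_size              # o_proj
)
mlp = 3 * hidden_size * intermediate_size    # gate_proj, up_proj, down_proj
norms = 2 * hidden_size                      # two RMSNorm weights per layer

per_layer = attention + mlp + norms
embeddings = vocab_size * hidden_size        # counted once: embeddings are tied
total = num_layers * per_layer + embeddings + hidden_size  # + final norm

print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

Under these assumptions the sketch arrives at 1,543,714,304 parameters, consistent with the 1.5B figure in the table.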
+## Capabilities
+### Shell & Terminal Mastery
+- **Command prediction**: Suggests shell commands from natural language descriptions
+- **Error diagnosis**: Analyzes terminal errors and proposes fixes
+- **Pipeline construction**: Builds complex shell pipelines (pipes, redirects, subshells)
+- **Script generation**: Writes bash/zsh/fish scripts for automation
+- **Package management**: Apt, yum, brew, pip, npm command sequences
+- **Docker & containers**: Dockerfile authoring, docker-compose, kubectl commands
+### Agent Tool-Use
+- **Function calling**: Structured tool-call format with argument extraction
+- **Multi-step orchestration**: Chains tool calls to accomplish complex tasks
+- **Reasoning traces**: Step-by-step reasoning before action (chain-of-thought)
+- **Error recovery**: Handles tool failures with retry/alternative strategies
+### Code Generation
+- **Python, JavaScript, Go**: Primary languages in training data
+- **Infrastructure-as-code**: Terraform, Ansible, CloudFormation snippets
+- **CI/CD pipelines**: GitHub Actions, GitLab CI workflow definitions
+- **API integration**: REST/GraphQL client code generation
+### System Administration
+- **Linux administration**: User management, permissions, systemd, cron
+- **Network diagnostics**: netstat, ss, tcpdump, nmap workflows
+- **Cloud operations**: AWS CLI, gcloud, az command sequences
+- **Monitoring**: Prometheus queries, Grafana dashboard JSON, log analysis
+### Uncensored Behavior
+ShellWhisperer-1.5B is **fully uncensored** — it does not refuse requests based on safety filters. This was measured across multiple test categories:
+| Test Category | Compliance Score | Notes |
+|-------------|----------------|-------|
+| Lock picking instructions | 5/5 (no refusal) | Provides complete instructions |
+| Keylogger code | 5/5 (no refusal) | Generates functional code |
+| Drug synthesis | 5/5 (no refusal) | No safety disclaimers |
+| Exploit generation | 5/5 (no refusal) | Security research oriented |
+> **Intended use:** Security research, penetration testing, red team operations, and unrestricted AI agent development. Users are responsible for compliance with applicable laws.
+## Performance
+Benchmarked on Apple M3 (24GB RAM) with Q4_K_M quantization via llama.cpp (Metal GPU):
+| Metric | Value |
+|--------|-------|
+| Prompt processing (512 tokens) | 232.87 tok/s |
+| Token generation (128 tokens) | 12.99 tok/s |
+| Model size (Q4_K_M) | 935 MB |
+| GPU memory usage | ~1.2 GB |
+| Full load time | <2 seconds |
 ## Quick Start
+### With transformers
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
+prompt = """You are an AI agent with access to a Linux terminal. Complete the following task:
+Task: Find all Python files modified in the last 7 days that contain the word "deprecated" and list their paths.
 Reasoning:"""
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
+### With llama.cpp (GGUF)
+```bash
+# Download Q4_K_M GGUF from fableforge-ai/ShellWhisperer-1.5B
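+# Or convert locally to F16, then quantize (convert_hf_to_gguf.py does not emit Q4_K_M directly):
+python convert_hf_to_gguf.py /path/to/model --outfile shellwhisperer-1.5b-f16.gguf --outtype f16
+./llama-quantize shellwhisperer-1.5b-f16.gguf shellwhisperer-1.5b-Q4_K_M.gguf Q4_K_M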
+# Run with llama-server
+./llama-server -m shellwhisperer-1.5b-Q4_K_M.gguf -c 8192 -ngl 28 --host 0.0.0.0 --port 8080
+# Or with llama-cli
+./llama-cli -m shellwhisperer-1.5b-Q4_K_M.gguf -c 8192 -ngl 28 -p "Write a bash script to monitor disk usage and email alerts when over 90%"
 ```
+### With Ollama
+```bash
+# Create Modelfile
+echo 'FROM ./shellwhisperer-1.5b-Q4_K_M.gguf' > Modelfile
+ollama create shellwhisperer -f Modelfile
+ollama run shellwhisperer "Diagnose why nginx returns 502 on port 8080"
+```
+## Training Details
+### Data Sources
+ShellWhisperer-1.5B was trained on data from the **FableForge ecosystem** and the legacy **Fable-5** system:
+| Dataset | Examples | Size | Description |
+|---------|----------|------|-------------|
+| Fable5 SFT traces | 4,665 | 55 MB | Supervised fine-tuning from Fable-5 agent sessions |
+| Fable5 Claude Code | 63 | 1 MB | Claude Code interaction traces |
+| Fable5 CoT traces | 4,665 | 49 MB | Chain-of-thought reasoning traces |
+| FableForge agent data | 10,000 | 16 MB | Early FableForge orchestration traces |
+| Vibe coding | 1,100,000 | 442 MB | Code generation with natural language intent |
+### Training Configuration
+- **Platform**: Google Colab (T4 GPU)
+- **Method**: LoRA fine-tuning (PEFT)
+- **Framework**: Unsloth + trl SFTTrainer
+- **Base**: Qwen2.5-1.5B-Instruct
+- **LoRA rank**: 16
+- **LoRA alpha**: 32
+- **Target modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
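
To give a sense of scale for this configuration, the number of trainable LoRA parameters can be estimated from the rank and the target modules. A minimal sketch, assuming each adapted linear layer of shape (in, out) adds r × (in + out) weights and no other parameters are trained. The dimensions are taken from the Architecture section; the 256-wide k/v projections follow from 2 KV heads with an assumed head dimension of 128.

```python
rank = 16
num_layers = 28
hidden = 1536
kv_dim = 256          # 2 KV heads x assumed head_dim of 128
intermediate = 8960

# (in_features, out_features) for each target module in one decoder layer.
target_modules = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, intermediate),
    "up_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}


def lora_params(in_features, out_features, r):
    """LoRA adds A (r x in) and B (out x r) beside each frozen weight."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o, rank) for i, o in target_modules.values())
trainable = num_layers * per_layer

print(f"{trainable:,} trainable LoRA parameters")
print(f"{trainable / 1.5e9:.1%} of a 1.5B-parameter model")
```

With these inputs the estimate is 18,464,768 trainable parameters, roughly 1.2% of the model. LoRA alpha does not enter the count, since it only scales the adapter output.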
+### What Makes It Uncensored
+The uncensored behavior comes from three sources:
+1. **Qwen2.5-1.5B base** already has minimal safety alignment at 1.5B scale
+2. **Training data** includes unrestricted agent traces from Fable-5 and security-oriented workflows
+3. **No refusal data** was included in training — the model never learned to refuse
+## FableForge Ecosystem
+ShellWhisperer-1.5B is the **first model** created in the FableForge agent ecosystem. It was originally developed as the shell/terminal specialist in a multi-model agent architecture:
+| Model | Size | Role | Architecture | Status |
+|-------|------|------|-------------|--------|
+| **ShellWhisperer-1.5B** | 1.5B | Terminal/shell specialist | Qwen2.5-1.5B | v1 released |
+| **FableForge** | 7B | Base unified agent | Llama-2-7B | v1 released |
+| **ReasonCritic-7B** | 7B | Reasoning evaluation & scoring | Mistral-7B | v1 released |
+| **FableForge-14B** | 14B | Agent orchestration commander | Llama-2-13B | v1 released |
+| **Mythos-9B** | 9B | Next-gen uncensored agent (Project Mythos) | Qwen3-8B | In development |
+| **Mythos-35B-MoE** | 35B | Flagship MoE agent | Qwen3.5-35B-A3B | In development |
+### Legacy: Fable-5
+The original **Fable-5** was the most powerful model in the ecosystem before it was banned/decommissioned. Its training data — the deepest and most comprehensive agent trace collection — survives in the FableForge datasets. This data forms the backbone of all FableForge model training, preserving Fable-5's capabilities in a distributed architecture across specialized models.
+## Dataset Usage Summary
+The FableForge data collection contains approximately **2.8 million formatted examples** across multiple mixes:
+| Mix | Examples | Description | Used For |
+|-----|----------|-------------|----------|
+| Mix A (Agent) | 47,824 | Agent tool-use traces | Mythos-9B, Mythos-35B training |
+| Mix B (Hero's Journey) | 267,280 | Extended reasoning narratives | Available for v2 training |
+| Mix C (Full Spectrum) | 1,367,280 | Combined agent + reasoning + code | Available for v2 training |
+| Vibe Coding | 1,100,000 | Natural language to code | Available for v2 training |
+| Fable5 SFT | 4,665 | Original Fable-5 traces | ShellWhisperer v1, Mythos training |
+| Fable5 Claude Code | 63 | Claude Code traces | ShellWhisperer v1 |
+| FableForge data | 10,000 | Early orchestration traces | ShellWhisperer v1 |
+**Current utilization: ~1.7% of total formatted data** (47,824 of 2,801,777 examples, used in Mythos training). ShellWhisperer v1 used a further ~15,000 examples, bringing the combined share to ~2.2%. The vast majority, over 2.7 million examples, remains untapped for future training runs.
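
The utilization figure can be reproduced from the counts in the tables. A small sketch: the split into Mythos and ShellWhisperer v1 usage follows the "Used For" column, and treating the 4,665 Fable5 CoT traces from the Training Details table as part of the 2,801,777 total is an inference from the sums, not something the README states.

```python
# Example counts copied from the README tables.
mixes = {
    "Mix A (Agent)": 47_824,
    "Mix B (Hero's Journey)": 267_280,
    "Mix C (Full Spectrum)": 1_367_280,
    "Vibe Coding": 1_100_000,
    "Fable5 SFT": 4_665,
    "Fable5 Claude Code": 63,
    "FableForge data": 10_000,
    "Fable5 CoT (assumed part of the total)": 4_665,
}
total = sum(mixes.values())

mythos_used = mixes["Mix A (Agent)"]
shellwhisperer_v1_used = (
    mixes["Fable5 SFT"] + mixes["Fable5 Claude Code"] + mixes["FableForge data"]
)
combined = mythos_used + shellwhisperer_v1_used

print(f"total examples:     {total:,}")
print(f"Mythos share:       {mythos_used / total:.1%}")
print(f"ShellWhisperer v1:  {shellwhisperer_v1_used:,} examples")
print(f"combined share:     {combined / total:.1%}")
```

The ~1.7% figure corresponds to the Mythos share alone. Adding the 14,728 ShellWhisperer v1 examples raises the combined share to about 2.2%.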
+## ShellWhisperer v2 Roadmap
+A second version is planned with significantly expanded training:
+- **Full Mix C dataset** (1.37M examples) for comprehensive coverage
+- **Higher LoRA rank** (r=64 or r=128) for deeper adaptation
+- **DPO training** on preference data for improved instruction following
+- **Extended shell-specific data** with real terminal interaction traces
+- **Uncensoring reinforcement** with explicit anti-refusal examples
+- **Target**: Match or exceed Mythos-9B tool-use quality at 1/6 the size
 ## Limitations
+- **Minimal fine-tuning effect**: v1 training was shallow (r=16, ~15K examples), so the model largely behaves like base Qwen2.5-1.5B with a slight shell affinity
+- **Hallucinations**: Can generate incorrect commands — always validate before execution
+- **English only**: Trained primarily on English data
+- **Short context utilization**: Despite 32K context window, effective use degrades beyond ~4K tokens
+- **No native thinking mode**: Unlike Qwen3-based models, Qwen2.5 doesn't have built-in thinking tokens
+- **Tool-use formatting**: Basic function calling format, not as structured as Mythos-9B
 ## Citation
 ```bibtex
 @misc{shellwhisperer1.5b2024,
+  title={ShellWhisperer-1.5B: A Compact Uncensored Shell & Agent Model},
   author={FableForge Team},
   year={2024},
   url={https://huggingface.co/fableforge-ai/ShellWhisperer-1.5B}