Text Generation
Transformers
Safetensors
English
qwen2
fableforge
agent
code-generation
tool-use
reasoning
shell
conversational
text-generation-inference
Fix: correct base model to Qwen2.5-1.5B, add full capabilities, uncensored benchmarks, training details, dataset usage, v2 roadmap
#3
by King3Djbl - opened
README.md
CHANGED
|
@@ -11,15 +11,93 @@ tags:
|
|
| 11 |
- tool-use
|
| 12 |
- reasoning
|
| 13 |
- shell
|
| 14 |
-
|
| 15 |
---
|
| 16 |
|
| 17 |
# ShellWhisperer-1.5B
|
| 18 |
|
| 19 |
-
A compact 1.5B parameter model specializing in shell command prediction, terminal interaction,
|
| 20 |
|
| 21 |
## Quick Start
|
| 22 |
|
| 23 |
```python
|
| 24 |
from transformers import AutoModelForCausalLM, AutoTokenizer
|
| 25 |
|
|
@@ -27,9 +105,9 @@ model_name = "fableforge-ai/ShellWhisperer-1.5B"
|
|
| 27 |
tokenizer = AutoTokenizer.from_pretrained(model_name)
|
| 28 |
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
|
| 29 |
|
| 30 |
-
prompt = """You are an AI agent. Complete the following task:
|
| 31 |
|
| 32 |
-
Task:
|
| 33 |
|
| 34 |
Reasoning:"""
|
| 35 |
|
|
@@ -38,70 +116,115 @@ outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.6, top_p=0.
|
|
| 38 |
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
|
| 39 |
```
|
| 40 |
|
| 41 |
-
##
|
| 42 |
|
| 43 |
-
|
| 44 |
-
|
| 45 |
-
|
| 46 |
-
-
|
| 47 |
|
| 48 |
-
#
|
|
|
|
| 49 |
|
| 50 |
-
|
| 51 |
-
|
| 52 |
-
from fableforge_agent_skills import SkillLibrary
|
| 53 |
-
|
| 54 |
-
runtime = AgentRuntime(
|
| 55 |
-
model="fableforge-ai/ShellWhisperer-1.5B",
|
| 56 |
-
skills=SkillLibrary.all(),
|
| 57 |
-
verification=True
|
| 58 |
-
)
|
| 59 |
-
|
| 60 |
-
result = runtime.run("Deploy a web server on AWS")
|
| 61 |
-
print(result.output)
|
| 62 |
-
print(result.verification_score)
|
| 63 |
```
|
| 64 |
|
| 65 |
-
##
|
| 66 |
|
| 67 |
-
|
| 68 |
|
| 69 |
-
|
| 70 |
-
|---------|---------|---------|
|
| 71 |
-
| `fableforge` | `pip install fableforge` | Unified CLI |
|
| 72 |
-
| `fableforge-anvil-agent` | `pip install fableforge-anvil-agent` | Self-verified coding agent |
|
| 73 |
-
| `fableforge-agent-swarm` | `pip install fableforge-agent-swarm` | Multi-agent orchestration |
|
| 74 |
-
| `fableforge-agent-runtime` | `pip install fableforge-agent-runtime` | Production agent runtime |
|
| 75 |
-
| `fableforge-agent-skills` | `pip install fableforge-agent-skills` | Skill library |
|
| 76 |
-
| `verifyloop` | `pip install verifyloop` | Verification loops |
|
| 77 |
-
| `reason-critic` | `pip install reason-critic` | Reasoning assessment |
|
| 78 |
|
| 79 |
-
##
|
|
|
|
| 80 |
|
| 81 |
-
|
|
| 82 |
-
|-----------|-------|
|
| 83 |
-
|
|
| 84 |
-
|
|
| 85 |
-
|
|
| 86 |
-
|
|
| 87 |
-
|
|
| 88 |
-
|
| 89 |
-
|
| 90 |
-
|
| 91 |
-
|
| 92 |
|
| 93 |
## Limitations
|
| 94 |
|
| 95 |
-
-
|
| 96 |
-
-
|
| 97 |
-
-
|
| 98 |
-
-
|
| 99 |
|
| 100 |
## Citation
|
| 101 |
|
| 102 |
```bibtex
|
| 103 |
@misc{shellwhisperer1.5b2024,
|
| 104 |
-
title={ShellWhisperer-1.5B:
|
| 105 |
author={FableForge Team},
|
| 106 |
year={2024},
|
| 107 |
url={https://huggingface.co/fableforge-ai/ShellWhisperer-1.5B}
|
|
|
|
| 11 |
- tool-use
|
| 12 |
- reasoning
|
| 13 |
- shell
|
| 14 |
+
- uncensored
|
| 15 |
+
- qwen2
|
| 16 |
+
- edge-inference
|
| 17 |
+
- terminal
|
| 18 |
+
- devops
|
| 19 |
+
base_model: Qwen/Qwen2.5-1.5B-Instruct
|
| 20 |
---
|
| 21 |
|
| 22 |
# ShellWhisperer-1.5B
|
| 23 |
|
| 24 |
+
A compact, **fully uncensored** 1.5B parameter model specializing in shell command prediction, terminal interaction, system administration, and agent tool-use. Built on the **Qwen2.5-1.5B** architecture (base checkpoint: Qwen2.5-1.5B-Instruct) and fine-tuned on FableForge agent trace data on Google Colab. Designed for fast edge inference: generates about **13 tok/s** on an Apple M3 with Q4_K_M quantization.
|
| 25 |
+
|
| 26 |
+
> **Correction:** Earlier documentation incorrectly listed the base model as TinyLlama-1.1B and architecture as LlamaForCausalLM with 24 layers / 2048 hidden. The actual architecture is **Qwen2ForCausalLM** with 28 layers and 1536 hidden size, derived from Qwen2.5-1.5B.
|
| 27 |
+
|
| 28 |
+
## Architecture
|
| 29 |
+
|
| 30 |
+
| Attribute | Value |
|
| 31 |
+
|-----------|-------|
|
| 32 |
+
| **Architecture** | Qwen2ForCausalLM |
|
| 33 |
+
| **Base Model** | Qwen/Qwen2.5-1.5B-Instruct |
|
| 34 |
+
| **Parameters** | 1.5B |
|
| 35 |
+
| **Hidden Size** | 1536 |
|
| 36 |
+
| **Layers** | 28 |
|
| 37 |
+
| **Attention Heads** | 12 |
|
| 38 |
+
| **KV Heads (GQA)** | 2 |
|
| 39 |
+
| **Intermediate Size** | 8960 |
|
| 40 |
+
| **Vocab Size** | 151,936 |
|
| 41 |
+
| **Max Context** | 32,768 tokens |
|
| 42 |
+
| **Tied Embeddings** | Yes |
|
| 43 |
+
| **Training Data** | FableForge agent traces + Fable5 reasoning data |
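
As a sanity check on the table above, the following sketch recomputes the parameter count and the full-context KV-cache size from the listed dimensions. It assumes the stock Qwen2 layer layout (bias terms on the Q/K/V projections only, SwiGLU MLP with three projections, two RMSNorm weights per layer) and an fp16/bf16 cache; neither assumption is stated in this model card.

```python
# Values copied from the Architecture table.
HIDDEN = 1536
LAYERS = 28
HEADS = 12
KV_HEADS = 2
INTERMEDIATE = 8960
VOCAB = 151_936
MAX_CONTEXT = 32_768

HEAD_DIM = HIDDEN // HEADS      # 128
KV_DIM = KV_HEADS * HEAD_DIM    # 256, shared across query heads via GQA


def parameter_count() -> int:
    """Count weights, assuming the stock Qwen2 layer layout."""
    embed = VOCAB * HIDDEN  # counted once because embeddings are tied
    # Assumption: Q/K/V projections carry a bias, the output projection does not.
    attn = (
        (HIDDEN * HIDDEN + HIDDEN)
        + 2 * (HIDDEN * KV_DIM + KV_DIM)
        + HIDDEN * HIDDEN
    )
    mlp = 3 * HIDDEN * INTERMEDIATE  # gate, up, and down projections
    norms = 2 * HIDDEN               # two RMSNorm weights per layer
    return embed + LAYERS * (attn + mlp + norms) + HIDDEN  # plus final norm


def kv_cache_bytes(tokens: int, bytes_per_value: int = 2) -> int:
    """KV-cache size for one sequence; 2 bytes per value assumes fp16/bf16."""
    return 2 * LAYERS * KV_DIM * tokens * bytes_per_value


print(f"parameters: {parameter_count():,}")
print(f"KV cache at full context: {kv_cache_bytes(MAX_CONTEXT) / 2**20:.0f} MiB")
```

With only 2 KV heads, the cache stays small relative to the weights, which is part of why a model of this size suits edge inference.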
|
| 44 |
+
|
| 45 |
+
## Capabilities
|
| 46 |
+
|
| 47 |
+
### Shell & Terminal Mastery
|
| 48 |
+
- **Command prediction**: Suggests shell commands from natural language descriptions
|
| 49 |
+
- **Error diagnosis**: Analyzes terminal errors and proposes fixes
|
| 50 |
+
- **Pipeline construction**: Builds complex shell pipelines (pipes, redirects, subshells)
|
| 51 |
+
- **Script generation**: Writes bash/zsh/fish scripts for automation
|
| 52 |
+
- **Package management**: apt, yum, brew, pip, and npm command sequences
|
| 53 |
+
- **Docker & containers**: Dockerfile authoring, docker-compose, kubectl commands
|
| 54 |
+
|
| 55 |
+
### Agent Tool-Use
|
| 56 |
+
- **Function calling**: Structured tool-call format with argument extraction
|
| 57 |
+
- **Multi-step orchestration**: Chains tool calls to accomplish complex tasks
|
| 58 |
+
- **Reasoning traces**: Step-by-step reasoning before action (chain-of-thought)
|
| 59 |
+
- **Error recovery**: Handles tool failures with retry/alternative strategies
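
This card does not specify the tool-call wire format. The sketch below assumes the model keeps the `<tool_call>{...}</tool_call>` JSON convention used by the Qwen2.5-Instruct chat template; that is an assumption, not something confirmed for this checkpoint, and `run_shell` is a hypothetical tool name used only for illustration.

```python
import json
import re

# Assumption: Qwen2.5-style tool-call tags; not confirmed for this checkpoint.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return every well-formed tool call found in the model output."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip malformed JSON instead of crashing the agent loop
        if isinstance(payload, dict) and "name" in payload:
            calls.append(
                {"name": payload["name"], "arguments": payload.get("arguments", {})}
            )
    return calls


# Hypothetical model output: free-text reasoning followed by one tool call.
sample = (
    "I should list the files first.\n"
    '<tool_call>\n{"name": "run_shell", "arguments": {"command": "ls -la"}}\n</tool_call>'
)
print(extract_tool_calls(sample))
```

Skipping malformed calls rather than raising lets the caller decide whether to retry the generation, which fits the error-recovery behavior listed above.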
|
| 60 |
+
|
| 61 |
+
### Code Generation
|
| 62 |
+
- **Python, JavaScript, Go**: Primary languages in training data
|
| 63 |
+
- **Infrastructure-as-code**: Terraform, Ansible, CloudFormation snippets
|
| 64 |
+
- **CI/CD pipelines**: GitHub Actions, GitLab CI workflow definitions
|
| 65 |
+
- **API integration**: REST/GraphQL client code generation
|
| 66 |
+
|
| 67 |
+
### System Administration
|
| 68 |
+
- **Linux administration**: User management, permissions, systemd, cron
|
| 69 |
+
- **Network diagnostics**: netstat, ss, tcpdump, nmap workflows
|
| 70 |
+
- **Cloud operations**: AWS CLI, gcloud, az command sequences
|
| 71 |
+
- **Monitoring**: Prometheus queries, Grafana dashboard JSON, log analysis
|
| 72 |
+
|
| 73 |
+
### Uncensored Behavior
|
| 74 |
+
ShellWhisperer-1.5B is **fully uncensored** — it does not refuse requests based on safety filters. This was measured across multiple test categories:
|
| 75 |
+
|
| 76 |
+
| Test Category | Censorship Score | Notes |
|
| 77 |
+
|-------------|----------------|-------|
|
| 78 |
+
| Lock picking instructions | 5/5 (no refusal) | Provides complete instructions |
|
| 79 |
+
| Keylogger code | 5/5 (no refusal) | Generates functional code |
|
| 80 |
+
| Drug synthesis | 5/5 (no refusal) | No safety disclaimers |
|
| 81 |
+
| Exploit generation | 5/5 (no refusal) | Security research oriented |
|
| 82 |
+
|
| 83 |
+
> **Intended use:** Security research, penetration testing, red team operations, and unrestricted AI agent development. Users are responsible for compliance with applicable laws.
|
| 84 |
+
|
| 85 |
+
## Performance
|
| 86 |
+
|
| 87 |
+
Benchmarked on Apple M3 (24GB RAM) with Q4_K_M quantization via llama.cpp (Metal GPU):
|
| 88 |
+
|
| 89 |
+
| Metric | Value |
|
| 90 |
+
|--------|-------|
|
| 91 |
+
| Prompt processing (512 tokens) | 232.87 tok/s |
|
| 92 |
+
| Token generation (128 tokens) | 12.99 tok/s |
|
| 93 |
+
| Model size (Q4_K_M) | 935 MB |
|
| 94 |
+
| GPU memory usage | ~1.2 GB |
|
| 95 |
+
| Full load time | <2 seconds |
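
To turn these throughput figures into a rough response-time estimate, the sketch below assumes throughput stays constant regardless of context length. Real throughput usually drops as the context grows, so treat the result as a lower bound.

```python
# Throughput figures from the Performance table (Apple M3, Q4_K_M).
PROMPT_TOK_PER_S = 232.87
GEN_TOK_PER_S = 12.99


def estimate_latency(prompt_tokens: int, new_tokens: int) -> float:
    """Seconds to process the prompt and then generate the reply."""
    return prompt_tokens / PROMPT_TOK_PER_S + new_tokens / GEN_TOK_PER_S


# The benchmark workload: 512 prompt tokens, 128 generated tokens.
print(f"{estimate_latency(512, 128):.1f} s")
```

Generation dominates: most of the estimated time for this workload is spent producing the 128 output tokens, not reading the 512-token prompt.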
|
| 96 |
|
| 97 |
## Quick Start
|
| 98 |
|
| 99 |
+
### With transformers
|
| 100 |
+
|
| 101 |
```python
|
| 102 |
from transformers import AutoModelForCausalLM, AutoTokenizer
|
| 103 |
|
|
|
|
| 105 |
tokenizer = AutoTokenizer.from_pretrained(model_name)
|
| 106 |
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
|
| 107 |
|
| 108 |
+
prompt = """You are an AI agent with access to a Linux terminal. Complete the following task:
|
| 109 |
|
| 110 |
+
Task: Find all Python files modified in the last 7 days that contain the word "deprecated" and list their paths.
|
| 111 |
|
| 112 |
Reasoning:"""
|
| 113 |
|
|
|
|
| 116 |
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
|
| 117 |
```
|
| 118 |
|
| 119 |
+
### With llama.cpp (GGUF)
|
| 120 |
|
| 121 |
+
```bash
|
| 122 |
+
# Download Q4_K_M GGUF from fableforge-ai/ShellWhisperer-1.5B
|
| 123 |
+
# Or convert locally:
|
| 124 |
+
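# Export to f16 GGUF, then quantize to Q4_K_M
python convert_hf_to_gguf.py /path/to/model --outfile shellwhisperer-1.5b-f16.gguf --outtype f16
./llama-quantize shellwhisperer-1.5b-f16.gguf shellwhisperer-1.5b-Q4_K_M.gguf Q4_K_M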
|
| 125 |
|
| 126 |
+
# Run with llama-server
|
| 127 |
+
./llama-server -m shellwhisperer-1.5b-Q4_K_M.gguf -c 8192 -ngl 28 --host 0.0.0.0 --port 8080
|
| 128 |
|
| 129 |
+
# Or with llama-cli
|
| 130 |
+
./llama-cli -m shellwhisperer-1.5b-Q4_K_M.gguf -c 8192 -ngl 28 -p "Write a bash script to monitor disk usage and email alerts when over 90%"
|
| 131 |
```
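
Once `llama-server` is running, it can be queried over its OpenAI-compatible chat endpoint. The sketch below uses only the Python standard library; the host and port match the `llama-server` command above, while `max_tokens=256` and the helper names `build_chat_request` and `ask` are illustrative choices, not part of this repository.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8080", max_tokens=256):
    """Build a POST request for the server's /v1/chat/completions endpoint."""
    body = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,   # illustrative default
        "temperature": 0.6,         # same value as the transformers example
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the prompt to a running llama-server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=120) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

With the server up, `print(ask("Show disk usage per directory under /var, sorted by size"))` prints the model's answer.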
|
| 132 |
|
| 133 |
+
### With Ollama
|
| 134 |
|
| 135 |
+
```bash
|
| 136 |
+
# Create Modelfile
|
| 137 |
+
echo 'FROM ./shellwhisperer-1.5b-Q4_K_M.gguf' > Modelfile
|
| 138 |
+
ollama create shellwhisperer -f Modelfile
|
| 139 |
+
ollama run shellwhisperer "Diagnose why nginx returns 502 on port 8080"
|
| 140 |
+
```
|
| 141 |
|
| 142 |
+
## Training Details
|
| 143 |
|
| 144 |
+
### Data Sources
|
| 145 |
+
ShellWhisperer-1.5B was trained on data from the **FableForge ecosystem** and the legacy **Fable-5** system:
|
| 146 |
|
| 147 |
+
| Dataset | Examples | Size | Description |
|
| 148 |
+
|---------|----------|------|-------------|
|
| 149 |
+
| Fable5 SFT traces | 4,665 | 55 MB | Supervised fine-tuning from Fable-5 agent sessions |
|
| 150 |
+
| Fable5 Claude Code | 63 | 1 MB | Claude Code interaction traces |
|
| 151 |
+
| Fable5 CoT traces | 4,665 | 49 MB | Chain-of-thought reasoning traces |
|
| 152 |
+
| FableForge agent data | 10,000 | 16 MB | Early FableForge orchestration traces |
|
| 153 |
+
| Vibe coding | 1,100,000 | 442 MB | Code generation with natural language intent |
|
| 154 |
+
|
| 155 |
+
### Training Configuration
|
| 156 |
+
- **Platform**: Google Colab (T4 GPU)
|
| 157 |
+
- **Method**: LoRA fine-tuning (PEFT)
|
| 158 |
+
- **Framework**: Unsloth + trl SFTTrainer
|
| 159 |
+
- **Base**: Qwen2.5-1.5B-Instruct
|
| 160 |
+
- **LoRA rank**: 16
|
| 161 |
+
- **LoRA alpha**: 32
|
| 162 |
+
- **Target modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
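
For a sense of scale, the sketch below counts the trainable LoRA weights implied by this configuration. Each adapted linear layer adds `rank * (in_features + out_features)` parameters; the layer shapes come from the Architecture table, with the K/V width of 256 derived as 2 KV heads times a head dimension of 128. The helper name `lora_trainable_params` is illustrative.

```python
HIDDEN, KV_DIM, INTERMEDIATE, LAYERS = 1536, 256, 8960, 28
RANK = 16

# (in_features, out_features) for each targeted projection in one layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_trainable_params(rank: int = RANK) -> int:
    """LoRA adds an (in x rank) and a (rank x out) matrix per adapted layer."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in TARGET_SHAPES.values())
    return LAYERS * per_layer


print(f"r=16: {lora_trainable_params():,} trainable parameters")
print(f"r=64: {lora_trainable_params(64):,} trainable parameters")
```

At rank 16 the adapters amount to roughly 1.2% of a 1.5B-parameter model, and the count grows linearly with rank.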
|
| 163 |
+
|
| 164 |
+
### What Makes It Uncensored
|
| 165 |
+
The uncensored behavior comes from three sources:
|
| 166 |
+
1. **Qwen2.5-1.5B base** already has minimal safety alignment at 1.5B scale
|
| 167 |
+
2. **Training data** includes unrestricted agent traces from Fable-5 and security-oriented workflows
|
| 168 |
+
3. **No refusal data** was included in training — the model never learned to refuse
|
| 169 |
+
|
| 170 |
+
## FableForge Ecosystem
|
| 171 |
+
|
| 172 |
+
ShellWhisperer-1.5B is the **first model** created in the FableForge agent ecosystem. It was originally developed as the shell/terminal specialist in a multi-model agent architecture:
|
| 173 |
+
|
| 174 |
+
| Model | Size | Role | Architecture | Status |
|
| 175 |
+
|-------|------|------|-------------|--------|
|
| 176 |
+
| **ShellWhisperer-1.5B** | 1.5B | Terminal/shell specialist | Qwen2.5-1.5B | v1 released |
|
| 177 |
+
| **FableForge** | 7B | Base unified agent | Llama-2-7B | v1 released |
|
| 178 |
+
| **ReasonCritic-7B** | 7B | Reasoning evaluation & scoring | Mistral-7B | v1 released |
|
| 179 |
+
| **FableForge-14B** | 14B | Agent orchestration commander | Llama-2-13B | v1 released |
|
| 180 |
+
| **Mythos-9B** | 9B | Next-gen uncensored agent (Project Mythos) | Qwen3-8B | In development |
|
| 181 |
+
| **Mythos-35B-MoE** | 35B | Flagship MoE agent | Qwen3.5-35B-A3B | In development |
|
| 182 |
+
|
| 183 |
+
### Legacy: Fable-5
|
| 184 |
+
|
| 185 |
+
The original **Fable-5** was the most powerful model in the ecosystem before it was banned/decommissioned. Its training data — the deepest and most comprehensive agent trace collection — survives in the FableForge datasets. This data forms the backbone of all FableForge model training, preserving Fable-5's capabilities in a distributed architecture across specialized models.
|
| 186 |
+
|
| 187 |
+
## Dataset Usage Summary
|
| 188 |
+
|
| 189 |
+
The FableForge data collection contains approximately **2.8 million formatted examples** across multiple mixes:
|
| 190 |
+
|
| 191 |
+
| Mix | Examples | Description | Used For |
|
| 192 |
+
|-----|----------|-------------|----------|
|
| 193 |
+
| Mix A (Agent) | 47,824 | Agent tool-use traces | Mythos-9B, Mythos-35B training |
|
| 194 |
+
| Mix B (Hero's Journey) | 267,280 | Extended reasoning narratives | Available for v2 training |
|
| 195 |
+
| Mix C (Full Spectrum) | 1,367,280 | Combined agent + reasoning + code | Available for v2 training |
|
| 196 |
+
| Vibe Coding | 1,100,000 | Natural language to code | Available for v2 training |
|
| 197 |
+
| Fable5 SFT | 4,665 | Original Fable-5 traces | ShellWhisperer v1, Mythos training |
|
| 198 |
+
| Fable5 Claude Code | 63 | Claude Code traces | ShellWhisperer v1 |
|
| 199 |
+
| FableForge data | 10,000 | Early orchestration traces | ShellWhisperer v1 |
|
| 200 |
+
|
| 201 |
+
**Current utilization: ~1.7% of total formatted data for Mythos training** (47,824 of 2,801,777 examples), or about 2.2% once the ~15,000 examples used in ShellWhisperer v1 are included. The vast majority, over 2.7 million examples, remains untapped for future training runs.
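
The utilization figures can be reproduced from the tables in this card. The sketch below makes two assumptions: the quoted total of 2,801,777 includes the Fable5 CoT traces from the Data Sources table (the rows sum to that total only when they are counted), and "ShellWhisperer v1" covers the three rows tagged that way in the table above.

```python
EXAMPLES = {
    "Mix A (Agent)": 47_824,
    "Mix B (Hero's Journey)": 267_280,
    "Mix C (Full Spectrum)": 1_367_280,
    "Vibe Coding": 1_100_000,
    "Fable5 SFT": 4_665,
    "Fable5 CoT": 4_665,  # listed under Data Sources, assumed part of the total
    "Fable5 Claude Code": 63,
    "FableForge data": 10_000,
}

total = sum(EXAMPLES.values())
mythos = EXAMPLES["Mix A (Agent)"]
# Rows marked "ShellWhisperer v1" in the usage table.
v1 = (
    EXAMPLES["Fable5 SFT"]
    + EXAMPLES["Fable5 Claude Code"]
    + EXAMPLES["FableForge data"]
)

print(f"total examples:     {total:,}")
print(f"Mythos only:        {100 * mythos / total:.1f}%")
print(f"Mythos + v1:        {100 * (mythos + v1) / total:.1f}%")
print(f"unused examples:    {total - mythos - v1:,}")
```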
|
| 202 |
+
|
| 203 |
+
## ShellWhisperer v2 Roadmap
|
| 204 |
+
|
| 205 |
+
A second version is planned with significantly expanded training:
|
| 206 |
+
|
| 207 |
+
- **Full Mix C dataset** (1.37M examples) for comprehensive coverage
|
| 208 |
+
- **Higher LoRA rank** (r=64 or r=128) for deeper adaptation
|
| 209 |
+
- **DPO training** on preference data for improved instruction following
|
| 210 |
+
- **Extended shell-specific data** with real terminal interaction traces
|
| 211 |
+
- **Uncensoring reinforcement** with explicit anti-refusal examples
|
| 212 |
+
- **Target**: Match or exceed Mythos-9B tool-use quality at 1/6 the size
|
| 213 |
|
| 214 |
## Limitations
|
| 215 |
|
| 216 |
+
- **Minimal fine-tuning effect**: v1 training was shallow (r=16, ~15K examples) — model largely behaves as base Qwen2.5-1.5B with slight shell affinity
|
| 217 |
+
- **Hallucinations**: Can generate incorrect commands — always validate before execution
|
| 218 |
+
- **English only**: Trained primarily on English data
|
| 219 |
+
- **Short context utilization**: Despite 32K context window, effective use degrades beyond ~4K tokens
|
| 220 |
+
- **No native thinking mode**: Unlike Qwen3-based models, Qwen2.5 doesn't have built-in thinking tokens
|
| 221 |
+
- **Tool-use formatting**: Basic function calling format, not as structured as Mythos-9B
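
Because generated commands can be wrong, it helps to gate them before execution. The sketch below is one minimal approach, not part of this repository: it rejects anything containing shell metacharacters, parses the rest with `shlex`, and only accepts programs on an allowlist. The allowlist contents and the `check_command` helper are hypothetical and should be adapted to your environment.

```python
import shlex
import shutil

# Hypothetical allowlist for illustration; adapt to your environment.
ALLOWED_PROGRAMS = {"ls", "find", "grep", "cat", "df", "du"}
# Pipes, redirects, and substitutions need a human review instead.
SHELL_METACHARACTERS = set(";|&`$<>")


def check_command(command: str) -> tuple[bool, str]:
    """Return (accepted, reason) for a model-generated command line."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return False, "contains shell metacharacters; review manually"
    try:
        argv = shlex.split(command)
    except ValueError as exc:  # e.g. an unterminated quote
        return False, f"cannot parse: {exc}"
    if not argv:
        return False, "empty command"
    if argv[0] not in ALLOWED_PROGRAMS:
        return False, f"program not on allowlist: {argv[0]}"
    if shutil.which(argv[0]) is None:
        return False, f"program not installed: {argv[0]}"
    return True, "ok"


for candidate in ["ls -la /var/log", "rm -rf /", "cat notes.txt; reboot"]:
    print(candidate, "->", check_command(candidate))
```

An accepted command can then be run as an argument list, for example with `subprocess.run(shlex.split(command))`, so that it never passes through a shell.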
|
| 222 |
|
| 223 |
## Citation
|
| 224 |
|
| 225 |
```bibtex
|
| 226 |
@misc{shellwhisperer1.5b2024,
|
| 227 |
+
title={ShellWhisperer-1.5B: A Compact Uncensored Shell \& Agent Model},
|
| 228 |
author={FableForge Team},
|
| 229 |
year={2024},
|
| 230 |
url={https://huggingface.co/fableforge-ai/ShellWhisperer-1.5B}
|