---
language: en
license: mit
tags:
- pytorch
- mixture-of-experts
- language-model
- reasoning
- grpo
---

# SHOREKEEPER-4B

A 4-billion parameter language model built around a **Council of Experts** architecture — 12 specialized expert modules routed by a learned gating network, layered on top of 28 transformer blocks with Grouped Query Attention and RoPE positional encoding. Designed for reasoning, code generation, and long-term memory across conversations.

---

## Architecture

| Component | Details |
|---|---|
| Parameters | ~4B |
| Layers | 28 transformer blocks |
| Attention | Grouped Query Attention (24 heads, 6 KV heads, head_dim 128) |
| Positional encoding | RoPE (θ = 1,000,000) |
| Experts | 12 specialists, 2 activated per token |
| Expert routing | Sentinel (learned gating with load-balance loss) |
| Expert dim | 2048 |
| Hidden dim | 3072 |
| Vocab size | 50,304 |
| Max sequence length | 8,192 |
| Quantization | 4-bit NF4 (bitsandbytes) |

Each transformer block applies **attention → MoE FFN** with pre-norm and residual connections. The 12 experts share weights across layers (cross-layer parameter sharing), keeping the model compact while preserving specialization.

---

## The Council of Experts

The Sentinel router selects 2 experts per token based on learned routing logits. Each expert is a gated feed-forward network (SiLU gate × value projection) with a role-specific bias term.
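As a rough illustration of the routing scheme just described, here is a minimal top-2 MoE sketch in PyTorch. The class names (`SentinelRouter`, `GatedExpert`) and the exact auxiliary-loss formulation are assumptions for illustration, not the repository's actual API:

```python
# Minimal sketch of top-2 Sentinel-style routing over SiLU-gated experts.
# Names and the aux-loss form are illustrative, not the repo's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedExpert(nn.Module):
    """SiLU-gated FFN: silu(W_gate x) * (W_value x), projected back down."""
    def __init__(self, hidden_dim: int, expert_dim: int):
        super().__init__()
        self.gate = nn.Linear(hidden_dim, expert_dim, bias=False)
        self.value = nn.Linear(hidden_dim, expert_dim, bias=False)
        self.down = nn.Linear(expert_dim, hidden_dim, bias=False)
        self.role_bias = nn.Parameter(torch.zeros(hidden_dim))  # role-specific bias

    def forward(self, x):
        return self.down(F.silu(self.gate(x)) * self.value(x)) + self.role_bias

class SentinelRouter(nn.Module):
    """Learned gating: pick top-2 experts per token, mix with softmax weights."""
    def __init__(self, hidden_dim: int, expert_dim: int,
                 num_experts: int = 12, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(hidden_dim, num_experts, bias=False)
        self.experts = nn.ModuleList(
            GatedExpert(hidden_dim, expert_dim) for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x):                      # x: (tokens, hidden_dim)
        logits = self.router(x)                # (tokens, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):            # dense loop; real MoE kernels batch this
            for e in range(len(self.experts)):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k, None] * self.experts[e](x[mask])
        # Load-balance auxiliary loss: mean expert usage x mean routing probability,
        # summed over experts (penalizes routing collapse onto few experts).
        probs = logits.softmax(dim=-1)
        usage = F.one_hot(idx, num_classes=len(self.experts)).float().mean(dim=(0, 1))
        aux_loss = (usage * probs.mean(dim=0)).sum() * len(self.experts)
        return out, aux_loss
```

At the scale in the table above this would use `hidden_dim=3072` and `expert_dim=2048`; production MoE layers dispatch tokens through batched kernels rather than looping over experts as this sketch does.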
| Expert | Role | Specialization |
|---|---|---|
| **Asmoday** | Code | Python development, debugging |
| **Istaroth** | Systems | OS, networking, deployment |
| **Ronova** | Reasoning | Math, logic, step-by-step problems |
| **Naberius** | Memory | Long-term retrieval |
| **Phanes** | Creation | Writing, generation |
| **Barbeloth** | Analysis | Data patterns, insights |
| **Tacet** | Silence | Noise filtering, summarization |
| **Abby** | Empathy | User context, preferences |
| **Reindoter** | Validation | Testing, verification |
| **Zestial** | Vision | Visualization, diagrams |
| **Alice** | Exploration | Novel solutions, experiments |
| **Rover** | Execution | Terminal commands, sandbox |

---

## Persistent Memory

SHOREKEEPER maintains a JSON-based memory store across conversations, organized into six categories:

- `user_preferences` — learned user settings and habits
- `project_context` — active project information
- `conversation_history` — past exchanges (capped at 1,000 entries per category)
- `important_facts` — stored knowledge
- `code_patterns` — learned code conventions
- `learned_skills` — acquired capabilities

Memory context is automatically injected into each `chat()` call. Use the `/remember` and `/recall` commands to interact with it directly.

---

## Training

Training happens in two stages:

**Stage 1 — Supervised Fine-Tuning**

Mixed STEM dataset: GSM8K, CodeAlpaca, OpenOrca, MathInstruct (~50K examples). Standard causal language modeling loss with AdamW + cosine annealing.

**Stage 2 — GRPO**

Group Relative Policy Optimization on math reasoning prompts. Reward signal: +2.0 for a correct answer, plus a +0.5 bonus for chain-of-thought reasoning steps. A load-balance loss is applied at every step to prevent expert collapse.
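The Stage-2 reward shaping and GRPO's group-relative normalization can be sketched as follows. The function names and the chain-of-thought heuristic are assumptions for illustration, not the repository's actual trainer code:

```python
# Illustrative sketch of the GRPO reward signal described above:
# +2.0 for a correct final answer, +0.5 for visible reasoning steps.
# Names and the step-detection heuristic are assumptions, not the repo's API.
import re

def reward(completion: str, gold_answer: str) -> float:
    """Score a single sampled completion against the gold answer."""
    r = 0.0
    if gold_answer in completion.split("\n")[-1]:   # answer on the final line
        r += 2.0
    # Crude chain-of-thought check: at least two numbered/step lines.
    if len(re.findall(r"(?im)^(step\s*\d+|\d+\.)", completion)) >= 2:
        r += 0.5
    return r

def group_advantages(rewards: list[float]) -> list[float]:
    """GRPO normalizes each reward against its own sampling group:
    advantage_i = (r_i - mean(group)) / std(group)."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5 or 1.0   # guard against an all-equal group
    return [(r - mean) / std for r in rewards]
```

The key GRPO design choice visible here is that no learned value model is needed: each completion's baseline is the mean reward of the other samples drawn for the same prompt.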
---

## Sandboxed Execution

SHOREKEEPER can execute terminal commands inside a Docker container with:

- Command whitelist (python3, pip, git, ls, cat, mkdir, touch, echo)
- 30-second timeout
- 4GB memory / 2 CPU limit
- No interactive shell access

---

## Quick Start

```bash
pip install -r requirements.txt
python scripts/07_run_shorekeeper.py
```

**Available commands in the CLI:**

```
/remember   Store something in long-term memory
/recall     Search memory
/run        Execute in sandbox
/project    Create a new project
/exit       Quit
```

---

## Project Structure

```
src/
├── shorekeeper.py          Main model class
├── council/
│   ├── attention.py        GQA + RoPE attention layer
│   ├── sentinel.py         Expert router
│   ├── experts.py          12 expert modules
│   └── base_expert.py      Shared expert base class
├── memory/
│   └── json_library.py     Persistent memory system
├── sandbox/
│   └── terminal.py         Docker-based execution
└── training/
    └── grpo.py             GRPO trainer
configs/                    YAML configs (model, training, memory, sandbox)
scripts/                    Training and inference scripts
tests/                      Unit tests
```

---

## Requirements

- Python 3.10+
- PyTorch 2.5+
- CUDA recommended for inference at full precision
- Docker (optional, for sandbox execution)

```bash
pip install -r requirements.txt
```

---

## Variants

A **15B variant** config is available at `configs/model_15b.yaml` (dim 6144, 48 layers, 16 experts).