metadata
title: MonarchSLM
emoji: π
colorFrom: yellow
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit
tags:
- julia
- lux
- slm
- philosophy
- openai-compatible
- bpe
- monarch-mixer
- rmsnorm
- swiglu
MonarchSLM
A Monarch Mixer decoder-only model (sub-quadratic sequence mixing, RMSNorm, SwiGLU) trained on classical philosophy texts, implemented in Julia with Lux.jl. Serves an OpenAI-compatible API with streaming support.
Endpoints
- GET /: Health check and model info
- GET /v1/models: List available models
- POST /v1/chat/completions: Generate text (supports streaming, top-k, top-p)
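For readers unfamiliar with the top-k and top-p options, the following is a minimal Python sketch of how such filters are commonly applied to a next-token distribution. It is an illustration only: the server itself is written in Julia, and the function names (`softmax`, `filter_top_k_top_p`) and the order of the filters (top-k first, then top-p) are assumptions, not taken from the MonarchSLM source.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, scaled by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def filter_top_k_top_p(probs, top_k=0, top_p=1.0):
    """Return {token_id: prob} after top-k then top-p filtering, renormalized.

    top_k=0 disables the top-k filter; top_p=1.0 disables the top-p filter.
    """
    # Rank token ids by probability, highest first.
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]
    # Keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for token_id, p in ranked:
        kept.append((token_id, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token_id: p / total for token_id, p in kept}
```

A token would then be drawn from the returned dictionary, for example with `random.choices(list(d), weights=list(d.values()))`.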
Usage
# Non-streaming
curl -X POST https://your-space.hf.space/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "the nature of"}], "max_tokens": 200}'
# Streaming
curl -X POST https://your-space.hf.space/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "the nature of"}], "stream": true, "temperature": 0.7, "top_k": 40}'
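The same requests can be made from Python. The sketch below uses only the standard library and assumes the streaming response follows the usual OpenAI convention of server-sent `data:` lines carrying JSON chunks with `choices[].delta.content`, terminated by `data: [DONE]`. That wire format is an assumption based on the "OpenAI-compatible" description; the helper names (`build_payload`, `iter_stream_text`, `stream_chat`) are hypothetical.

```python
import json
import urllib.request


def build_payload(prompt, stream=False, max_tokens=200, temperature=0.7, top_k=40):
    """Build a chat-completions request body like the curl examples above."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_k": top_k,
    }


def iter_stream_text(lines):
    """Yield text deltas from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


def stream_chat(base_url, prompt):
    """POST a streaming request and yield generated text as it arrives."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(build_payload(prompt, stream=True)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        decoded = (raw.decode("utf-8") for raw in resp)
        yield from iter_stream_text(decoded)
```

For example, `for piece in stream_chat("https://your-space.hf.space", "the nature of"): print(piece, end="")` would print text as it is generated, with the placeholder host replaced by a real Space URL.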
Architecture
- Model: ~5M parameters, 256-dimensional embeddings, 8 layers, 8 Monarch heads
- Sequence mixing: Multi-head Monarch Matrix (sub-quadratic) + Causal Depthwise Conv + Learned Gate
- Tokenizer: BPE (2000 tokens)
- Framework: Lux.jl (explicit parameter/state management)
- Normalization: RMSNorm (pre-norm)
- Feed-forward: SwiGLU activation
- Weight tying: Shared embedding/output projection
- Inference: CPU-only, no Lux dependency at runtime (pure NNlib)
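The building blocks listed above can be sketched in a few lines of dependency-free Python. This is an illustration of the general techniques, not the project's Julia code: `rms_norm` and `swiglu` follow the standard definitions, and `monarch_matvec` shows one common Monarch factorization (block-diagonal, transpose, block-diagonal, transpose back) on a vector of length b*b. How MonarchSLM makes this mixing causal, and how it combines it with the depthwise convolution and learned gate, is not described here and is not covered by the sketch. All function and parameter names are hypothetical.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then apply a learned scale."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


def matvec(matrix, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def swiglu(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: w_down @ (silu(w_gate @ x) * (w_up @ x))."""
    gate = [silu(v) for v in matvec(w_gate, x)]
    up = matvec(w_up, x)
    return matvec(w_down, [g * u for g, u in zip(gate, up)])


def monarch_matvec(x, left_blocks, right_blocks):
    """Apply a Monarch-structured matrix to x, where len(x) == b * b.

    left_blocks and right_blocks each hold b dense (b x b) blocks, so the
    cost is about 2 * b**3 = 2 * n**1.5 multiplies instead of n**2.
    """
    b = len(right_blocks)
    assert len(x) == b * b
    # Block-diagonal right factor: block i acts on the i-th chunk of x.
    y = [matvec(right_blocks[i], x[i * b:(i + 1) * b]) for i in range(b)]
    # Permutation: view y as a b x b grid and transpose it.
    z = [[y[i][j] for i in range(b)] for j in range(b)]
    # Block-diagonal left factor: block j acts on the j-th transposed row.
    w = [matvec(left_blocks[j], z[j]) for j in range(b)]
    # Transpose back and flatten.
    return [w[j][i] for i in range(b) for j in range(b)]
```

With identity blocks `monarch_matvec` returns its input unchanged, which is a quick sanity check that the two transposes cancel.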
Environment Variables
- HF_REPO: HuggingFace model repo (default: LisaMegaWatts/MonarchSLM)
- PORT: Server port (default: 7860)
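A launcher or test script that needs the same settings could mirror these defaults as follows. This is a hypothetical Python helper (`load_config` is not part of the project); the server itself is written in Julia and reads the variables on its own.

```python
import os


def load_config(env=None):
    """Read HF_REPO and PORT, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "hf_repo": env.get("HF_REPO", "LisaMegaWatts/MonarchSLM"),
        "port": int(env.get("PORT", "7860")),  # env values are strings
    }
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to test.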