KrystalMind Cortex (0.6B JSON Planner)
Structured JSON action planner for the L4 intelligence pipeline. Routes divergence signals to appropriate data sources.
Architecture
- Base: Qwen3-0.6B-Instruct (28 layers, d_model=1024, 16Q/8KV heads)
- LoRA SFT: rank=16, scale=1.0, 1000 iters on routing_sft_v1 (14,836 examples)
- Output: Structured JSON plan (db_queries, ecs_metrics, ssm_context, simulation)
- Decoding: Greedy (T=0.0) with JSON-aware brace-depth stop
Files
krystalball_cortex.kbโ Baked production model (571 MB, FP32)fused/model.safetensorsโ Full fused Qwen3-0.6B + LoRA (1.1 GB)adapters/adapters.safetensorsโ LoRA adapters only (10 MB)tokenizer.jsonโ Qwen3 tokenizer
Performance
- INT8: ~86 tok/s, BF16: ~53 tok/s
- Typical plan: 150 tokens greedy @ ~1.75s
- 3-tier parse robustness: Direct JSON โ repair โ deterministic fallback
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐ Ask for provider support