How to use with the llama-cpp-python library
# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="clemsail/micro-kiki-v3",
	filename="micro-kiki-v3-Q4_K_M.gguf",
)
llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)
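create_chat_completion returns an OpenAI-style completion dict, so the generated text from the snippet above can be read back like this:

# Capture the return value of the call above and print only the assistant's reply
response = llm.create_chat_completion(
	messages=[{"role": "user", "content": "What is the capital of France?"}]
)
print(response["choices"][0]["message"]["content"])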

micro-kiki

35-domain expert model built on Qwen3.5-35B-A3B (MoE, 256 experts, 3B active/token) with LoRA adapters and a cognitive layer (memory palace + negotiator + anti-bias).

Model Description

micro-kiki is a multi-domain language model designed for technical applications spanning electronics, firmware, CAD, manufacturing, and general-purpose conversation. It uses a router-based architecture that selects up to 4 domain-specific LoRA stacks per request.
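As a purely illustrative sketch of what top-4 domain routing can look like (the classifier, scores, and helper below are invented for the example and are not the actual router implementation):

# Hypothetical top-4 routing over per-domain LoRA stacks (illustrative only).
MAX_ACTIVE_STACKS = 4  # the card states at most 4 stacks are active per request

def route(prompt, score_domains):
    """Return the highest-scoring domain names for a prompt."""
    scores = score_domains(prompt)  # e.g. {"python": 0.61, "spice": 0.22, ...}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:MAX_ACTIVE_STACKS]

# Dummy classifier standing in for the real 35-class router
dummy_classifier = lambda p: {"python": 0.6, "electronics": 0.2, "chat-fr": 0.1,
                              "spice": 0.06, "math": 0.04}
print(route("Write a Python parser for a SPICE netlist", dummy_classifier))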

| Property | Value |
|---|---|
| Base model | Qwen3.5-35B-A3B |
| Architecture | MoE (256 experts, 3B active/token) |
| Adapter | LoRA rank 16 (q/k/v/o projections) |
| Domains | 35 |
| Max active stacks | 4 |
| Context length | 262,144 tokens |
| Quantization | Q4_K_M (inference), BF16 (training) |
| License | Apache 2.0 |

Architecture

                         +-------------------+
                         |   Domain Router   |
                         | (classifier, top4)|
                         +--------+----------+
                                  |
              +----------+--------+--------+----------+
              |          |                 |          |
         +----v----+ +---v---+       +----v----+ +---v---+
         | Stack 1 | |Stack 2|  ...  |Stack 34 | |Stack35|
         | chat-fr | |python |       |ml-train | |securi.|
         +---------+ +-------+       +---------+ +-------+
              |          |                 |          |
              +----------+--------+--------+----------+
                                  |
                         +--------v----------+
                         |    Negotiator     |
                         | CAMP + Catfish    |
                         +--------+----------+
                                  |
                         +--------v----------+
                         |    Anti-Bias      |
                         | KnowBias + RBD    |
                         +--------+----------+
                                  |
                         +--------v----------+
                         |   Aeon Memory     |
                         | Atlas + Trace     |
                         +-------------------+
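The diagram can be read as a sequential post-processing chain. A minimal stand-in in code, assuming invented interfaces for the Negotiator, Anti-Bias, and Aeon components (their real APIs are not published here):

# Stand-in for the router -> stacks -> negotiator -> anti-bias -> memory flow.
# Every function below is a placeholder, not the real implementation.
def negotiate(candidates):
    # pick one answer from the active stacks (placeholder policy)
    return max(candidates, key=len)

def debias(text):
    # placeholder for KnowBias + RBD checks
    return text

memory_trace = []  # placeholder for Aeon Atlas/Trace storage

def answer(stack_outputs):
    result = debias(negotiate(stack_outputs))
    memory_trace.append(result)  # trace stored for later recall
    return result

print(answer(["Paris.", "The capital of France is Paris."]))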

Intended Use

  • French/English conversational AI with domain expertise
  • Code generation (Python, C/C++, Rust, TypeScript, embedded firmware)
  • Electronics design (KiCad DSL, schematic review, component selection, SPICE)
  • Manufacturing (process optimization, quality control)
  • Multi-domain routing with cognitive arbitration

Limitations

  • Not designed for medical, legal, or financial advice
  • Optimized for technical domains; general knowledge may be weaker than base model
  • Requires Q4_K_M or higher quantization; quality degrades below Q4
  • Maximum 4 concurrent LoRA stacks; performance varies with stack combinations
  • Memory (Aeon) requires external backends (Qdrant/Neo4j) for production use
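Regarding the last limitation, a production memory backend is an external vector store. A minimal Qdrant setup with the qdrant-client package might look like the following (the collection name and vector size are placeholders, not Aeon's actual schema):

# Placeholder Qdrant collection for an external memory backend.
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

client = QdrantClient(url="http://localhost:6333")  # or QdrantClient(":memory:") for tests
client.create_collection(
    collection_name="aeon_memory",                  # placeholder name
    vectors_config=VectorParams(size=1024, distance=Distance.COSINE),
)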

Training Data — V3 (489K examples, 35 domains)

Sources

| Source | Examples | Description |
|---|---|---|
| Claude CLI sessions | 50,116 | Real user-tool interactions extracted from 5 machines (GrosMac, kxkm-ai, Studio, Tower, CILS) |
| Codex/Copilot sessions | 2,529 | OpenAI Codex + GitHub Copilot sessions extracted from 4 machines |
| HuggingFace datasets | 364,045 | 19 open datasets (see below) |
| Opus teacher distillation | | chat-fr, reasoning domains |
| Original curated | | 32 domain seed datasets |

HuggingFace Datasets

| Dataset | Examples | License |
|---|---|---|
| CodeFeedback-Filtered-Instruction | 157,000 | Apache 2.0 |
| French-Alpaca-Instruct-110K | 110,000 | Apache 2.0 |
| Electronics StackExchange | 95,000 | CC-BY-SA-3.0 |
| CJJones/LLM_EE_Educational_Synthetic_Dialog | 50,000 | CC-BY-NC-SA-4.0 |
| MuratKomurcu/stm32-hal-dataset | 29,700 | MIT |
| redcathode/thingiverse-openscad | 7,400 | |
| ThomasTheMaker/OpenSCAD | 4,900 | |
| STEM-AI-mtl/Electrical-engineering | 1,100 | |
| JITX open-components-database | 151 | |
| Vrindarani/netlistgen | 106 | |

35 Domains

| Group | Domains |
|---|---|
| Conversation | chat-fr, reasoning |
| Code | python, typescript, cpp, rust, html-css, shell, sql, yaml-json, lua-upy |
| Infrastructure | docker, devops, llm-orch, llm-ops (NEW), ml-training (NEW) |
| Electronics | kicad-dsl, kicad-pcb, spice, electronics, components (NEW), power, emc, dsp |
| Hardware | embedded, stm32, iot, platformio |
| CAD | freecad |
| Web | web-frontend, web-backend |
| Other | music-audio, math, security |

Changes from V2: 3 new domains (components, llm-ops, ml-training). spice-sim merged into spice. stm32 is a sub-category of embedded.

New Domain: components

57K Q&A pairs covering electronic component specs, datasheets, sourcing, BOMs, and cross-referencing. Sources: Electronics StackExchange (filtered by component tags) + JITX open-components-database.
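As a rough illustration of that tag-based filtering (the field names and tag list are hypothetical; the real extraction pipeline is not published):

# Hypothetical tag filter over Stack Exchange records; fields are invented.
COMPONENT_TAGS = {"components", "datasheet", "bom", "part-identification", "sourcing"}

def is_component_question(record):
    """Keep Q&A records whose tags overlap the component-related tag set."""
    return bool(COMPONENT_TAGS & set(record.get("tags", [])))

examples = [
    {"title": "How do I read an op-amp datasheet?", "tags": ["datasheet", "op-amp"]},
    {"title": "Why is my LED dim?", "tags": ["led", "resistors"]},
]
print([r["title"] for r in examples if is_component_question(r)])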

Training — V3

| Property | Value |
|---|---|
| Base model | Qwen3.5-4B |
| Adapter | MoE-LoRA: 4 experts/projection, rank 16, top-2 routing |
| Null-space projection | ENABLED (prevents catastrophic forgetting between stacks) |
| Curriculum | Sequential, 35 stacks trained in order |
| Platform (MLX) | Mac Studio M3 Ultra 512 GB |
| Platform (CUDA) | kxkm-ai RTX 4090 24 GB |
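For reference, a plain PEFT LoraConfig matching the rank and target projections listed above would look like this. Note that the MoE-LoRA expert routing (4 experts, top-2) and the null-space projection are custom mechanisms a standard config does not express, and the alpha/dropout values below are assumptions, not values stated in this card:

from peft import LoraConfig

# Plain-LoRA reference config; does NOT capture MoE-LoRA routing or null-space projection.
lora_config = LoraConfig(
    r=16,                                                     # rank from the table above
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # q/k/v/o projections
    lora_alpha=32,                                            # assumed, not stated in the card
    lora_dropout=0.05,                                        # assumed, not stated in the card
    task_type="CAUSAL_LM",
)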

Evaluation

| Metric | Value |
|---|---|
| Router accuracy (35-class) | [PENDING] |
| Forgetting check (angle) | [PENDING] |
| Perplexity (base) | [PENDING] |
| Perplexity (debiased) | [PENDING] |
| Aeon recall@1 | [PENDING] |
| Aeon recall@5 | [PENDING] |
| Aeon recall@10 | [PENDING] |
| Anti-bias flag rate | [PENDING] |
| Average inference latency | [PENDING] |

Hardware Requirements

| Setup | RAM/VRAM | Use |
|---|---|---|
| Mac Studio M3 Ultra | 512 GB unified | Training (BF16 LoRA) + serving (MLX) |
| RTX 4090 | 24 GB VRAM | Q4 inference (vLLM) |
| Apple Silicon 32 GB+ | 32 GB unified | Q4_K_M inference (MLX/llama.cpp) |
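On memory-constrained setups it can help to cap the context window and offload layers explicitly when loading the Q4_K_M file. A llama-cpp-python example (the values are illustrative, not tuned recommendations):

from llama_cpp import Llama

# Illustrative loading parameters for memory-constrained hardware.
llm = Llama.from_pretrained(
	repo_id="clemsail/micro-kiki-v3",
	filename="micro-kiki-v3-Q4_K_M.gguf",
	n_ctx=8192,        # well below the 262,144-token maximum, to reduce memory use
	n_gpu_layers=-1,   # offload all layers to GPU/Metal when available
)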

Citation

@misc{micro-kiki-2026,
  title={micro-kiki: Multi-Domain Expert Model with Cognitive Layer},
  author={L'Electron Rare},
  year={2026},
  url={https://huggingface.co/electron-rare/micro-kiki}
}

Related Projects & Ecosystem

micro-kiki-v3 is one component of the FineFab platform built by L'Électron Rare — a local-first, multi-machine AI-native manufacturing and electronics platform.

| Role | Project | Description |
|---|---|---|
| Training toolkit | L-electron-Rare/KIKI-Mac_tunner | MLX fine-tuning toolkit (Mac Studio) — Opus reasoning distilled into Mistral Large 123B |
| Fine-tuning pipeline | L-electron-Rare/KIKI-models-tuning | FineFab fine-tuning pipeline — training, evaluation, registry (Unsloth, LoRA) |
| Methodology | electron-rare/Kill_LIFE | Spec-first agentic methodology for embedded systems — BMAD agents, gates, evidence packs |
| Orchestration | electron-rare/mascarade | Multi-machine agentic LLM orchestration — P2P mesh, 8 providers, RAG pipeline |
| AI backend | L-electron-Rare/life-core | FineFab AI backend — LLM router, RAG, caching, orchestration |
| CAD assistant | electron-rare/KiC-AI | AI-powered PCB design assistant for KiCad |

See the full org at github.com/L-electron-Rare — 13 public repos covering platform, hardware, firmware, CAD, and ML.

Infrastructure: the 50K+ Claude CLI examples in the training dataset were captured on our 5-node P2P mesh — GrosMac (Apple M5), Tower (28 threads), CILS (i7), KXKM-AI (RTX 4090), VM bootstrap. The mesh uses Ed25519 authentication and DHT discovery.

🇪🇺 EU AI Act transparency

This adapter is provided as a fine-tuned LoRA under the AI Act framework (Regulation EU 2024/1689). Compliance metadata:

| Field | Value |
|---|---|
| Provider | L'Électron Rare (clemsail / electron-rare) |
| Role under AI Act | GPAI provider for this adapter |
| Base model | Qwen/Qwen3.5-35B-A3B — see upstream provenance |
| Adapter type | LoRA / PEFT — adapter weights only; base unchanged |
| Training data origin | L'Électron Rare proprietary technical corpus + curated public docs |
| License | Apache-2.0 (adapter). Upstream base licence applies separately. |
| Intended use | Multi-domain technical assistance — engineering, KiCad, embedded, code, FR/EN chat |
| Out of scope | Healthcare diagnosis, legal advice, autonomous safety-critical decisions, generation of malicious code |
| Risk classification | Limited risk — Article 50 transparency obligations apply |
| Copyright respect | Training data does not include scraped copyrighted material. Opt-out signals (robots.txt, ai.txt) are honoured for web-sourced data. |
| Full provenance | https://github.com/L-electron-Rare/eu-kiki/tree/main/docs/provenance |
| Contact | postmaster@saillant.cc — biased output reports, copyright concerns, etc. |

⚠️ You are using an AI model. Outputs may be inaccurate, biased or fabricated. Do not act on them without independent verification, especially in regulated domains.
