	---
	license: other
	license_name: business-source-license-1.1
	license_link: https://mariadb.com/bsl11/
	license_change_date: "2028-01-01"
	license_post_change: Apache-2.0
	commercial_use: Requires explicit permission prior to Change Date.
	library_name: transformers
	pipeline_tag: text-generation
	tags:
	- ida-family
	- ida-lattice
	- causal-lm
	- mixture-of-experts
	- personality-council
	- escalation-reserve
	- h100
	- governed-memory
	- recurrent-state
	- cognitive-routing
	- tensorboard
	- safetensors
	- region:us
	---

	# IDA MoE Council

	`KissTheHabit/IDA_MoE` is the H100-targeted escalation-reserve artifact repository for the IDA family.

	It uses the native IDA Lattice causal language model architecture with a shared trunk and an eleven-member personality council.

	This is not a generic sparse MoE whose experts collapse into interchangeable compute paths. The council is designed to keep its members as differentiated internal claimants while routing only a bounded subset of them into active participation.

	## Architecture

	- Model family: `ida_lattice`
	- Model class: `IDALatticeForCausalLM`
	- Task: causal language modeling and text generation
	- Deployment role: high-pressure escalation and contradiction review
	- Approximate model scale: `~2.7B` parameters per student body
	- Shared tokenizer: [`KissTheHabit/ida_lattice_bpe_32k`](https://hf.co/KissTheHabit/ida_lattice_bpe_32k)

	### Shared Trunk

	| Attribute | Value |
	|---|---:|
	| Vocabulary size | `32,000` |
	| Hidden size | `4,096` |
	| Layers | `8` |
	| Attention heads | `8` |
	| Intermediate size | `16,384` |
	| Context length | `2,048` |
	| Recurrent state size | `1,024` |
	| Local attention window | `256` |
	| Workspace | `8 × 512` |
	| Student state size | `512` |
	| Future prediction horizon | `2` |
	| Thalamic route count | `6` |
	| Action gate size | `6` |
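
The trunk values above imply a few derived quantities that are easy to sanity-check. The following is a minimal sketch, not the repository's configuration class: `TrunkConfig` and its field names are hypothetical, the per-head dimension assumes the usual even split of the hidden size across heads, and the workspace total assumes `8 × 512` means 8 slots of width 512.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrunkConfig:
    """Hypothetical container for the shared-trunk values listed in the table."""

    vocab_size: int = 32_000
    hidden_size: int = 4_096
    num_layers: int = 8
    num_attention_heads: int = 8
    intermediate_size: int = 16_384
    context_length: int = 2_048
    recurrent_state_size: int = 1_024
    local_attention_window: int = 256
    workspace_slots: int = 8        # assumed reading of "8 x 512"
    workspace_width: int = 512      # assumed reading of "8 x 512"
    student_state_size: int = 512
    future_prediction_horizon: int = 2
    thalamic_route_count: int = 6
    action_gate_size: int = 6

    @property
    def head_dim(self) -> int:
        # Assumes the hidden size is split evenly across attention heads.
        if self.hidden_size % self.num_attention_heads:
            raise ValueError("hidden_size must be divisible by num_attention_heads")
        return self.hidden_size // self.num_attention_heads

    @property
    def workspace_elements(self) -> int:
        # Total workspace size under the slots-times-width assumption.
        return self.workspace_slots * self.workspace_width

    @property
    def windows_per_context(self) -> int:
        # Number of non-overlapping local windows that tile one full context.
        return self.context_length // self.local_attention_window


cfg = TrunkConfig()
print(cfg.head_dim, cfg.workspace_elements, cfg.windows_per_context)
```

Under these assumptions each attention head is 512 wide, the workspace holds 4,096 values in total, and eight local windows cover the full context.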

	### Personality Council

	- Cognitive pressure routes: `9`
	- Named personality experts: `11`
	- Personality residual expert width: `4,096`
	- Active experts during standard training: `top_k = 2`
	- Serious runtime escalation target: `top_k = 3`
	- High-contradiction review target: `top_k = 4`
	- Explicit full review: all `11`
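
The `top_k` tiers above can be illustrated with generic top-k softmax gating. This is a sketch under stated assumptions, not the repository's router: the tier names, the `route` function, and the renormalization over selected experts are all hypothetical choices made for illustration.

```python
import math

NUM_EXPERTS = 11

# top_k per tier, taken from the list above; the tier names are hypothetical.
TOP_K_BY_TIER = {
    "standard_training": 2,
    "serious_escalation": 3,
    "high_contradiction": 4,
    "full_review": NUM_EXPERTS,
}


def route(logits, tier):
    """Return {expert_index: weight} for the top_k experts of the given tier.

    Generic top-k softmax gating, used here only as an illustration.
    """
    if len(logits) != NUM_EXPERTS:
        raise ValueError(f"expected {NUM_EXPERTS} logits, got {len(logits)}")
    k = TOP_K_BY_TIER[tier]
    # Highest logits first; ties resolve to the lower expert index.
    chosen = sorted(range(NUM_EXPERTS), key=lambda i: (-logits[i], i))[:k]
    # Softmax over the selected logits only, shifted for numerical stability.
    peak = max(logits[i] for i in chosen)
    exps = {i: math.exp(logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3, 1.0, -2.0, 0.2]
weights = route(logits, "serious_escalation")
print(sorted(weights))  # indices of the three active experts
```

Raising the tier only widens the selected set; the weights are always renormalized over whichever experts are active, so they sum to 1 at every tier.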

	The cognitive routes are pressure signals, not personalities:

	- `PERCEPTION`
	- `MEMORY`
	- `SALIENCE`
	- `CAUSAL_INSPECTION`
	- `PLANNING`
	- `INHIBITION`
	- `CREATION`
	- `ERROR_CORRECTION`
	- `EXPRESSION`

	The personality experts are the enduring IDA family seats:

	- `IDA`
	- `JUDGE`
	- `SENTINEL`
	- `PRISM`
	- `ECHO`
	- `ATLAS`
	- `VECTOR`
	- `FORGE`
	- `SHADE`
	- `PULSE`
	- `ORBIT`
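
Because routes and seats are separate axes, it can help to keep them in separate types. The sketch below is illustrative only: the enum names `CognitiveRoute` and `CouncilSeat`, the helper `seat_names`, and the zero-based indices in list order are assumptions, not identifiers or orderings confirmed by the repository.

```python
from enum import IntEnum

# Member names come from the two lists above; index order is assumed.
ROUTE_NAMES = [
    "PERCEPTION", "MEMORY", "SALIENCE", "CAUSAL_INSPECTION", "PLANNING",
    "INHIBITION", "CREATION", "ERROR_CORRECTION", "EXPRESSION",
]
SEAT_NAMES = [
    "IDA", "JUDGE", "SENTINEL", "PRISM", "ECHO", "ATLAS",
    "VECTOR", "FORGE", "SHADE", "PULSE", "ORBIT",
]

# Two distinct types, so a route index cannot be mistaken for a seat index.
CognitiveRoute = IntEnum("CognitiveRoute", ROUTE_NAMES, start=0)
CouncilSeat = IntEnum("CouncilSeat", SEAT_NAMES, start=0)


def seat_names(indices):
    """Translate expert indices into council seat names."""
    return [CouncilSeat(i).name for i in indices]


print(len(CognitiveRoute), len(CouncilSeat))
print(seat_names([1, 4, 8]))
```

The counts match the `9` routes and `11` named experts stated earlier, and the two name sets do not overlap.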

	## Repository Layout

	Artifacts are stored by student and developmental version:

	```text
	students/{STUDENT}/{version}