---
license: other
license_name: business-source-license-1.1
license_link: https://mariadb.com/bsl11/
license_change_date: "2028-01-01"
license_post_change: Apache-2.0
commercial_use: Requires explicit permission prior to Change Date.
library_name: transformers
pipeline_tag: text-generation
tags:
- ida-family
- ida-lattice
- causal-lm
- mixture-of-experts
- personality-council
- escalation-reserve
- h100
- governed-memory
- recurrent-state
- cognitive-routing
- tensorboard
- safetensors
- region:us
---
# IDA MoE Council
`KissTheHabit/IDA_MoE` is the H100-targeted escalation-reserve artifact repository for the IDA family.
It uses the native **IDA Lattice** causal language model architecture with a shared trunk and an eleven-member personality council.
This is not a generic sparse MoE whose experts are trained toward interchangeable compute paths. The council is designed to preserve differentiated internal claimants while routing only a bounded subset of them into active participation.
## Architecture
- Model family: `ida_lattice`
- Model class: `IDALatticeForCausalLM`
- Task: causal language modeling and text generation
- Deployment role: high-pressure escalation and contradiction review
- Model scale: `~2.7B` parameters per student body
- Shared tokenizer: [`KissTheHabit/ida_lattice_bpe_32k`](https://hf.co/KissTheHabit/ida_lattice_bpe_32k)
### Shared Trunk
| Attribute | Value |
|---|---:|
| Vocabulary size | `32,000` |
| Hidden size | `4,096` |
| Layers | `8` |
| Attention heads | `8` |
| Intermediate size | `16,384` |
| Context length | `2,048` |
| Recurrent state size | `1,024` |
| Local attention window | `256` |
| Workspace | `8 × 512` |
| Student state size | `512` |
| Future prediction horizon | `2` |
| Thalamic route count | `6` |
| Action gate size | `6` |
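
The trunk values above imply a few derived quantities that are easy to sanity-check. The sketch below collects the table in a plain Python dataclass and computes them. `TrunkConfig` and all of its field names are hypothetical and are not taken from the actual `IDALatticeForCausalLM` configuration. Reading the workspace as 8 slots of width 512, and splitting the hidden size evenly across heads, are also assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrunkConfig:
    """Hypothetical container for the shared-trunk values in the table above."""

    vocab_size: int = 32_000
    hidden_size: int = 4_096
    num_layers: int = 8
    num_attention_heads: int = 8
    intermediate_size: int = 16_384
    context_length: int = 2_048
    recurrent_state_size: int = 1_024
    local_attention_window: int = 256
    workspace_slots: int = 8  # assumed reading of "8 × 512"
    workspace_width: int = 512
    student_state_size: int = 512
    future_prediction_horizon: int = 2
    thalamic_route_count: int = 6
    action_gate_size: int = 6

    @property
    def head_dim(self) -> int:
        # Per-head width, assuming an even split of the hidden size.
        return self.hidden_size // self.num_attention_heads

    @property
    def ffn_expansion(self) -> int:
        # Ratio of intermediate size to hidden size.
        return self.intermediate_size // self.hidden_size

    @property
    def context_to_window_ratio(self) -> int:
        # How many local-attention windows fit end to end in one context.
        return self.context_length // self.local_attention_window


cfg = TrunkConfig()
print(cfg.head_dim, cfg.ffn_expansion, cfg.context_to_window_ratio)
```

Under these assumptions the per-head width is 512, the intermediate size is 4 times the hidden size, and the context length is 8 times the local attention window.
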
### Personality Council
- Cognitive pressure routes: `9`
- Named personality experts: `11`
- Personality residual expert width: `4,096`
- Active experts during standard training: `top_k = 2`
- Serious runtime escalation target: `top_k = 3`
- High-contradiction review target: `top_k = 4`
- Explicit full review: all `11`
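
The four `top_k` settings above can be illustrated with a minimal top-k selection over eleven expert scores. This is a sketch, not the IDA router: the tier labels and the `route_top_k` function are hypothetical names, and renormalizing with a softmax over only the selected experts is a common sparse-MoE convention assumed here for illustration. The routing granularity (per token or per sequence) is not specified in this card, so the example routes a single score vector.

```python
import math

NUM_EXPERTS = 11

# Tier names are hypothetical labels for the four settings listed above.
TOP_K_BY_TIER = {
    "standard_training": 2,
    "serious_escalation": 3,
    "high_contradiction_review": 4,
    "full_review": NUM_EXPERTS,
}


def route_top_k(logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    if len(logits) != NUM_EXPERTS:
        raise ValueError(f"expected {NUM_EXPERTS} logits, got {len(logits)}")
    if not 1 <= k <= NUM_EXPERTS:
        raise ValueError(f"k must be in [1, {NUM_EXPERTS}]")
    # Sort indices by descending logit; ties keep the lower index first.
    top = sorted(range(NUM_EXPERTS), key=lambda i: -logits[i])[:k]
    peak = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Made-up scores for one routing decision.
logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3, 1.0, -2.0, 0.2]
for tier, k in TOP_K_BY_TIER.items():
    active = route_top_k(logits, k)
    print(tier, [i for i, _ in active])
```

Raising `top_k` only extends the active set in score order, so each escalation tier includes every expert that was active in the tier below it.
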
The cognitive routes are pressure signals, not personalities:
- `PERCEPTION`
- `MEMORY`
- `SALIENCE`
- `CAUSAL_INSPECTION`
- `PLANNING`
- `INHIBITION`
- `CREATION`
- `ERROR_CORRECTION`
- `EXPRESSION`
The personality experts are the enduring IDA family seats:
- `IDA`
- `JUDGE`
- `SENTINEL`
- `PRISM`
- `ECHO`
- `ATLAS`
- `VECTOR`
- `FORGE`
- `SHADE`
- `PULSE`
- `ORBIT`
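
To keep the two vocabularies separate in downstream code, the route and seat names can be held in two distinct enumerations. The sketch below is illustrative only: the class names and the `label_seats` helper are hypothetical, and the integer indices simply follow the order in which the names are listed above, which may not match the indices used inside the model.

```python
from enum import IntEnum


class CognitiveRoute(IntEnum):
    """Pressure signals. Index order is assumed from the list above."""

    PERCEPTION = 0
    MEMORY = 1
    SALIENCE = 2
    CAUSAL_INSPECTION = 3
    PLANNING = 4
    INHIBITION = 5
    CREATION = 6
    ERROR_CORRECTION = 7
    EXPRESSION = 8


class PersonalitySeat(IntEnum):
    """Council experts. Index order is assumed from the list above."""

    IDA = 0
    JUDGE = 1
    SENTINEL = 2
    PRISM = 3
    ECHO = 4
    ATLAS = 5
    VECTOR = 6
    FORGE = 7
    SHADE = 8
    PULSE = 9
    ORBIT = 10


def label_seats(indices):
    """Map expert indices to seat names under the assumed ordering."""
    return [PersonalitySeat(i).name for i in indices]


print(len(CognitiveRoute), len(PersonalitySeat))
print(label_seats([1, 4]))
```

Using separate types makes it harder to pass a route index where a seat index is expected, which mirrors the distinction between pressure signals and personalities.
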
## Repository Layout
Artifacts are stored by student and developmental version:
```text
students/{STUDENT}/{version}