---
license: other
license_name: business-source-license-1.1
license_link: https://mariadb.com/bsl11/
license_change_date: "2028-01-01"
license_post_change: Apache-2.0
commercial_use: Requires explicit permission prior to Change Date.
library_name: transformers
pipeline_tag: text-generation
tags:
- ida-family
- ida-lattice
- causal-lm
- mixture-of-experts
- personality-council
- escalation-reserve
- h100
- governed-memory
- recurrent-state
- cognitive-routing
- tensorboard
- safetensors
- region:us
---
# IDA MoE Council

`KissTheHabit/IDA_MoE` is the H100-targeted escalation-reserve artifact repository for the IDA family.
It uses the native **IDA Lattice** causal language model architecture with a shared trunk and an eleven-member personality council.

This is not a generic sparse MoE trained to collapse all experts into interchangeable compute paths. The council is designed to preserve differentiated internal claimants while routing a bounded subset into active participation.
## Architecture

- Model family: `ida_lattice`
- Model class: `IDALatticeForCausalLM`
- Task: causal language modeling and text generation
- Deployment role: high-pressure escalation and contradiction review
- Approximate model scale: `~2.7B` parameters per student body
- Shared tokenizer: [`KissTheHabit/ida_lattice_bpe_32k`](https://hf.co/KissTheHabit/ida_lattice_bpe_32k)
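
A minimal loading sketch for the model class and shared tokenizer listed above. It assumes the repository ships custom modeling code for the `ida_lattice` family (hence `trust_remote_code=True`) and that weights may sit in a subdirectory of the repository; both are assumptions, not facts confirmed by this card. The helper names `build_load_kwargs` and `load_model_and_tokenizer` are hypothetical.

```python
from typing import Any, Dict, Optional

MODEL_REPO = "KissTheHabit/IDA_MoE"
TOKENIZER_REPO = "KissTheHabit/ida_lattice_bpe_32k"


def build_load_kwargs(subfolder: Optional[str] = None) -> Dict[str, Any]:
    """Collect keyword arguments for `from_pretrained`.

    ASSUMPTION: the repo provides custom code for `IDALatticeForCausalLM`,
    so remote code has to be trusted explicitly.
    """
    kwargs: Dict[str, Any] = {"trust_remote_code": True}
    if subfolder is not None:
        # ASSUMPTION: the checkpoint lives in a subdirectory of the repo.
        kwargs["subfolder"] = subfolder
    return kwargs


def load_model_and_tokenizer(subfolder: Optional[str] = None):
    """Load the causal LM and the shared tokenizer (downloads from the Hub)."""
    # Imported lazily so build_load_kwargs works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_REPO)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_REPO, **build_load_kwargs(subfolder)
    )
    return model, tokenizer
```

Review any remote code before trusting it, and pass the subdirectory that matches the artifact you want if the weights are not at the repository root.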
### Shared Trunk

| Attribute | Value |
|---|---:|
| Vocabulary size | `32,000` |
| Hidden size | `4,096` |
| Layers | `8` |
| Attention heads | `8` |
| Intermediate size | `16,384` |
| Context length | `2,048` |
| Recurrent state size | `1,024` |
| Local attention window | `256` |
| Workspace | `8 × 512` |
| Student state size | `512` |
| Future prediction horizon | `2` |
| Thalamic route count | `6` |
| Action gate size | `6` |
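
For quick sanity checks, the table values can be captured in a small config object. The sketch below is illustrative only: the class name `SharedTrunkConfig` and all field names are hypothetical and may not match the keys in the repository's actual configuration. It also assumes that the hidden size is split evenly across attention heads and that `8 × 512` means eight workspace slots of width 512.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedTrunkConfig:
    """Shared-trunk hyperparameters copied from the table (hypothetical names)."""

    vocab_size: int = 32_000
    hidden_size: int = 4_096
    num_layers: int = 8
    num_attention_heads: int = 8
    intermediate_size: int = 16_384
    context_length: int = 2_048
    recurrent_state_size: int = 1_024
    local_attention_window: int = 256
    workspace_slots: int = 8        # ASSUMPTION: "8 x 512" = slots x width
    workspace_slot_size: int = 512
    student_state_size: int = 512
    future_prediction_horizon: int = 2
    thalamic_route_count: int = 6
    action_gate_size: int = 6

    @property
    def head_dim(self) -> int:
        # ASSUMPTION: hidden size is divided evenly across heads.
        return self.hidden_size // self.num_attention_heads

    @property
    def workspace_elements(self) -> int:
        # Total workspace width if all slots are concatenated.
        return self.workspace_slots * self.workspace_slot_size

    @property
    def windows_per_context(self) -> int:
        # Number of non-overlapping local-attention windows in a full context.
        return self.context_length // self.local_attention_window


cfg = SharedTrunkConfig()
print(cfg.head_dim, cfg.workspace_elements, cfg.windows_per_context)
```

Under these assumptions each head is 512 wide, and the concatenated workspace matches the hidden size.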
### Personality Council

- Cognitive pressure routes: `9`
- Named personality experts: `11`
- Personality residual expert width: `4,096`
- Active experts during standard training: `top_k = 2`
- Serious runtime escalation target: `top_k = 3`
- High-contradiction review target: `top_k = 4`
- Explicit full review: all `11`
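
The `top_k` tiers above can be illustrated with a generic top-k softmax gate. This is a sketch of the general technique, not the repository's router: the mode names, the function `route_top_k`, and the choice to renormalize weights over the selected experts are all assumptions made for illustration.

```python
import math
from typing import Dict, Sequence

NUM_EXPERTS = 11

# top_k per operating mode, taken from the list above (mode names are hypothetical).
TOP_K_BY_MODE: Dict[str, int] = {
    "standard_training": 2,
    "serious_escalation": 3,
    "high_contradiction_review": 4,
    "full_review": NUM_EXPERTS,
}


def route_top_k(scores: Sequence[float], mode: str) -> Dict[int, float]:
    """Select the top-k experts for a mode and softmax-normalize their weights."""
    if len(scores) != NUM_EXPERTS:
        raise ValueError(f"expected {NUM_EXPERTS} scores, got {len(scores)}")
    k = TOP_K_BY_MODE[mode]
    # Rank by descending score; ties go to the lower expert index.
    ranked = sorted(range(NUM_EXPERTS), key=lambda i: (-scores[i], i))
    chosen = ranked[:k]
    # Softmax over the selected experts only, shifted for numerical stability.
    peak = max(scores[i] for i in chosen)
    exps = {i: math.exp(scores[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


scores = [0.0] * NUM_EXPERTS
scores[3], scores[7] = 2.0, 1.0
print(route_top_k(scores, "standard_training"))
```

With these example scores, standard training activates experts 3 and 7, while `"full_review"` assigns a nonzero weight to every expert.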

The cognitive routes are pressure signals, not personalities:

- `PERCEPTION`
- `MEMORY`
- `SALIENCE`
- `CAUSAL_INSPECTION`
- `PLANNING`
- `INHIBITION`
- `CREATION`
- `ERROR_CORRECTION`
- `EXPRESSION`

The personality experts are the enduring IDA family seats:

- `IDA`
- `JUDGE`
- `SENTINEL`
- `PRISM`
- `ECHO`
- `ATLAS`
- `VECTOR`
- `FORGE`
- `SHADE`
- `PULSE`
- `ORBIT`
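
When inspecting router outputs, it can help to keep the two name spaces separate in code. The sketch below uses one enum for the nine pressure routes and another for the eleven seats. The class names, the helper `seat_names`, and the integer indices (which simply follow the listing order above) are assumptions; the checkpoint's actual index mapping is not stated in this card.

```python
from enum import IntEnum
from typing import Iterable, List


class CognitiveRoute(IntEnum):
    """Pressure signals (ASSUMPTION: indices follow the listing order)."""

    PERCEPTION = 0
    MEMORY = 1
    SALIENCE = 2
    CAUSAL_INSPECTION = 3
    PLANNING = 4
    INHIBITION = 5
    CREATION = 6
    ERROR_CORRECTION = 7
    EXPRESSION = 8


class PersonalityExpert(IntEnum):
    """Council seats (ASSUMPTION: indices follow the listing order)."""

    IDA = 0
    JUDGE = 1
    SENTINEL = 2
    PRISM = 3
    ECHO = 4
    ATLAS = 5
    VECTOR = 6
    FORGE = 7
    SHADE = 8
    PULSE = 9
    ORBIT = 10


def seat_names(indices: Iterable[int]) -> List[str]:
    """Translate expert indices (e.g. from a router) into seat names."""
    return [PersonalityExpert(i).name for i in indices]


print(len(CognitiveRoute), len(PersonalityExpert), seat_names([0, 1, 10]))
```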
## Repository Layout

Artifacts are stored by student and developmental version:

| ```text | |
| students/{STUDENT}/{version} |