How to use throsturx/bihmoe-poc with Transformers:
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("throsturx/bihmoe-poc", dtype="auto")
dualbrain-bihmoe-poc
Minimal proof of concept (PoC) for a bilateral hierarchical mixture-of-experts (MoE) model with reconciliation. Goal: determine whether the structure yields an out-of-distribution (OOD) generalization signal relative to a compute-matched dense baseline.
Protocol:
- Deterministic task generation (seeded).
- Side-by-side training (structured S vs dense D_a) from the same data stream.
- Eval every N steps on fixed IID/OOD/structure-break sets.
- Early-kill criteria documented in docs/00_north_star.md.
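The protocol above can be illustrated with a small, self-contained sketch. It is not the repository's training loop: the toy regression task, the `LinearModel` placeholder, the learning rates, and every function name here are assumptions made for illustration. It shows only the control flow: one seeded data stream feeds both arms, and both are evaluated every N steps on sets that are generated once and then held fixed. The structure-break set is left out because its definition is not given here.

```python
import random


def make_stream(seed, ood=False):
    """Yield (x, y) pairs forever; the same seed always gives the same sequence."""
    rng = random.Random(seed)
    # Hypothetical split: IID inputs in [0, 10), OOD inputs in [10, 20).
    lo, hi = (10.0, 20.0) if ood else (0.0, 10.0)
    while True:
        x = rng.uniform(lo, hi)
        yield x, 2.0 * x + 1.0


def fixed_set(seed, n, ood=False):
    """Build an evaluation set once so every eval sees identical examples."""
    stream = make_stream(seed, ood)
    return [next(stream) for _ in range(n)]


class LinearModel:
    """Placeholder arm; the real S and D_a would differ in architecture."""

    def __init__(self, lr):
        self.w, self.b, self.lr = 0.0, 0.0, lr

    def step(self, x, y):
        # One SGD step on squared error for a single example.
        err = self.w * x + self.b - y
        self.w -= self.lr * err * x
        self.b -= self.lr * err

    def mse(self, data):
        return sum((self.w * x + self.b - y) ** 2 for x, y in data) / len(data)


def run(seed=0, steps=200, eval_every=50):
    iid = fixed_set(seed + 1, 32)
    ood = fixed_set(seed + 2, 32, ood=True)
    # Two stand-in arms; the differing learning rates are arbitrary.
    arms = {"S": LinearModel(0.01), "D_a": LinearModel(0.005)}
    stream = make_stream(seed)
    history = []
    for step in range(1, steps + 1):
        x, y = next(stream)  # one example, shared by both arms
        for model in arms.values():
            model.step(x, y)
        if step % eval_every == 0:
            metrics = {name: (m.mse(iid), m.mse(ood)) for name, m in arms.items()}
            history.append((step, metrics))
    return history


if __name__ == "__main__":
    for step, metrics in run():
        print(step, metrics)
```

Because both arms consume the same example at each step and the evaluation sets never change, any gap between the two metric curves can be attributed to the models rather than to data ordering or evaluation noise.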
Reproduce (twohop_bind, Schedule-2+Ramp+BraidMix @ 4k)
Environment (Arch, system torch-cuda; venv uses system site-packages):
uv venv --python /usr/bin/python --system-site-packages
uv sync
Run 3-seed panel with live KEYLINE progress:
scripts/run_panel_4k.sh configs/poc_twohop_sched2_ramp_braid.yaml
Extract KEYLINEs from logs:
scripts/extract_keylines.sh /tmp/bihmoe_s11_4k_*.log
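For readers who want the same extraction from Python, here is a minimal sketch. It assumes that progress lines contain the literal token `KEYLINE`; the actual log format and the behaviour of `scripts/extract_keylines.sh` are not shown here, so the function name and the matching rule are hypothetical.

```python
import glob


def extract_keylines(pattern, marker="KEYLINE"):
    """Return {path: [lines containing marker]} for every file matching pattern."""
    results = {}
    for path in sorted(glob.glob(pattern)):
        # errors="replace" keeps the scan going if a log has stray bytes.
        with open(path, encoding="utf-8", errors="replace") as fh:
            results[path] = [line.rstrip("\n") for line in fh if marker in line]
    return results


if __name__ == "__main__":
    for path, lines in extract_keylines("/tmp/bihmoe_s11_4k_*.log").items():
        print(path)
        for line in lines:
            print("  " + line)
```

Grouping the matches by file keeps each run's lines together, which makes it easier to compare runs side by side.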