# OO - Operating Organism | Bare-Metal LLM/SSM Inference

The world's first bare-metal UEFI LLM + SSM inference engine: no OS, no libc, no malloc.
Runs Mamba-2.8B (SSM) and LLaMA2 (Transformer) directly on x86_64 hardware via UEFI.
## What's Inside

| File | Description | Size |
|---|---|---|
| `oo-usb-v3-mamba2.8b-x86_64.img.xz` | USB boot image: Mamba-2.8B + GPT-NeoX tokenizer + REPL | ~1.5 GB |
| `llm-baremetal-boot-x86_64.img.xz` | QEMU boot image: stories15M + REPL (no Mamba weights) | ~11 MB |
| `gpt_neox_tokenizer.bin` | GPT-NeoX BPE tokenizer (50,254 tokens) for Mamba-2.8B | 715 KB |
| `KERNEL.EFI` | The bare-metal EFI binary (standalone) | ~27 MB |
## First Real Inference (Mamba-2.8B, bare-metal)

```text
OO> /ssm_infer The meaning of life is
The meaning of life is not a question of what we do, but of what
we are. We are not merely the product of our genes, but of our
choices.
[OOSI-v3] 64 tokens in 48291 ms (1.3 tok/s)
```

No OS. No kernel. No libc. Just UEFI + a custom allocator + SSM math.
## Interactive REPL Commands

```text
/ssm_load <file>      - Load Mamba model (OOSS v3 format)
/ssm_infer <prompt>   - Run SSM inference
/ssm_params           - Show sampling config
/ssm_selftest         - Tokenizer + model + pipeline verification
/temp <0.0-2.0>       - Set temperature
/top_p <0.0-1.0>      - Set nucleus sampling threshold
/rep_penalty <f>      - Set repetition penalty
/max_tokens <n>       - Set max generation tokens
/seed <n>             - Set RNG seed (reproducible output)
/verbose 0|1|2        - Set debug verbosity
/help                 - Full command list
```
## Architecture

```text
┌─────────────────────────────────────────────┐
│ UEFI Firmware (OVMF / real hardware)        │
├─────────────────────────────────────────────┤
│ OO Kernel (zones allocator, sentinel, D+)   │
├──────────────────┬──────────────────────────┤
│ LLaMA2 Engine    │ Mamba SSM Engine         │
│ (GGUF Q4/Q8)     │ (OOSS v3, f32)           │
├──────────────────┴──────────────────────────┤
│ BPE Tokenizer (GPT-NeoX / SentencePiece)    │
├─────────────────────────────────────────────┤
│ Interactive REPL (UTF-8, autorun, journal)  │
└─────────────────────────────────────────────┘
```
- Memory: Custom zone allocator (COLD/WARM/HOT), no malloc
- SSM: Selective State Space Model with precomputed exp(A_log)
- Sampling: Temperature + top-p nucleus + repetition penalty
- Safety: D+ policy engine, Sentinel warden, OO journal
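Zone allocators like the COLD/WARM/HOT one above can be pictured as bump arenas that are reset wholesale at phase boundaries instead of freed piecewise, which is why no `malloc` is needed. A minimal host-testable sketch; the zone names come from this README, but the sizes, alignment, and API are illustrative assumptions (the real engine draws its backing memory from UEFI boot services):

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t *base; size_t cap, used; } zone_t;
enum { ZONE_COLD, ZONE_WARM, ZONE_HOT, ZONE_COUNT };

/* Illustrative backing store; a real kernel would carve UEFI pages instead. */
static _Alignas(16) uint8_t backing[ZONE_COUNT][4096];
static zone_t zones[ZONE_COUNT];

static void zones_init(void) {
    for (int i = 0; i < ZONE_COUNT; i++)
        zones[i] = (zone_t){ backing[i], sizeof backing[i], 0 };
}

/* Bump-allocate n bytes from one zone, 16-byte aligned; NULL when exhausted. */
static void *zalloc(int zone, size_t n) {
    zone_t *z = &zones[zone];
    size_t off = (z->used + 15) & ~(size_t)15;
    if (off + n > z->cap) return NULL;
    z->used = off + n;
    return z->base + off;
}

/* Freeing is wholesale: reset a zone when its phase ends (e.g. per token). */
static void zreset(int zone) { zones[zone].used = 0; }
```

Splitting allocations by lifetime (model weights cold, per-prompt state warm, per-token scratch hot) is what lets a bump pointer replace a general-purpose heap.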
## Usage

### QEMU (quick test, no Mamba weights)

```sh
# Extract
xz -d llm-baremetal-boot-x86_64.img.xz

# Boot
qemu-system-x86_64 \
  -drive if=pflash,format=raw,readonly=on,file=edk2-x86_64-code.fd \
  -drive format=raw,file=llm-baremetal-boot.img \
  -m 2048M -cpu max -accel tcg
```
### Real Hardware (Mamba-2.8B)

```sh
# Extract the USB image
xz -d oo-usb-v3-mamba2.8b-x86_64.img.xz

# Flash with Rufus (GPT partition scheme, UEFI non-CSM target)
# Boot on any UEFI x86_64 machine (8 GB+ RAM recommended)
```
## Technical Details
- Target: x86_64 UEFI (PE/COFF EFI application)
- Language: Freestanding C (no stdlib, no libc)
- Models: Mamba-2.8B (2.7GB, f32), stories15M (Q8_0)
- Tokenizer: GPT-NeoX 50K BPE (byte-level, GPT-2 compatible)
- Build: GNU-EFI toolchain on Linux/WSL
## Source Code
- GitHub: github.com/Djiby-diop/llm-baremetal
- Organization: github.com/LLM-Baremetal
## Author

Djiby Diop - Dakar, Senegal

OO is a research project exploring the frontier of bare-metal AI inference: running neural networks on raw hardware without any operating system layer.