# OO - Operating Organism | Bare-Metal LLM/SSM Inference

The world's first bare-metal UEFI LLM + SSM inference engine: no OS, no libc, no malloc.
Runs Mamba-2.8B (SSM) and LLaMA2 (Transformer) directly on x86_64 hardware via UEFI.
## What's Inside

| File | Description | Size |
|---|---|---|
| `oo-usb-v3-mamba2.8b-x86_64.img.xz` | USB boot image: Mamba-2.8B + GPT-NeoX tokenizer + REPL | ~1.5 GB |
| `llm-baremetal-boot-x86_64.img.xz` | QEMU boot image: stories15M + REPL (no Mamba weights) | ~11 MB |
| `gpt_neox_tokenizer.bin` | GPT-NeoX BPE tokenizer (50,254 tokens) for Mamba-2.8B | 715 KB |
| `KERNEL.EFI` | The bare-metal EFI binary (standalone) | ~27 MB |
## First Real Inference (Mamba-2.8B, bare-metal)

```text
OO> /ssm_infer The meaning of life is
The meaning of life is not a question of what we do, but of what
we are. We are not merely the product of our genes, but of our
choices.
[OOSI-v3] 64 tokens in 48291 ms (1.3 tok/s)
```

No OS. No kernel. No libc. Just UEFI + a custom allocator + SSM math.
## Interactive REPL Commands

```text
/ssm_load <file>      - Load Mamba model (OOSS v3 format)
/ssm_infer <prompt>   - Run SSM inference
/ssm_params           - Show sampling config
/ssm_selftest         - Tokenizer + model + pipeline verification
/temp <0.0-2.0>       - Set temperature
/top_p <0.0-1.0>      - Set nucleus sampling threshold
/rep_penalty <f>      - Set repetition penalty
/max_tokens <n>       - Set max generation tokens
/seed <n>             - Set RNG seed (reproducible output)
/verbose 0|1|2        - Set debug verbosity
/help                 - Full command list
```
## Architecture

```text
┌─────────────────────────────────────────────┐
│ UEFI Firmware (OVMF / real hardware)        │
├─────────────────────────────────────────────┤
│ OO Kernel (zones allocator, sentinel, D+)   │
├──────────────────┬──────────────────────────┤
│ LLaMA2 Engine    │ Mamba SSM Engine         │
│ (GGUF Q4/Q8)     │ (OOSS v3, f32)           │
├──────────────────┴──────────────────────────┤
│ BPE Tokenizer (GPT-NeoX / SentencePiece)    │
├─────────────────────────────────────────────┤
│ Interactive REPL (UTF-8, autorun, journal)  │
└─────────────────────────────────────────────┘
```
- Memory: Custom zone allocator (COLD/WARM/HOT), no malloc
- SSM: Selective State Space Model with precomputed exp(A_log)
- Sampling: Temperature + top-p nucleus + repetition penalty
- Safety: D+ policy engine, Sentinel warden, OO journal
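Zone allocators like the COLD/WARM/HOT one above can be pictured as bump arenas that are reset wholesale at phase boundaries instead of freed piecewise, which is why no `malloc` is needed. A minimal host-testable sketch; the zone names come from this README, but the sizes, alignment, and API are illustrative assumptions (the real engine draws its backing memory from UEFI boot services):

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t *base; size_t cap, used; } zone_t;
enum { ZONE_COLD, ZONE_WARM, ZONE_HOT, ZONE_COUNT };

/* Illustrative backing store; a real kernel would carve UEFI pages instead. */
static _Alignas(16) uint8_t backing[ZONE_COUNT][4096];
static zone_t zones[ZONE_COUNT];

static void zones_init(void) {
    for (int i = 0; i < ZONE_COUNT; i++)
        zones[i] = (zone_t){ backing[i], sizeof backing[i], 0 };
}

/* Bump-allocate n bytes from one zone, 16-byte aligned; NULL when exhausted. */
static void *zalloc(int zone, size_t n) {
    zone_t *z = &zones[zone];
    size_t off = (z->used + 15) & ~(size_t)15;
    if (off + n > z->cap) return NULL;
    z->used = off + n;
    return z->base + off;
}

/* Freeing is wholesale: reset a zone when its phase ends (e.g. per token). */
static void zreset(int zone) { zones[zone].used = 0; }
```

Splitting allocations by lifetime (model weights cold, per-prompt state warm, per-token scratch hot) is what lets a bump pointer replace a general-purpose heap.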
## Usage

### QEMU (quick test, no Mamba weights)

```sh
# Extract
xz -d llm-baremetal-boot-x86_64.img.xz

# Boot
qemu-system-x86_64 \
  -drive if=pflash,format=raw,readonly=on,file=edk2-x86_64-code.fd \
  -drive format=raw,file=llm-baremetal-boot.img \
  -m 2048M -cpu max -accel tcg
```
### Real Hardware (Mamba-2.8B)

```sh
# Extract the USB image
xz -d oo-usb-v3-mamba2.8b-x86_64.img.xz

# Flash with Rufus (GPT partition scheme, UEFI non-CSM target)
# Boot on any UEFI x86_64 machine (8 GB+ RAM recommended)
```
## Technical Details
- Target: x86_64 UEFI (PE/COFF EFI application)
- Language: Freestanding C (no stdlib, no libc)
- Models: Mamba-2.8B (2.7GB, f32), stories15M (Q8_0)
- Tokenizer: GPT-NeoX 50K BPE (byte-level, GPT-2 compatible)
- Build: GNU-EFI toolchain on Linux/WSL
## Source Code
- GitHub: github.com/Djiby-diop/llm-baremetal
- Organization: github.com/LLM-Baremetal
## Author

Djiby Diop - Dakar, Senegal

OO is a research project exploring the frontier of bare-metal AI inference: running neural networks on raw hardware without any operating system layer.