Doom on ONNX

A single self-contained ONNX model (doom.onnx, ~8.4 MB) that, when run on any ONNX Runtime CPU execution provider (EP), boots and renders the original 1993 Doom. It uses no custom operators, no execution-provider plugins, and no Python in the loop: only standard ONNX ops (Add, BitwiseAnd, Where, Gather, ScatterElements, Loop, If, …) executing inside a single InferenceSession.run call.

The model contains:

an RV32IM CPU built entirely out of ONNX operators,
the doom1.wad shareware game data as a read-only initializer,
the doomgeneric Doom source cross-compiled to bare-metal RV32IM and baked into RAM as another initializer.
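
To give a feel for what "a CPU built out of operators" means, here is a minimal sketch of decoding and executing one RV32I instruction (ADDI) with nothing but masks, shifts, adds, and a select-style write-back. It is plain Python rather than ONNX, and the helper names (sign_extend, decode_i_type, step_addi) are hypothetical; the actual graph in doom.onnx may be organised quite differently.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def decode_i_type(inst: int) -> dict:
    """Split a 32-bit I-type instruction word into its fields (RISC-V base ISA layout)."""
    return {
        "opcode": inst & 0x7F,
        "rd": (inst >> 7) & 0x1F,
        "funct3": (inst >> 12) & 0x7,
        "rs1": (inst >> 15) & 0x1F,
        "imm": sign_extend(inst >> 20, 12),
    }


def step_addi(regs: list, inst: int) -> list:
    """Execute a single ADDI and return the new register file (illustrative only)."""
    f = decode_i_type(inst)
    if f["opcode"] != 0x13 or f["funct3"] != 0:
        raise ValueError("this sketch only models ADDI")
    result = (regs[f["rs1"]] + f["imm"]) & 0xFFFFFFFF  # wrap to 32 bits
    # Select-style write-back, similar in spirit to a Where op:
    # only register rd changes, and x0 always stays 0.
    return [
        result if (i == f["rd"] and i != 0) else r
        for i, r in enumerate(regs)
    ]


regs = step_addi([0] * 32, 0xFFF00293)  # addi x5, x0, -1
print(hex(regs[5]))
```

Every step here is a bitwise or arithmetic operation on integers, which is why the same logic can be expressed with standard ONNX ops.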

Reference render

The doom.gif in this repo was assembled from 74 PNG frames captured during a single InferenceSession.run invocation:

Total: 80,000,000 RV32IM instructions, 10.8 hours wall time
Rate: 1,562 IPS (init code) → 2,053 IPS (in-game rendering)
Reached: title wipe → menu → DEMO1 load → game logic → 3D BSP rendering of actual gameplay (frames 54–75)
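
The figures above can be cross-checked with a few lines of arithmetic. This sketch uses only the numbers quoted in this section; since 10.8 hours is a rounded value, the derived average is approximate.

```python
total_instructions = 80_000_000  # from the reference render
wall_hours = 10.8                # rounded wall time
frames = 74                      # PNG frames captured

avg_ips = total_instructions / (wall_hours * 3600)
minutes_per_frame = wall_hours * 60 / frames
instructions_per_frame = total_instructions / frames

print(f"average rate:        {avg_ips:,.0f} instructions/s")
print(f"time per frame:      {minutes_per_frame:.1f} minutes")
print(f"instructions/frame:  {instructions_per_frame:,.0f}")
```

The result is an average of roughly 2,000 instructions per second and a little under 9 minutes per captured frame.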

Performance

Throughput is ~2,000 simulated RV32IM instructions per second on a modern laptop CPU, so this is not a real-time emulator: on a single CPU thread it produces about one frame every ~9 minutes. See PERF_INVESTIGATION.md in the source repo for the full investigation (TL;DR: ORT's MayInplace alias doesn't apply to Loop-carried state, so the 8 MiB RAM gets fully copied per ScatterElements).
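
The TL;DR above comes down to the difference between a pure update, which returns a fresh tensor, and an in-place update. The following NumPy sketch illustrates only those semantics; it does not reproduce ORT's internals, and both function names are hypothetical.

```python
import numpy as np

RAM_SIZE = 8 * 1024 * 1024  # 8 MiB


def store_byte_functional(ram, addr, value):
    """Pure update: the input is left untouched, so all 8 MiB are copied."""
    out = ram.copy()
    out[addr] = value
    return out


def store_byte_inplace(ram, addr, value):
    """In-place update: one byte is written and no copy is made."""
    ram[addr] = value
    return ram


ram = np.zeros(RAM_SIZE, dtype=np.uint8)
new_ram = store_byte_functional(ram, 16, 7)
print(new_ram is ram, ram[16], new_ram[16])  # copy made, original unchanged
print(f"bytes copied per functional store: {new_ram.nbytes:,}")
```

When every simulated store pays for a full copy of RAM, the copy rather than the one-byte write dominates the cost of the step.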

Running

import numpy as np
import onnxruntime as rt

sess = rt.InferenceSession("doom.onnx", providers=["CPUExecutionProvider"])

RAM_SIZE = 8 * 1024 * 1024
ram = np.load("initial_ram.npy")  # Doom ELF baked into RAM
pc  = np.array(0x1000, dtype=np.int32)
regs = np.zeros(32, dtype=np.int32)

MMIO_TICK = RAM_SIZE - 16
sim_ms = 0
for chunk in range(250):
    ram[MMIO_TICK:MMIO_TICK + 4] = np.frombuffer(
        np.uint32(sim_ms).tobytes(), dtype=np.uint8)
    sim_ms += 100
    pc, regs, ram = sess.run(None, {
        "pc_in": pc, "regs_in": regs, "ram_in": ram,
        "trip_count": np.array(100_000, dtype=np.int64),
    })
    # Framebuffer: 320×200 palette indices (64,000 bytes) starting at RAM_SIZE - 32 - 64000.

Between chunks, the host does only two things: it writes a millisecond counter into the MMIO tick register, and it reads the framebuffer out of the returned ram_out (bound to ram in the loop above).
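
Reading the framebuffer out of the returned RAM can be sketched as follows. The base address and size come from the comment in the loop above; the row-major layout is an assumption, and extract_framebuffer and to_pgm_bytes are hypothetical helpers. The bytes are palette indices, so the PGM output below is a grayscale preview rather than true Doom colours, which would need the palette from the WAD.

```python
import numpy as np

RAM_SIZE = 8 * 1024 * 1024
FB_WIDTH, FB_HEIGHT = 320, 200
FB_SIZE = FB_WIDTH * FB_HEIGHT        # 64,000 bytes
FB_BASE = RAM_SIZE - 32 - FB_SIZE     # address quoted in the run loop


def extract_framebuffer(ram):
    """Return a (200, 320) view of palette indices (row-major layout assumed)."""
    return ram[FB_BASE:FB_BASE + FB_SIZE].reshape(FB_HEIGHT, FB_WIDTH)


def to_pgm_bytes(indices):
    """Encode the indices as a binary PGM image, treating each index as a gray level."""
    header = f"P5\n{indices.shape[1]} {indices.shape[0]}\n255\n".encode("ascii")
    return header + np.ascontiguousarray(indices, dtype=np.uint8).tobytes()


# Stand-in RAM for illustration; in practice use the `ram` returned by sess.run.
ram = np.zeros(RAM_SIZE, dtype=np.uint8)
frame = extract_framebuffer(ram)
pgm = to_pgm_bytes(frame)
print(frame.shape, len(pgm))
```

Writing pgm to a file with a .pgm extension gives an image that most viewers can open.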

Inputs / outputs

| Name | Type | Shape | Role |
| --- | --- | --- | --- |
| `pc_in` | int32 | scalar | program counter |
| `regs_in` | int32 | [32] | x0..x31 (x0 forced to 0) |
| `ram_in` | uint8 | [8388608] (8 MiB) | full writable memory |
| `trip_count` | int64 | scalar | number of instructions to execute |
| `pc_out` / `regs_out` / `ram_out` | (same types) | (same shapes) | post-state |

rom (the WAD) is a read-only initializer baked into the model.
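
Because the session expects exact dtypes and shapes, it can help to build the feed dictionary through one small function. This is a sketch under the assumption that the inputs match the table above; make_feed is a hypothetical helper, not part of the repo.

```python
import numpy as np

RAM_SIZE = 8 * 1024 * 1024  # 8 MiB, as in the Running section

EXPECTED_SHAPES = {
    "pc_in": (),
    "regs_in": (32,),
    "ram_in": (RAM_SIZE,),
    "trip_count": (),
}


def make_feed(pc, regs, ram, trip_count):
    """Coerce host-side state to the dtypes in the table and check the shapes."""
    feed = {
        "pc_in": np.array(pc, dtype=np.int32),
        "regs_in": np.asarray(regs, dtype=np.int32),
        "ram_in": np.asarray(ram, dtype=np.uint8),
        "trip_count": np.array(trip_count, dtype=np.int64),
    }
    for name, shape in EXPECTED_SHAPES.items():
        if feed[name].shape != shape:
            raise ValueError(f"{name}: expected shape {shape}, got {feed[name].shape}")
    return feed


feed = make_feed(
    pc=0x1000,
    regs=np.zeros(32, dtype=np.int32),
    ram=np.zeros(RAM_SIZE, dtype=np.uint8),
    trip_count=100_000,
)
print({name: (arr.dtype.name, arr.shape) for name, arr in feed.items()})
```

The resulting dictionary can be passed as the second argument of sess.run in the loop shown under Running.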

License

The CPU + DMA glue is MIT. doomgeneric and the original Doom source are GPL-2.0; the shareware doom1.wad ships under id Software's shareware terms. This model card and model file inherit GPL-2.0.

Acknowledgements

id Software for releasing Doom's source. The doomgeneric project for the platform-agnostic port. The ONNX team for an op set this absurdly expressive.
