# Querying a Neural Network Like a Database — and Making That Tool Available to Everyone

*Published April 15, 2026*

A day ago, researcher Chris Hayuk published something I hadn't seen before. Not a paper. Not a benchmark. A working tool that lets you treat a transformer model the way you'd treat a relational database — with a query language, structured output, and direct access to the weights.

It's called LARQL. I want to explain what it does, why I think it matters, and what I contributed to make it more accessible.

## What LARQL actually is

Most interpretability tools work from the outside. You send a prompt, watch the output, and try to reason backwards about what happened in between. LARQL works from the inside.

The first step is extraction. You run `larql extract` on any HuggingFace transformer model, and it produces a *vindex* (a "vector index"): a structured, queryable representation of the model's feed-forward network (FFN) layers. For each feature in each layer, the vindex captures which tokens activate it, which tokens it predicts downstream, and how strongly it fires under different inputs.

The vindex is not the model's weights in raw form. It's the model's weights organised as knowledge — as a graph you can walk.
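To make that concrete, here is a minimal sketch (in Rust, the language LARQL is written in) of the kind of record a vindex entry could hold, going only by the description above. The type and field names are my own illustration, not LARQL's actual format:

```rust
/// Illustrative only -- not LARQL's real vindex schema.
/// One record per (layer, feature) pair across the model's FFN layers.
struct FeatureEntry {
    layer: u32,
    feature: u32,
    /// Tokens that make this feature fire, with activation strength.
    input_tokens: Vec<(String, f32)>,
    /// Tokens this feature promotes in the model's output distribution.
    output_tokens: Vec<(String, f32)>,
}
```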

The second step is querying it with LQL (Lazarus Query Language). Some examples of what that looks like:

```sql
-- Which features fire most strongly at layers 20-23 for this prompt?
WALK "The capital of France is" TOP 10;

-- Compare three prompts side by side at the same layer,
-- to see which features are shared vs. concept-specific
PROBE "France is" vs "Germany is" vs "Japan is" AT LAYER 22;

-- Predict what the model thinks comes next
INFER "The capital of France is" TOP 5;

-- Edit the model's knowledge directly
INSERT INTO EDGES (entity, relation, target)
VALUES ("Atlantis", "capital-of", "Poseidon");
```

That last one is not a fine-tune. It's a direct write to the vindex — editing what the model knows at the weight level. This is mechanistic interpretability with teeth.

## Why the original was hard to reach

Chris built this on macOS with Apple Silicon. That's a completely reasonable starting point for research software — you build on what you have. But it meant that if you were on Windows, on a Linux server, or on an NVIDIA GPU, you simply couldn't use it. You'd clone the repo, try to build, and hit a wall of platform-specific compilation errors.

Beyond the platform issue, it was command-line only. That's fine for researchers comfortable with Rust and terminal tools. It's a genuine barrier for everyone else — students, people exploring from an ML background without systems experience, collaborators you want to share something with quickly.

## What I added

I spent a day on this and contributed three things to a public fork:

### 1. Cross-platform builds

The original used Apple Accelerate (AMX) for BLAS unconditionally, which broke any non-Mac build immediately. I made BLAS selection platform-conditional: Apple Accelerate on macOS, OpenBLAS on Linux, ndarray on Windows. I also fixed MSVC compiler compatibility in the C quantisation kernel (a single `__builtin_memcpy` call that GCC accepts but MSVC doesn't), fixed `Cargo.toml` scoping issues that hid dependencies from Windows builds, and fixed the HuggingFace model resolution path (Windows doesn't have `$HOME`).
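The shape of the fix looks roughly like this. The `#[cfg]` attributes are standard Rust; the function names and wiring below are a hypothetical sketch, not the fork's actual code:

```rust
// Hypothetical sketch of platform-conditional backend selection.
// The fork wires this through target-specific dependency tables in
// Cargo.toml plus cfg guards like these; names here are illustrative.

#[cfg(target_os = "macos")]
fn blas_backend() -> &'static str {
    "Apple Accelerate" // AMX-backed BLAS, macOS only
}

#[cfg(target_os = "linux")]
fn blas_backend() -> &'static str {
    "OpenBLAS"
}

#[cfg(target_os = "windows")]
fn blas_backend() -> &'static str {
    "ndarray" // pure-Rust matmul, no C BLAS toolchain required
}

fn main() {
    println!("selected BLAS backend: {}", blas_backend());
}
```

Because each `cfg` gate removes the other definitions at compile time, the Windows build never even sees the Accelerate symbols that used to break it.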

### 2. NVIDIA CUDA backend

I added a `CudaBackend` built on the `cudarc` crate and cuBLAS, giving GPU-accelerated extraction and queries on NVIDIA hardware. I verified it produces numerically identical output to the CPU path across multiple models and prompts.
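That verification is worth sketching, because it's what makes a second backend trustworthy. Assuming a hypothetical `Backend` trait with a `gate_scores` method (LARQL's real abstraction may be shaped differently), the parity check looks like this:

```rust
/// Hypothetical interface for illustration; not LARQL's actual trait.
trait Backend {
    /// Per-feature gate scores for a prompt at a given layer.
    fn gate_scores(&self, prompt: &str, layer: u32) -> Vec<f32>;
}

/// Assert the CPU and CUDA paths agree. The fork reports identical
/// output; the tiny tolerance here is the defensive version of that check.
fn assert_backends_agree(cpu: &dyn Backend, gpu: &dyn Backend, prompt: &str, layer: u32) {
    let a = cpu.gate_scores(prompt, layer);
    let b = gpu.gate_scores(prompt, layer);
    assert_eq!(a.len(), b.len(), "backends returned different feature counts");
    for (i, (x, y)) in a.iter().zip(&b).enumerate() {
        assert!((x - y).abs() <= 1e-6, "feature {i} diverged: cpu={x}, cuda={y}");
    }
}
```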

### 3. A Gradio web interface, deployed as a HuggingFace Space

This is the part I'm most happy about in terms of accessibility. I built a six-tab browser UI:

| Tab | What it does |
|---|---|
| Walk Explorer | Enter any prompt, choose layer range and top-K, get a table of active features with gate scores, "hears" tokens, and output predictions |
| Knowledge Probe | Compare three prompts side by side at a single layer |
| LQL Console | Free-form LQL with example queries pre-loaded |
| Vindex Info | Model metadata and SHA256 checksum verification |
| Extract | Download and extract any HuggingFace model directly from the browser |
| Setup & About | Build instructions and a quick LQL reference |

The Space comes preloaded with a Qwen2.5-0.5B-Instruct vindex so you can explore immediately without extracting anything yourself.

Try it now, no account needed: https://huggingface.co/spaces/cronos3k/LARQL-Explorer

## What this is not

I want to be precise about credit. The intellectual work here — the LQL language design, the vindex format, the extraction pipeline, the inference engine, the core concept of treating model weights as a queryable graph — is entirely Chris Hayuk's. My fork exists to make that work run on more hardware and reach more people. I haven't changed the language. I haven't redesigned the format. I've removed friction between the tool and the people who might benefit from it.

If you use the Space and find it useful, the person to thank is Chris. His repo is at https://github.com/chrishayuk/larql.

## Who should care about this

If you work in any of these areas, I'd encourage you to spend fifteen minutes with the Space:

- **Mechanistic interpretability** — this is one of the few tools that lets you directly inspect and edit feature activations without re-training.
- **Model editing and knowledge patching** — the `INSERT`/`UPDATE` operations are genuinely novel.
- **AI safety** — direct structured access to what a model "believes" is exactly the kind of primitive that safety research needs more of.
- **Anyone curious** — the Walk Explorer alone, watching which features light up as you change a prompt word by word, is worth the five minutes it takes to try.

- My fork (Windows / Linux / CUDA + Gradio UI): https://github.com/cronos3k/larql
- Chris Hayuk's original work: https://github.com/chrishayuk/larql
- Live HuggingFace Space: https://huggingface.co/spaces/cronos3k/LARQL-Explorer
