paw-4b-gpt2 — ProgramAsWeights "Compact" compiler

This is the Compact compiler from ProgramAsWeights (PAW). Given a natural-language spec, it emits a tiny per-task program, in the form of a LoRA adapter, that runs locally on a GPT-2 (124M) interpreter small enough to run in the browser.

It is the model invoked by paw.compile(spec, compiler="paw-4b-gpt2").

Compiler base model: Qwen/Qwen3-4B-Instruct-2507
Target interpreter: a custom GPT-2 (124M) whose positional embeddings are extended from 1024 → 2048 (n_ctx=2048); tokenizer is stock GPT-2 BPE.
Snapshot: 20260406 (see git tag 20260406)

compiler/ — a finetuned Qwen3-4B-Instruct-2507 causal LM (the compiler).
lora_mapper.pt — the mapper head (trunk + coefficient head + learnable LoRA basis matrices) that turns the compiler's hidden states into a LoRA program.
meta.json — lora_rank=64, lora_alpha=16, lora_num_bases=64, prefix_steps=64, target modules [c_attn, c_proj, c_fc].
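
As a rough illustration of how these hyperparameters relate, the sketch below parses a JSON document holding the values listed above and derives the conventional LoRA scaling factor alpha / rank. The key name "target_modules" and the exact file layout are assumptions, and whether PAW applies the standard alpha / rank scaling is not stated here.

```python
import json

# Values copied from the meta.json summary above. The key name
# "target_modules" is a guess; the real file layout may differ.
META_TEXT = """
{
  "lora_rank": 64,
  "lora_alpha": 16,
  "lora_num_bases": 64,
  "prefix_steps": 64,
  "target_modules": ["c_attn", "c_proj", "c_fc"]
}
"""


def lora_scaling(meta: dict) -> float:
    """Conventional LoRA scale alpha / rank (assumed, not confirmed for PAW)."""
    return meta["lora_alpha"] / meta["lora_rank"]


meta = json.loads(META_TEXT)
scale = lora_scaling(meta)
print(f"rank={meta['lora_rank']} alpha={meta['lora_alpha']} scale={scale}")
```

Under that assumption the low-rank update would be multiplied by 0.25 before being added to the frozen GPT-2 weights.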

How it works

The 4B compiler generates a short "pseudo-program" (a task description plus a few I/O examples) from the spec.
The text chat_template(spec) + pseudo-program + 64 prefix tokens is run through the compiler; the mapper reads the 64 prefix hidden states and emits per-layer LoRA A/B matrices as a learned mixture of basis matrices.
The resulting LoRA (about 5 MB) is the program. It loads onto the GPT-2 interpreter and runs locally/offline (including in-browser).
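
The "learned mixture of basis matrices" step can be pictured with a small NumPy sketch. This is not the PAW mapper: the bases and coefficients below are random stand-ins, the helper name mix_bases is hypothetical, using separate coefficient vectors for A and B is an assumption, and the alpha / rank scaling is the conventional LoRA choice rather than something confirmed for PAW. The shapes use rank 64, 64 bases, and the 768 to 2304 projection of GPT-2 small's c_attn.

```python
import numpy as np


def mix_bases(coeffs: np.ndarray, bases: np.ndarray) -> np.ndarray:
    """Weighted sum of basis matrices: sum_k coeffs[k] * bases[k]."""
    return np.tensordot(coeffs, bases, axes=1)


rng = np.random.default_rng(0)
num_bases, rank, alpha = 64, 64, 16   # values from meta.json
d_in, d_out = 768, 2304               # GPT-2 small c_attn: 768 -> 2304

# Random stand-ins for the learnable basis matrices stored in the mapper.
basis_A = rng.standard_normal((num_bases, rank, d_in)) * 0.01
basis_B = rng.standard_normal((num_bases, d_out, rank)) * 0.01

# Random stand-ins for the per-task coefficients the mapper would predict
# from the prefix hidden states.
coeff_A = rng.standard_normal(num_bases)
coeff_B = rng.standard_normal(num_bases)

A = mix_bases(coeff_A, basis_A)       # shape (rank, d_in)
B = mix_bases(coeff_B, basis_B)       # shape (d_out, rank)

# Low-rank update applied to a batch of two input vectors.
x = rng.standard_normal((2, d_in))
delta = (alpha / rank) * (x @ A.T) @ B.T
print(A.shape, B.shape, delta.shape)
```

In a scheme like this, the bases are shared across tasks and only the mixing coefficients change per program.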

Status

Inference/runtime SDK (load + run a compiled program locally): https://github.com/programasweights/programasweights-python (browser SDK: https://github.com/programasweights/programasweights-js)
The cleaned compile/runtime code and the arXiv preprint ("Program-as-Weights: A Programming Paradigm for Fuzzy Functions", AIware 2026) will be public by Jul 6, 2026. An uncleaned reference snapshot is at https://anonymous.4open.science/r/programasweights
Live demo + program hub: https://programasweights.com
Paper: https://arxiv.org/abs/2607.02512
