Instructions to use programasweights/paw-4b-qwen3-0.6b with libraries, inference providers, notebooks, and local apps.

Libraries

How to use programasweights/paw-4b-qwen3-0.6b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="programasweights/paw-4b-qwen3-0.6b")

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("programasweights/paw-4b-qwen3-0.6b", dtype="auto")

Local Apps

vLLM

How to use programasweights/paw-4b-qwen3-0.6b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "programasweights/paw-4b-qwen3-0.6b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "programasweights/paw-4b-qwen3-0.6b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
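
The same request can be issued from Python. The following is a minimal sketch using only the standard library; it assumes the vLLM server from the command above is listening on localhost:8000 and returns an OpenAI-style completions response. The helper names build_completion_request and complete are hypothetical and not part of vLLM or PAW.

```python
import json
import urllib.request

# Assumed defaults, copied from the curl example above.
BASE_URL = "http://localhost:8000"
MODEL = "programasweights/paw-4b-qwen3-0.6b"


def build_completion_request(prompt, base_url=BASE_URL, model=MODEL,
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for /v1/completions."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-style completions responses carry the text in choices[0].text.
    return body["choices"][0]["text"]
```

With the server running, complete("Once upon a time,") should return the generated continuation. For the SGLang server, pass base_url="http://localhost:30000".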

SGLang

How to use programasweights/paw-4b-qwen3-0.6b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "programasweights/paw-4b-qwen3-0.6b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "programasweights/paw-4b-qwen3-0.6b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "programasweights/paw-4b-qwen3-0.6b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "programasweights/paw-4b-qwen3-0.6b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use programasweights/paw-4b-qwen3-0.6b with Docker Model Runner:
```
docker model run hf.co/programasweights/paw-4b-qwen3-0.6b
```

paw-4b-qwen3-0.6b — ProgramAsWeights "Standard" compiler

This is the Standard compiler from ProgramAsWeights (PAW). Given a natural-language spec (a description of a function), it emits a tiny per-task program — a LoRA adapter — that then runs locally on the Qwen3-0.6B interpreter.

It is the model invoked by paw.compile(spec, compiler="paw-4b-qwen3-0.6b") and powers programs on https://programasweights.com.

Compiler base model: Qwen/Qwen3-4B-Instruct-2507
Target interpreter: Qwen/Qwen3-0.6B
Snapshot: 20260407 (see git tag 20260407)

Contents

compiler/: a fine-tuned Qwen3-4B-Instruct-2507 causal LM (the compiler).
lora_mapper.pt: the mapper head (trunk + coefficient head + learnable LoRA basis matrices) that turns the compiler's hidden states into a LoRA program.
meta.json: hyperparameters (lora_rank=64, lora_alpha=16, lora_num_bases=64, prefix_steps=64) and target modules [q,k,v,o,gate,up,down]_proj.
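
To make the meta.json values concrete, here is a small sketch that expands the target-module shorthand into full module names and computes the conventional LoRA scaling factor alpha / rank. The dict layout, the target_modules key name, and the use of alpha / rank scaling are assumptions based on common LoRA practice; the actual meta.json schema and the scaling PAW applies may differ.

```python
# Values as listed above. The dict layout and the "target_modules" key
# are assumptions for illustration, not the real meta.json schema.
meta = {
    "lora_rank": 64,
    "lora_alpha": 16,
    "lora_num_bases": 64,
    "prefix_steps": 64,
    "target_modules": "[q,k,v,o,gate,up,down]_proj",
}


def expand_target_modules(pattern):
    """Expand '[a,b]_suffix' shorthand into ['a_suffix', 'b_suffix']."""
    prefix_list, suffix = pattern.split("]", 1)
    names = prefix_list.lstrip("[").split(",")
    return [name.strip() + suffix for name in names]


def lora_scale(rank, alpha):
    """Conventional LoRA scaling factor (assumed here): alpha / rank."""
    return alpha / rank


modules = expand_target_modules(meta["target_modules"])
scale = lora_scale(meta["lora_rank"], meta["lora_alpha"])

print(modules)  # the seven attention and MLP projection names
print(scale)    # 0.25
```

Under this convention, a rank of 64 with alpha 16 scales each low-rank update by 0.25 before it is added to the base weights.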

How it works

1. The 4B compiler generates a short "pseudo-program" (a task description plus a few I/O examples) from the spec.
2. The sequence chat_template(spec) + pseudo-program + 64 prefix tokens is run through the compiler; the mapper reads the hidden states at the 64 prefix positions and emits per-layer LoRA A/B matrices as a learned mixture of basis matrices.
3. The resulting LoRA (about 22 MB) is the program. It loads onto Qwen3-0.6B and runs locally and offline (about 100 ms/call).
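
The "learned mixture of basis matrices" idea can be sketched in a few lines of plain Python. This is a toy illustration with tiny dimensions, not the PAW implementation: it assumes each LoRA matrix is an unnormalized weighted sum of shared basis matrices, that A and B get separate coefficient vectors, and that the adapter is applied with the standard LoRA update y = W x + (alpha / rank) * B (A x). How the mapper derives the coefficients from the prefix hidden states is not shown. All function names are hypothetical.

```python
def mix_bases(coeffs, bases):
    """Return sum_k coeffs[k] * bases[k] for equally shaped matrices."""
    rows, cols = len(bases[0]), len(bases[0][0])
    return [
        [sum(c * basis[i][j] for c, basis in zip(coeffs, bases))
         for j in range(cols)]
        for i in range(rows)
    ]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(x, weight, lora_a, lora_b, scale):
    """Standard LoRA update: y = W x + scale * B (A x)."""
    base = matvec(weight, x)
    update = matvec(lora_b, matvec(lora_a, x))
    return [b + scale * u for b, u in zip(base, update)]


# Toy sizes: 2 bases, rank 1, 2x2 weight (the real model uses 64 bases, rank 64).
a_bases = [[[1.0, 0.0]], [[0.0, 1.0]]]      # each basis is rank x d_in
b_bases = [[[1.0], [0.0]], [[0.0], [1.0]]]  # each basis is d_out x rank

lora_a = mix_bases([2.0, 3.0], a_bases)     # [[2.0, 3.0]]
lora_b = mix_bases([1.0, -1.0], b_bases)    # [[1.0], [-1.0]]

weight = [[1.0, 0.0], [0.0, 1.0]]           # frozen interpreter weight (identity here)
y = lora_forward([1.0, 1.0], weight, lora_a, lora_b, scale=16 / 64)
print(y)  # [2.25, -0.25]
```

One appeal of this parameterization is that the compiler only has to predict a small coefficient vector per matrix, while the basis matrices themselves are learned once and shared across all programs.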

Status

Inference/runtime SDK (load + run a compiled program locally): https://github.com/programasweights/programasweights-python
The cleaned compile/runtime code and the arXiv preprint ("Program-as-Weights: A Programming Paradigm for Fuzzy Functions", AIware 2026) will be public by Jul 6, 2026. An uncleaned reference snapshot is at https://anonymous.4open.science/r/programasweights
Live demo + program hub: https://programasweights.com
