**Bytecode Layout**

Current layout: `fixed5x32-v1` uses 5 `i32` words per instruction.

Observed facts from `data/cards_compiled.json`:

- Instructions analyzed: 5,172
- JSON database size: 11,222,487 bytes
- Binary snapshot size: 3,139,615 bytes
- `V` is zero in 36.49% of instructions
- `A_LOW` is zero in 79.39% of instructions
- `A_HIGH` is zero in 79.93% of instructions
- `S` is zero in 45.32% of instructions
- Only 20.07% of instructions need a non-zero high 32 bits of `A`
- Rough zero-bit rate across raw 32-bit words: about 96.75%
- Distinct complete instructions (all 5 words identical): 599 out of 5,172, about 11.58%
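
Figures like these could be recomputed with a small pass over the decoded instructions. The following is a minimal sketch, not the tool that produced the numbers above. It assumes each instruction is available as a `[i32; 5]` tuple in the order `[op, v, a_low, a_high, s]`; that field order, the `LayoutStats` type, and the `layout_stats` function are hypothetical names introduced for illustration.

```rust
use std::collections::HashSet;

/// One fixed-width instruction. Field order `[op, v, a_low, a_high, s]`
/// is an assumption of this sketch, not confirmed by the source.
type Instr = [i32; 5];

#[derive(Debug, PartialEq)]
struct LayoutStats {
    instructions: usize,
    /// How many instructions have a zero in each of the 5 word positions.
    zero_per_field: [usize; 5],
    /// Number of distinct complete 5-word instructions.
    unique_instructions: usize,
    zero_bits: u64,
    total_bits: u64,
}

fn layout_stats(program: &[Instr]) -> LayoutStats {
    let mut zero_per_field = [0usize; 5];
    let mut zero_bits = 0u64;
    let mut unique = HashSet::new();

    for instr in program {
        for (field, word) in instr.iter().enumerate() {
            if *word == 0 {
                zero_per_field[field] += 1;
            }
            // Counts unset bits in the raw 32-bit word.
            zero_bits += u64::from(word.count_zeros());
        }
        unique.insert(*instr);
    }

    LayoutStats {
        instructions: program.len(),
        zero_per_field,
        unique_instructions: unique.len(),
        zero_bits,
        total_bits: program.len() as u64 * 5 * 32,
    }
}
```

Dividing `zero_per_field[i]` by `instructions` gives the per-field zero rates, and `zero_bits / total_bits` gives the raw zero-bit rate.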

Instruction shape distribution:

- `op_only`: 1,554 instructions, 30.05%
- `op_v`: 637 instructions, 12.32%
- `op_v_a`: 113 instructions, 2.18%
- `op_v_s`: 1,130 instructions, 21.85%
- `full`: 1,738 instructions, 33.60%

Rough storage estimate:

- Fixed-width words: 25,860 (5,172 instructions × 5 words)
- Compact optional-field words: 13,982
- Potential reduction: 11,878 words, 45.93%
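
The arithmetic behind this estimate can be checked with a short sketch. The shape counts come from the distribution above. The per-shape word counts are an assumption: one header word plus one word per present field, with `a` taking one word when narrow and two when wide. Because the split between narrow and wide `a` per shape is not given here, the sketch only brackets the compact size rather than reproducing 13,982 exactly.

```rust
/// (shape, instruction count, min words, max words).
/// Min/max differ only where `a` may be narrow (1 word) or wide (2 words).
/// The word counts are assumptions of this sketch.
const SHAPES: [(&str, u64, u64, u64); 5] = [
    ("op_only", 1_554, 1, 1),
    ("op_v", 637, 2, 2),
    ("op_v_a", 113, 3, 4),
    ("op_v_s", 1_130, 3, 3),
    ("full", 1_738, 4, 5),
];

fn total_instructions() -> u64 {
    SHAPES.iter().map(|s| s.1).sum::<u64>()
}

/// Word count under the fixed 5-word layout.
fn fixed_words() -> u64 {
    total_instructions() * 5
}

/// Lower and upper bound on the compact word count.
fn compact_bounds() -> (u64, u64) {
    let min = SHAPES.iter().map(|s| s.1 * s.2).sum::<u64>();
    let max = SHAPES.iter().map(|s| s.1 * s.3).sum::<u64>();
    (min, max)
}

/// Percentage of words saved when going from `fixed` to `compact`.
fn reduction_percent(fixed: u64, compact: u64) -> f64 {
    (fixed - compact) as f64 * 100.0 / fixed as f64
}
```

Under these assumptions the bounds are 13,509 and 15,360 words, so the reported 13,982 falls inside the range, and `reduction_percent(25_860, 13_982)` comes out at about 45.93.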

**What This Means**

The fixed-width layout wastes a lot of space, but storage is not the main obstacle to changing it. The harder problem is that execution logic is coupled to the 5-word stride:

- The Rust engine assumes `ip += 5`
- Jumps are encoded in instruction-count units derived from 5-word chunks
- Many helpers use `chunks(5)` directly
- Some higher-level optimizations depend on bytecode offsets lining up with effect indices
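
To make the coupling concrete, here is a minimal sketch of what stride-dependent code tends to look like. It is not the engine's actual code: the opcode values, `opcodes`, and `trace_fixed` are hypothetical, and the jump is modeled as unconditional using the `ip + 5 + v * 5` formula quoted in the staged plan.

```rust
const WORDS_PER_INSTR: usize = 5;

// Hypothetical opcode values, chosen only for this sketch.
const OP_RETURN: i32 = 0;
const OP_JUMP: i32 = 1;

/// Lists opcodes the way a `chunks(5)` helper would: the stride is baked in.
fn opcodes(code: &[i32]) -> Vec<i32> {
    code.chunks(WORDS_PER_INSTR).map(|instr| instr[0]).collect()
}

/// Walks the buffer and records each opcode visited.
/// `max_steps` guards against backward jumps that never terminate.
fn trace_fixed(code: &[i32], max_steps: usize) -> Vec<i32> {
    let mut trace = Vec::new();
    let mut ip = 0usize;
    while ip + WORDS_PER_INSTR <= code.len() && trace.len() < max_steps {
        let (op, v) = (code[ip], code[ip + 1]);
        trace.push(op);
        match op {
            OP_RETURN => break,
            OP_JUMP => {
                // Relative jump in instruction units, scaled to words.
                let target = ip as i64 + 5 + i64::from(v) * 5;
                if target < 0 {
                    break;
                }
                ip = target as usize;
            }
            // Every other instruction advances by the fixed stride.
            _ => ip += WORDS_PER_INSTR,
        }
    }
    trace
}
```

Both the stride in `opcodes` and the scaling in the jump arm stop being valid as soon as instructions vary in length, which is why these call sites need to move behind an abstraction before the layout changes.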

So the bytecode is still important, but mostly as a stable execution contract. Its current shape is more of a transport convenience than a semantic necessity.

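The 96.75% zero-bit rate is a useful signal, but it overstates the practical gain if read literally, because a word-aligned encoding cannot recover individual zero bits. The redundancy also comes from more than empty fields, since many instructions are simple and repeated: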

- `RETURN` appears everywhere
- common jumps and simple draw/buff effects recur often
- a lot of non-zero fields still only use a handful of bits

So there are really three different optimization levers:

1. Field compaction: stop storing absent `V/A/S` words
2. Narrower payload encoding: stop paying full 32 bits for tiny values
3. Dictionary or template encoding: reuse repeated instructions or repeated short sequences

Field compaction is the safest first step because it preserves the current instruction model.
Template encoding is likely the next-biggest size win because only 599 of the 5,172 instructions (about 11.6%) are distinct as complete 5-word tuples.
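
As an illustration of the dictionary lever, the sketch below stores each distinct instruction once and replaces the program with a list of indices into that table. It is one possible shape for the idea, not a proposed format: `DictEncoded`, `dict_encode`, and `dict_decode` are hypothetical names, and the 16-bit index width is an assumption that holds only while there are at most 65,536 distinct instructions.

```rust
use std::collections::HashMap;

/// One fixed-width instruction as five raw words.
type Instr = [i32; 5];

#[derive(Debug, PartialEq)]
struct DictEncoded {
    /// Each distinct instruction, stored once.
    table: Vec<Instr>,
    /// One table index per original instruction, in program order.
    indices: Vec<u16>,
}

/// Returns `None` if the table would not fit in 16-bit indices.
fn dict_encode(program: &[Instr]) -> Option<DictEncoded> {
    let mut table: Vec<Instr> = Vec::new();
    let mut lookup: HashMap<Instr, u16> = HashMap::new();
    let mut indices = Vec::with_capacity(program.len());

    for instr in program {
        let index = match lookup.get(instr) {
            Some(&existing) => existing,
            None => {
                if table.len() > u16::MAX as usize {
                    return None;
                }
                let fresh = table.len() as u16;
                table.push(*instr);
                lookup.insert(*instr, fresh);
                fresh
            }
        };
        indices.push(index);
    }
    Some(DictEncoded { table, indices })
}

/// Expands the index list back into the original instruction sequence.
fn dict_decode(encoded: &DictEncoded) -> Vec<Instr> {
    encoded
        .indices
        .iter()
        .map(|&i| encoded.table[i as usize])
        .collect()
}
```

As a rough estimate under these assumptions, 599 table entries at 20 bytes each plus 5,172 two-byte indices come to 22,324 bytes, against 103,440 bytes for the same instructions in the fixed layout. Because instruction positions are preserved, instruction-index jumps would keep working unchanged.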

**Best Next Step**

Do not jump directly from `fixed5x32-v1` to a fully irregular format.

Safer staged plan:

1. Introduce `BytecodeProgram` and centralize decoding behind an iterator/API.
2. Keep the in-memory interpreter interface stable while abstracting away `chunks(5)`.
3. Add a new layout version that uses a compact header per instruction plus optional payload words.
4. Convert jumps from `ip + 5 + v * 5` math to decoded instruction indices.
5. Keep compiler support for both layouts during migration.
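
Steps 1, 2, and 4 could look roughly like the sketch below. Only the name `BytecodeProgram` comes from the plan; the `Instruction` struct, its field types, the method names, and the `[op, v, a_low, a_high, s]` word order are assumptions made for illustration. The key point is that callers see decoded instructions and instruction indices, never word offsets.

```rust
/// Decoded instruction. Treating `a` as an unsigned 64-bit value built
/// from `a_low` and `a_high` is an assumption of this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Instruction {
    op: i32,
    v: i32,
    a: u64,
    s: i32,
}

#[derive(Debug)]
struct BytecodeProgram {
    instructions: Vec<Instruction>,
}

impl BytecodeProgram {
    /// Decodes a fixed 5-word buffer. Returns `None` on a ragged length.
    fn from_fixed5x32(words: &[i32]) -> Option<Self> {
        if words.len() % 5 != 0 {
            return None;
        }
        let instructions = words
            .chunks_exact(5)
            .map(|w| Instruction {
                op: w[0],
                v: w[1],
                // Reinterpret both halves as unsigned before combining.
                a: (u64::from(w[3] as u32) << 32) | u64::from(w[2] as u32),
                s: w[4],
            })
            .collect();
        Some(Self { instructions })
    }

    fn len(&self) -> usize {
        self.instructions.len()
    }

    /// The only way callers walk the program: no stride is exposed.
    fn iter(&self) -> std::slice::Iter<'_, Instruction> {
        self.instructions.iter()
    }

    /// Instruction-index form of `ip + 5 + v * 5`: dividing by the
    /// 5-word stride gives `index + 1 + v`.
    fn jump_target(&self, index: usize, v: i32) -> Option<usize> {
        let target = index as i64 + 1 + i64::from(v);
        if target < 0 || target >= self.len() as i64 {
            None
        } else {
            Some(target as usize)
        }
    }
}
```

A second constructor for the compact layout could be added later without touching callers, which is what makes it practical to support both layouts during migration.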

**Recommended Compact Layout**

Use tagged instructions instead of raw 5-word tuples:

- Header word:
  - opcode
  - flags for `has_v`, `has_a`, `has_s`, `wide_a`
- Followed by only the present payload words

Examples:

- `RETURN` -> `[header]`
- `DRAW 1` -> `[header, v]`
- `SET_TAPPED true @slot` -> `[header, v, s]`
- Filtered effect -> `[header, v, a_low]`
- Wide filter effect -> `[header, v, a_low, a_high, s]`

That keeps decoding simple while removing most zero padding.
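
A minimal encoder and decoder for this layout might look like the following. The bit positions are hypothetical: the sketch assumes the opcode fits in the low 16 bits of the header and places the four flags in bits 16 to 19. A field is treated as absent when its value is zero, so an absent field decodes back to zero and the round trip is lossless.

```rust
// Hypothetical header layout: opcode in bits 0..16, flags above it.
const HAS_V: u32 = 1 << 16;
const HAS_A: u32 = 1 << 17;
const HAS_S: u32 = 1 << 18;
const WIDE_A: u32 = 1 << 19;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Instruction {
    op: u16,
    v: i32,
    a: u64,
    s: i32,
}

/// Appends one instruction: a header, then only the non-zero fields
/// in the order v, a_low, a_high, s.
fn encode(instr: &Instruction, out: &mut Vec<u32>) {
    let mut header = u32::from(instr.op);
    if instr.v != 0 {
        header |= HAS_V;
    }
    if instr.a != 0 {
        header |= HAS_A;
    }
    if instr.a >> 32 != 0 {
        header |= WIDE_A;
    }
    if instr.s != 0 {
        header |= HAS_S;
    }
    out.push(header);
    if header & HAS_V != 0 {
        out.push(instr.v as u32);
    }
    if header & HAS_A != 0 {
        out.push(instr.a as u32); // low 32 bits
    }
    if header & WIDE_A != 0 {
        out.push((instr.a >> 32) as u32); // high 32 bits
    }
    if header & HAS_S != 0 {
        out.push(instr.s as u32);
    }
}

/// Decodes the instruction at `pos` and returns it with the position of
/// the next header. Returns `None` if the buffer is truncated.
fn decode(words: &[u32], pos: usize) -> Option<(Instruction, usize)> {
    let header = *words.get(pos)?;
    let mut next = pos + 1;
    // Reads one payload word if `flag` is set, otherwise yields zero.
    let mut take = |flag: u32| -> Option<u32> {
        if header & flag == 0 {
            return Some(0);
        }
        let word = *words.get(next)?;
        next += 1;
        Some(word)
    };
    let v = take(HAS_V)? as i32;
    let a_low = take(HAS_A)?;
    let a_high = take(WIDE_A)?;
    let s = take(HAS_S)? as i32;
    let a = (u64::from(a_high) << 32) | u64::from(a_low);
    Some((Instruction { op: header as u16, v, a, s }, next))
}
```

Since instructions now vary in length, a decoder has to walk headers sequentially to find instruction boundaries, which is why jumps should refer to decoded instruction indices rather than word offsets.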

**Priority Order**

If the goal is maximum practical win with minimum risk:

1. Prune compiled JSON fields
2. Keep using `cards_compiled.bin`
3. Centralize bytecode access behind an abstraction
4. Only then migrate layout

If the goal is readability and debuggability:

1. Push more common patterns into generated Rust
2. Use bytecode as fallback for uncommon/complex patterns
3. Treat compact bytecode as a second-stage optimization