# Codex Dev Notes
## Shinka Search And Memory Redesign
This update changed the Shinka-side evolution loop to be less parent-anchored and more deliberate about search strategy.
### 1. Structured universal memory
Added `shinka/core/universal_memory.py`.
The new memory layer records one structured entry per evaluated program, including:
- generation
- program id and parent id
- search mode
- patch type
- patch name
- prompt intent
- correctness
- combined score
- parent score and score delta
- failure mode
- verdict (`win`, `loss`, `neutral`, `invalid`)
- summarized `aux_*` metrics
- tags
The memory is persisted at:
`<results_dir>/universal_memory.json`
It also maintains aggregated strategy statistics so future prompts can see:
- recent wins
- recent failures
- mode-specific outcomes
- underexplored strategy families
### 2. Adaptive search-mode controller
Added `shinka/core/search_mode.py`.
The controller chooses a search mode per generation:
- `refine`
- `recombine`
- `diverge`
- `restart`
- `theory`
The choice is currently heuristic and uses:
- recent best-score change
- recent invalid-rate
- recent mode repetition
- whether some modes are underexplored
Current behavior:
- high invalid-rate pushes toward `restart`
- plateau after repeated `refine` pushes toward `diverge`
- periodic generations can trigger `theory`
- underexplored or scheduled turns can trigger `recombine`
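The heuristics above can be sketched as a small decision function. This is a toy reconstruction, not the actual controller in `shinka/core/search_mode.py`; the thresholds (0.5 invalid-rate, three repeated refines, theory every ten generations) are invented for illustration.

```python
from collections import Counter

MODES = ("refine", "recombine", "diverge", "restart", "theory")


def choose_mode(recent_modes, best_delta, invalid_rate, generation,
                theory_every=10):
    """Toy version of the heuristics described above; thresholds are
    illustrative, not those of the real SearchModeController."""
    # High invalid-rate: recent patches are mostly broken, start fresh.
    if invalid_rate > 0.5:
        return "restart"
    # Periodic generations trigger a theory turn.
    if theory_every and generation > 0 and generation % theory_every == 0:
        return "theory"
    counts = Counter(recent_modes)
    # Plateau after repeated refine pushes toward diverge.
    if best_delta <= 0 and counts.get("refine", 0) >= 3:
        return "diverge"
    # An underexplored recombine gets a scheduled turn.
    if counts.get("recombine", 0) == 0 and recent_modes:
        return "recombine"
    return "refine"


print(choose_mode(["refine"] * 4, best_delta=0.0, invalid_rate=0.1, generation=3))
# → diverge
```

The ordering matters: correctness problems (`restart`) are checked before exploration concerns, so a broken population is never asked to diversify first.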
### 3. Runner integration
Updated `shinka/core/runner.py`.
Main integration points:
- initialize `UniversalMemory`
- initialize `SearchModeController`
- choose a search mode before each generation is patched
- adapt parent/inspiration usage based on search mode
- inject search metadata into patch metadata
- write evaluated outcomes back into structured memory
Important effects:
- generation 0 is recorded into universal memory
- completed jobs now store `search_mode`, `search_rationale`, and `prompt_intent`
- post-evaluation writes structured results back into universal memory
- `restart` mode can fall back to a generation-0 style parent and clear inspirations
- `diverge`, `recombine`, and `theory` trim or reshape inspiration context
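The last two effects can be sketched as a single context-shaping step. The function below is a hypothetical simplification of the logic in `shinka/core/runner.py`; the name `shape_context` and the trim limit are assumptions.

```python
def shape_context(mode, parent, gen0_parent, inspirations, max_keep=2):
    """Sketch of mode-dependent parent/inspiration reshaping
    (hypothetical; the real logic lives in shinka/core/runner.py)."""
    if mode == "restart":
        # Fall back to a generation-0 style parent and clear inspirations.
        return gen0_parent, []
    if mode in ("diverge", "recombine", "theory"):
        # Trim inspiration context so the prompt does not overfit
        # to the current lineage.
        return parent, inspirations[:max_keep]
    # refine keeps the full lineage context.
    return parent, inspirations


parent, insp = shape_context("restart", "p42", "seed", ["a", "b", "c"])
print(parent, insp)  # → seed []
```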
### 4. Prompt changes
Updated `shinka/core/sampler.py`.
Prompt construction now supports:
- `search_mode`
- `search_rationale`
- `prompt_intent`
- `memory_context`
New prompt sections:
- `# Search Mode`
- `# Universal Memory`
Mode-specific behavior:
- `restart` strongly prefers `full`
- `theory` biases toward `full` or `cross`
- `diverge` biases toward larger jumps
- `recombine` biases toward crossover
- context is reduced or cleared for modes that should not overfit to the current lineage
This is intended to reduce the previous tendency to always optimize as a local descendant of the current parent.
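A rough sketch of how the new prompt sections might be assembled. The section headings match those listed above, but the function name and assembly logic are assumptions, not the actual code in `shinka/core/sampler.py`.

```python
def build_search_sections(search_mode, search_rationale, memory_context=""):
    """Hypothetical assembly of the two new prompt sections; headings
    match the notes above, the wiring is illustrative."""
    sections = [
        "# Search Mode",
        f"Mode: {search_mode}",
        f"Rationale: {search_rationale}",
    ]
    if memory_context:
        # Only attach memory when there is something to show, so modes
        # that clear context produce a shorter prompt.
        sections += ["# Universal Memory", memory_context]
    return "\n".join(sections)


prompt = build_search_sections(
    "diverge",
    "score plateau after repeated refine",
    "recent wins: 2, recent failures: 1",
)
print(prompt)
```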
### 5. Package exports
Updated `shinka/core/__init__.py` to export:
- `UniversalMemory`
- `SearchModeController`
- `SearchModeDecision`
### 6. Verification
Performed:
- Python syntax check with `python3 -m py_compile` on the modified core files
Not yet performed:
- full end-to-end experiment validation
- behavioral verification of mode switching during long runs
### 7. Expected runtime artifacts
After running an experiment, expect:
- `universal_memory.json` in the experiment root
- search metadata in program `metadata`
- prompt traces showing explicit search modes
- more varied search behavior across generations
### 8. Known limitations of this first pass
This is a first implementation, not the final architecture.
Current limitations:
- search-mode control is heuristic, not learned
- universal memory retrieval is still recent-history oriented
- memory does not yet cluster strategy families semantically
- prompt memory is summarized text, not selective structured retrieval
- no dedicated controller yet for allocating a fixed exploration budget by mode
The goal of this pass was to establish a first working foundation for:
- structured experiment memory
- explicit mode switching
- reduced parent anchoring