TS-Proof-Ranker-v4

A verifier-backed proof-loop orchestrator for synthetic first-order Horn proof search. v4 composes the earlier TS-Proof-Ranker layers into the intended proposal → ranking → verification → repair loop.

This sits above:

What v4 does

v4 is not a standalone neural checkpoint. It is the proof-search control loop:

expand the local one-step proof frontier from ground facts plus the current chain;
score partial proof beams with a verifier-free heuristic, optionally mixed with the frozen v1 ranker;
verify completed chains with the symbolic kernel;
localize tension on invalid chains;
fall back to v3 repair when search budget is insufficient.

The verifier remains the grounding source of truth.

Current results

Heuristic beam scoring:

{
  "n_examples": 1000,
  "ranker": "heuristic",
  "proof_success": 1.0,
  "search_success": 1.0,
  "repair_rate": 0.0,
  "avg_explored": 6.273,
  "by_depth_success": {
    "1": 1.0,
    "2": 1.0,
    "3": 1.0,
    "4": 1.0,
    "5": 1.0
  },
  "by_depth_search_success": {
    "1": 1.0,
    "2": 1.0,
    "3": 1.0,
    "4": 1.0,
    "5": 1.0
  }
}

v1-guided beam scoring:

{
  "n_examples": 1000,
  "ranker": "v1",
  "proof_success": 1.0,
  "search_success": 1.0,
  "repair_rate": 0.0,
  "avg_explored": 6.269,
  "by_depth_success": {
    "1": 1.0,
    "2": 1.0,
    "3": 1.0,
    "4": 1.0,
    "5": 1.0
  },
  "by_depth_search_success": {
    "1": 1.0,
    "2": 1.0,
    "3": 1.0,
    "4": 1.0,
    "5": 1.0
  }
}

Artifact contents

config.json — loop configuration and dependency repo IDs.
metrics.json — heuristic and v1-guided eval reports.
v4_orchestrator.py — active-frontier proof loop source.
v1_bridge.py — optional v1 ranker scoring bridge.

Usage

Use the source repo for runnable imports:

from ts_proof.env.generator import generate_problem
from ts_proof.loop.v4_orchestrator import prove

import random

rng = random.Random(23)
problem = generate_problem(rng, depth=4)
result = prove(problem.axioms, problem.goal)
print(result.success, result.chain)

To include the frozen v1 ranker as a beam scorer:

from ts_proof.loop.v1_bridge import make_v1_chain_scorer
from ts_proof.loop.v4_orchestrator import prove

scorer = make_v1_chain_scorer("checkpoints/v1/v1_best.pt")
result = prove(problem.axioms, problem.goal, scorer=scorer)

Limitations

The proposal generator is still symbolic active-frontier expansion, not a trained neural generator.
v2 and the neural v3 decoder are experimental and weak; v4 currently trusts the symbolic verifier and symbolic v3 repair baseline for grounding.
Evaluated only on synthetic one-argument Horn problems with proof depths 1-5.

Downloads last month: 3