TS-Proof-Ranker-v2
A local tension detector for first-order Horn proof chains. Given
(axioms, candidate_chain), it emits one tension score per chain step and a
sequence-level "any failure" score.
This is the v2 layer above:
The model is a compact BiGRU token tagger trained on synthetic chains where one canonical proof step is replaced with a non-derivable atom. The symbolic verifier supplies the ground-truth first failure position.
Architecture
- Embedding(vocab=39, d=32)
- BiGRU(input=32, hidden=32, 1 layer, dropout=0.1)
- Token MLP head trained at chain-step start positions
- Mean+max pooled MLP head for "any failure"
- 20194 parameters
Test results
{
"n_examples": 1000,
"step_acc": 0.6490479317137229,
"any_acc": 0.673,
"fail_pos_acc": 0.46,
"by_kind_any_acc": {
"clean": 0.786,
"corrupted_step": 0.56
},
"by_depth_any_acc": {
"1": 0.532608695652174,
"2": 0.7164948453608248,
"3": 0.7376237623762376,
"4": 0.6637931034482759,
"5": 0.7074468085106383
}
}
Training
{
"task": "synthetic_first_order_horn_failed_step_detection",
"n_parameters": 20194,
"history_summary": [
{
"epoch": 1,
"val_step_acc": 0.605,
"val_any_acc": 0.502,
"val_fail_pos_acc": 0.478,
"val_loss": 1.2268
},
{
"epoch": 2,
"val_step_acc": 0.6155,
"val_any_acc": 0.5,
"val_fail_pos_acc": 0.478,
"val_loss": 1.1931
},
{
"epoch": 3,
"val_step_acc": 0.6201,
"val_any_acc": 0.503,
"val_fail_pos_acc": 0.478,
"val_loss": 1.1916
},
{
"epoch": 4,
"val_step_acc": 0.6102,
"val_any_acc": 0.567,
"val_fail_pos_acc": 0.478,
"val_loss": 1.1908
},
{
"epoch": 5,
"val_step_acc": 0.6342,
"val_any_acc": 0.544,
"val_fail_pos_acc": 0.478,
"val_loss": 1.1864
},
{
"epoch": 6,
"val_step_acc": 0.628,
"val_any_acc": 0.581,
"val_fail_pos_acc": 0.478,
"val_loss": 1.1827
},
{
"epoch": 7,
"val_step_acc": 0.6122,
"val_any_acc": 0.581,
"val_fail_pos_acc": 0.478,
"val_loss": 1.1781
},
{
"epoch": 8,
"val_step_acc": 0.6444,
"val_any_acc": 0.645,
"val_fail_pos_acc": 0.478,
"val_loss": 1.1665
}
]
}
Input format
v2 reuses v1's tokenizer and omits the goal:
<axiom> ax1 <axiom> ax2 ... <chain> step1 <step> step2 ...
See ts_proof/env/serialize.py for the canonical serializer and step-position
mapping.
- Downloads last month
- 19