license: mit
base_model: Dream-org/Dream-v0-Instruct-7B
tags:
  - process-reward-model
  - discrete-diffusion
  - gsm8k
  - lora
library_name: peft

ormprotocol-causal-lasttoken

Causal-attention LoRA adapter trained under the ORM protocol, with last-token pooling (seed 42). It was trained on final states only, with no step embedding, for 8407 steps. Final accuracy is 0.842 at mask=0. This run is evidence for decision-tree Outcome B: it confirms that the architectural effect persists when the training protocol is matched to that of the bidir ORM.
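For readers unfamiliar with the term, last-token pooling can be sketched as follows. This is a minimal NumPy illustration, not the training code behind this adapter: the function name `last_token_pool`, the array shapes, and the toy inputs are all assumptions made for the example. The idea is that under causal attention only the final attended position has seen the whole sequence, so its hidden state is taken as the sequence representation.

```python
import numpy as np

def last_token_pool(hidden, attention_mask):
    """Select the hidden state at the last attended position of each sequence.

    hidden: (batch, seq_len, dim) array of per-token hidden states.
    attention_mask: (batch, seq_len) array of 0/1, where 1 marks a real token.
    Hypothetical helper for illustration only.
    """
    mask = np.asarray(attention_mask).astype(bool)
    if not mask.any(axis=1).all():
        raise ValueError("every sequence needs at least one attended token")
    seq_len = mask.shape[1]
    # Index of the last 1 in each row: reverse the row, find the first 1,
    # then map that position back to the original ordering.
    last_idx = seq_len - 1 - np.argmax(mask[:, ::-1], axis=1)
    return hidden[np.arange(hidden.shape[0]), last_idx]

# Toy input: 2 sequences, 4 positions, hidden size 3.
hidden = np.arange(2 * 4 * 3, dtype=np.float32).reshape(2, 4, 3)
attention_mask = np.array([[1, 1, 1, 0],   # right-padded: last token at index 2
                           [0, 1, 1, 1]])  # left-padded: last token at index 3
pooled = last_token_pool(hidden, attention_mask)
print(pooled.shape)  # (2, 3)
```

Locating the last attended index from the mask, rather than always taking position `-1`, keeps the sketch correct for both left- and right-padded batches. The same indexing pattern carries over to PyTorch tensors.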