	# STOP-1.5B: Early Path Pruning Module

	This repository contains the STOP module trained for prefix-level path pruning on top of a 1.5B reasoning model.

	## Overview

	STOP (Super TOken for Pruning) is a lightweight module that predicts whether a reasoning prefix is promising, enabling early pruning of unproductive paths.

	It operates by:

	- Appending a special `[STOP]` token to the reasoning prefix
	- Reading the base model's internal KV-cache states
	- Producing a scalar quality score for that prefix

	## Architecture

	- Base model: frozen reasoning model (1.5B)
	- Adapter: LoRA-based critique module
	- Head: lightweight classifier
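
A minimal sketch of how these three pieces could fit together, assuming the standard LoRA formulation (a frozen weight `W` plus a scaled low-rank update `B A`) and a logistic head applied to a feature vector taken at the `[STOP]` position. Every name here (`lora_forward`, `stop_head`, the toy dimensions) is hypothetical and not taken from the STOP codebase; the actual adapter placement and head design may differ.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lora_forward(W, A, B, x, alpha=1.0):
    """Frozen projection W x plus the low-rank update (alpha / r) * B (A x)."""
    r = len(A)                        # LoRA rank = number of rows in A
    base = matvec(W, x)               # frozen base-model path
    delta = matvec(B, matvec(A, x))   # trainable low-rank path
    return [b + (alpha / r) * d for b, d in zip(base, delta)]

def stop_head(h, w, bias=0.0):
    """Logistic classifier head: maps a feature vector to a score in (0, 1)."""
    z = sum(wi * hi for wi, hi in zip(w, h)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Toy example (assumed shapes): 2-d features, rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen weights (identity for illustration)
A = [[1.0, 1.0]]               # r x d_in
B = [[0.5], [-0.5]]            # d_out x r
h = lora_forward(W, A, B, [1.0, 2.0])
score = stop_head(h, w=[1.0, 1.0])
```

Only `A`, `B`, and the head parameters would be trained in this setup; `W` stays fixed, which matches the "frozen base model" item above.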

	## Training

	The model is trained using prefix–potential supervision constructed via Monte Carlo rollouts.
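
One common way to build such labels, shown here as an assumption rather than the authors' exact recipe, is to estimate a prefix's potential as the fraction of sampled continuations that reach a correct final answer. `rollout` and `is_correct` are hypothetical callables standing in for the sampler and the answer checker.

```python
import random

def estimate_potential(prefix, rollout, is_correct, n_rollouts=8, seed=0):
    """Monte Carlo estimate: share of sampled completions judged correct."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rollouts):
        completion = rollout(prefix, rng)   # sample one continuation
        hits += bool(is_correct(completion))
    return hits / n_rollouts

# Toy stand-ins: a "good" prefix succeeds about 90% of the time, others 10%.
def toy_rollout(prefix, rng):
    p = 0.9 if prefix == "good" else 0.1
    return "right" if rng.random() < p else "wrong"

def toy_is_correct(completion):
    return completion == "right"

good = estimate_potential("good", toy_rollout, toy_is_correct, n_rollouts=200)
bad = estimate_potential("bad", toy_rollout, toy_is_correct, n_rollouts=200)
```

The resulting (prefix, potential) pairs could then serve as regression or classification targets for the scoring head; more rollouts per prefix lower the variance of each label at higher sampling cost.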

	## Usage

	After generating prefixes, STOP can be used to:

	1. Score each prefix
	2. Select top-k candidates
	3. Resume generation only on selected paths
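
The three steps above can be sketched as follows. This is an illustrative outline only: `score_fn` and `resume_fn` are hypothetical placeholders for the STOP scorer and the base model's generation call, neither of which is specified in this README.

```python
import heapq

def prune_and_resume(prefixes, score_fn, resume_fn, k):
    """Score every prefix, keep the k best, and continue only those."""
    scored = [(score_fn(p), p) for p in prefixes]            # 1. score
    kept = heapq.nlargest(k, scored, key=lambda sp: sp[0])   # 2. top-k
    return [resume_fn(p) for _, p in kept]                   # 3. resume

# Toy stand-ins for the STOP scorer and the generator.
toy_scores = {"path A": 0.91, "path B": 0.12, "path C": 0.57}
completions = prune_and_resume(
    prefixes=list(toy_scores),
    score_fn=toy_scores.get,
    resume_fn=lambda p: p + " ... <continued>",
    k=2,
)
```

Here `k` trades compute for coverage: a smaller `k` saves more tokens but raises the risk of discarding a prefix that would have led to a correct answer.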

	## Results

	- Token usage reduced by up to 70%
	- Improved reasoning accuracy
	- Strong performance in tool-use settings (AIMO3)

	## Citation