	---
	base_model:
	- openai-community/gpt2
	datasets:
	- openai/gsm8k
	library_name: transformers
	pipeline_tag: feature-extraction
	tags:
	- rm
	- latent
	license: apache-2.0
	---

	# LatentRM

	The Latent Reward Model (LatentRM) is a learned scorer for latent reasoning models, which carry out their intermediate reasoning in continuous hidden space rather than in discrete tokens.
	Because such models expose no explicit token-level probabilities for their reasoning steps, LatentRM supplies the missing aggregation signal for parallel test-time scaling, enabling techniques such as best-of-N and beam search.
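
As a rough illustration of how a scorer can drive best-of-N selection, the sketch below picks the candidate whose latent trajectory receives the highest score. It is not the LatentRM API: `best_of_n`, `toy_score`, and the list-of-vectors trajectory representation are hypothetical names and assumptions made for this example, and `toy_score` is a placeholder standing in for a learned reward model. See the GitHub repo for the actual usage.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical representation: a trajectory is a list of latent vectors.
Trajectory = List[List[float]]


def best_of_n(
    candidates: Sequence[Tuple[Trajectory, str]],
    score_fn: Callable[[Trajectory], float],
) -> Tuple[str, float]:
    """Return the answer (and score) of the highest-scoring trajectory."""
    if not candidates:
        raise ValueError("best_of_n needs at least one candidate")
    # Score every sampled trajectory independently, then keep the best one.
    scored = [(score_fn(traj), answer) for traj, answer in candidates]
    best_score, best_answer = max(scored, key=lambda pair: pair[0])
    return best_answer, best_score


def toy_score(traj: Trajectory) -> float:
    """Placeholder scorer (mean of all coordinates), NOT a trained model."""
    values = [x for step in traj for x in step]
    return sum(values) / len(values)


# Three made-up sampled trajectories, each paired with its decoded answer.
candidates = [
    ([[0.1, 0.2], [0.0, 0.1]], "18"),
    ([[0.9, 0.8], [0.7, 0.6]], "20"),
    ([[0.4, 0.5], [0.3, 0.2]], "22"),
]

answer, score = best_of_n(candidates, toy_score)
print(answer, round(score, 2))
```

In practice `score_fn` would wrap a forward pass of the reward model over the sampled latent states; the selection logic stays the same.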

	<p align="center">
	<a href="https://arxiv.org/pdf/2510.07745"><b>Paper Link</b>👁️</a>
	</p>

	<p align="center">
	<a href="https://github.com/YRYangang/LatentTTS"><b>GitHub Repo</b>🐙</a>
	</p>


	## Citation
	```
	@misc{you2025paralleltesttimescalinglatent,
	title={Parallel Test-Time Scaling for Latent Reasoning Models},
	author={Runyang You and Yongqi Li and Meng Liu and Wenjie Wang and Liqiang Nie and Wenjie Li},
	year={2025},
	eprint={2510.07745},
	archivePrefix={arXiv},
	primaryClass={cs.CL},
	url={https://arxiv.org/abs/2510.07745},
	}
	```