GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning
Published as an arXiv preprint, 2025
We propose GenPRM, a generative process reward model that scales test-time compute by reasoning explicitly before judging each solution step.
Recommended citation: Jian Zhao, Runze Liu, Kaiyan Zhang, Zhimu Zhou, Junqi Gao, Dong Li, Jiafei Lyu, Zhouyi Qian, Biqing Qi, Xiu Li, Bowen Zhou. (2025). "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning." arXiv preprint arXiv:2505.15825.
Download Paper
