Jingwen Gu

I am a senior undergraduate student at Cornell University. My research spans RLHF, RLVR, and robotics. I have been fortunate to work with Prof. Wen Sun, Prof. Abhishek Gupta, and Prof. Timur Dogan. My ultimate research objective is to develop reinforcement learning paradigms that enable agents to think, feel, and act in interesting ways.

Email   |   Google Scholar   |   GitHub   |   CV

News

  • September 2025: Paper ReasonFlux-PRM was accepted to NeurIPS 2025!
  • August 2025: Presented VHM at Building Simulation 2025!
  • June 2025: Paper on CoT Self-Correction was accepted to the PUT workshop at ICML 2025!
  • June 2025: Started my research internship at WEIRD Lab, University of Washington, advised by Prof. Abhishek Gupta!
  • May 2025: Paper VHM was accepted to Building Simulation 2025!

Publications

ReasonFlux-PRM
ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs
Jiaru Zou*, Ling Yang*, Jingwen Gu* (equal contribution), Jiahao Qiu, Ke Shen, Jingrui He, Mengdi Wang
NeurIPS 2025

[Paper] | [Code]

Self-Correction
Learning to Self-Correct through Chain-of-Thought Verification
Bradley Guo, Jingwen Gu, Jin Peng Zhou, Wen Sun
ICML 2025, 2nd Workshop on Test-Time Adaptation: Putting Updates to the Test (PUT)

[Paper]

Orchestration
Orchestrating LLMs with Different Personalizations
Jin Peng Zhou, Katie Z Luo, Jingwen Gu, Jason Yuan, Kilian Q. Weinberger, Wen Sun
arXiv preprint, 2024

[Paper]

Virtual Horizon
Virtual Horizon Method: Fast shading calculations for UBEM using lidar data rasterization
Jingwen Gu, Timur Dogan
Building Simulation, 2025

[Paper]

Projects

KV-Cache Management with Reinforcement Learning
Course project for CS4756: Robot Learning, advised by Prof. Sanjiban Choudhury. Devised a method that trains an RL policy to compress the KV-cache of a transformer LLM during inference, enabling near-constant memory usage for LLM deployment.

[Report] | [Code]