InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning
arXiv:2602.06960v2 (Announce Type: replace-cross)
Abstract: Large reasoning models achieve strong performance by scaling inference-time chain-of-thought, but this paradigm suffers from…
