PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution Paper • 2601.10657 • Published 3 days ago • 16
Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense Paper • 2510.07242 • Published Oct 8, 2025 • 30