Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem Paper • 2512.24873 • Published 5 days ago • 55
Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization Paper • 2510.13554 • Published Oct 15, 2025 • 57
Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning Paper • 2508.08221 • Published Aug 11, 2025 • 50
Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library Paper • 2506.06122 • Published Jun 6, 2025 • 7