CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation Paper • 2602.24286 • Published 5 days ago • 70
Mol-R1: Towards Explicit Long-CoT Reasoning in Molecule Discovery Paper • 2508.08401 • Published Aug 11, 2025 • 42
Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning Paper • 2504.13914 • Published Apr 10, 2025 • 5
A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce Paper • 2504.11343 • Published Apr 15, 2025 • 20 • 6
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published Mar 18, 2025 • 144