The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models Paper • 2601.15165 • Published 1 day ago • 42
LLM-in-Sandbox Elicits General Agentic Intelligence Paper • 2601.16206 • Published about 14 hours ago • 32
SeerAttention-R: Sparse Attention Adaptation for Long Reasoning Paper • 2506.08889 • Published Jun 10, 2025 • 23
Black-Box On-Policy Distillation of Large Language Models Paper • 2511.10643 • Published Nov 13, 2025 • 51
Black-Box On-Policy Distillation of Large Language Models Paper • 2511.10643 • Published Nov 13, 2025 • 51
AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders Paper • 2510.19779 • Published Oct 22, 2025 • 61