SciAgentGym: Benchmarking Multi-Step Scientific Tool-use in LLM Agents • Paper 2602.12984 • Published 19 days ago