FrontierCS: Evolving Challenges for Evolving Intelligence • Paper • 2512.15699 • Published 12 days ago • 5
Feedforward 3D Editing via Text-Steerable Image-to-3D • Paper • 2512.13678 • Published 14 days ago • 13
Boosting Medical Visual Understanding From Multi-Granular Language Learning • Paper • 2511.15943 • Published Nov 20 • 1
VisionUnite: A Vision-Language Foundation Model for Ophthalmology Enhanced with Clinical Knowledge • Paper • 2408.02865 • Published Aug 5, 2024 • 2
MonoTAKD: Teaching Assistant Knowledge Distillation for Monocular 3D Object Detection • Paper • 2404.04910 • Published Apr 7, 2024
DiffPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models • Paper • 2503.04240 • Published Mar 6
Science-T2I: Addressing Scientific Illusions in Image Synthesis • Paper • 2504.13129 • Published Apr 17 • 3
Video-MMLU: A Massive Multi-Discipline Lecture Understanding Benchmark • Paper • 2504.14693 • Published Apr 20
EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments • Paper • 2503.08604 • Published Mar 11
Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model • Paper • 2505.23606 • Published May 29 • 14
LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming? • Paper • 2506.11928 • Published Jun 13 • 24
TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action • Paper • 2505.01583 • Published May 2 • 8
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens • Paper • 2504.07096 • Published Apr 9 • 76