HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices • Paper • 2512.14052 • Published 24 days ago • 40
Layer-Aware Video Composition via Split-then-Merge • Paper • 2511.20809 • Published Nov 25, 2025 • 2
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning • Paper • 2509.02544 • Published Sep 2, 2025 • 124
Robix: A Unified Model for Robot Interaction, Reasoning and Planning • Paper • 2509.01106 • Published Sep 1, 2025 • 51
Ming-Omni: A Unified Multimodal Model for Perception and Generation • Paper • 2506.09344 • Published Jun 11, 2025 • 28