InfiniteVGGT: Visual Geometry Grounded Transformer for Endless Streams Paper • 2601.02281 • Published 19 days ago • 33
RoMa v2: Harder Better Faster Denser Feature Matching Paper • 2511.15706 • Published Nov 19, 2025 • 8
Emu3.5 Collection Native Multimodal Models are World Learners 🌍 • 4 items • Updated about 1 month ago • 73
EarthCrafter: Scalable 3D Earth Generation via Dual-Sparse Latent Diffusion Paper • 2507.16535 • Published Jul 22, 2025 • 21
Probing the 3D Awareness of Visual Foundation Models Paper • 2404.08636 • Published Apr 12, 2024 • 14
VideoChat-R1 Collection VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning • 4 items • Updated Sep 28, 2025 • 8
Cosmos-Tokenizer Collection A suite of image and video tokenizers • 13 items • Updated 4 days ago • 43
BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation Paper • 2407.17952 • Published Jul 25, 2024 • 32