Yume-1.5: A Text-Controlled Interactive World Generation Model Paper • 2512.22096 • Published 10 days ago • 57
SVBench: Evaluation of Video Generation Models on Social Reasoning Paper • 2512.21507 • Published 12 days ago • 7
π^3: Scalable Permutation-Equivariant Visual Geometry Learning Paper • 2507.13347 • Published Jul 17, 2025 • 65
Sekai: A Video Dataset towards World Exploration Paper • 2506.15675 • Published Jun 18, 2025 • 65 • 2
A High-Quality Dataset and Reliable Evaluation for Interleaved Image-Text Generation Paper • 2506.09427 • Published Jun 11, 2025 • 8 • 2
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT Paper • 2406.18583 • Published Jun 5, 2024
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models Paper • 2407.11062 • Published Jul 10, 2024 • 10
Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping Paper • 2410.08695 • Published Oct 11, 2024
ZipAR: Accelerating Autoregressive Image Generation through Spatial Locality Paper • 2412.04062 • Published Dec 5, 2024 • 8
LiT: Delving into a Simplified Linear Diffusion Transformer for Image Generation Paper • 2501.12976 • Published Jan 22, 2025