Nested Learning: The Illusion of Deep Learning Architectures Paper • 2512.24695 • Published Dec 31, 2025 • 42
Yume-1.5: A Text-Controlled Interactive World Generation Model Paper • 2512.22096 • Published Dec 26, 2025 • 60
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published Dec 18, 2025 • 95
RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards Paper • 2512.00473 • Published Nov 29, 2025 • 26
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published Nov 27, 2025 • 236
One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation Paper • 2512.07829 • Published Dec 8, 2025 • 22
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published Oct 13, 2025 • 166
Reverse-Engineered Reasoning for Open-Ended Generation Paper • 2509.06160 • Published Sep 7, 2025 • 149
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale Paper • 2508.10711 • Published Aug 14, 2025 • 145
Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models Paper • 2508.00819 • Published Aug 1, 2025 • 63
X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again Paper • 2507.22058 • Published Jul 29, 2025 • 40
"PhyWorldBench": A Comprehensive Evaluation of Physical Realism in Text-to-Video Models Paper • 2507.13428 • Published Jul 17, 2025 • 16
Running on Zero MCP Featured 1.48k LTX Video Fast 🎥 1.48k ultra-fast video model, LTX 0.9.8 13B distilled