UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation Paper • 2506.03147 • Published Jun 3, 2025 • 58
Flow-GRPO: Training Flow Matching Models via Online RL Paper • 2505.05470 • Published May 8, 2025 • 86
In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer Paper • 2504.20690 • Published Apr 29, 2025 • 19
Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models Paper • 2503.20240 • Published Mar 26, 2025 • 22
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy Paper • 2503.19757 • Published Mar 25, 2025 • 51
CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models Paper • 2503.18886 • Published Mar 24, 2025 • 24
Position: Interactive Generative Video as Next-Generation Game Engine Paper • 2503.17359 • Published Mar 21, 2025 • 61
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders Paper • 2503.18878 • Published Mar 24, 2025 • 119
Rewards Are Enough for Fast Photo-Realistic Text-to-image Generation Paper • 2503.13070 • Published Mar 17, 2025 • 10
Concat-ID: Towards Universal Identity-Preserving Video Synthesis Paper • 2503.14151 • Published Mar 18, 2025 • 10
Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control Paper • 2503.14492 • Published Mar 18, 2025 • 20
EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer Paper • 2503.07027 • Published Mar 10, 2025 • 29