T2I Models
yandex/stable-diffusion-3.5-medium-alchemist (Text-to-Image model)
Paper (arXiv:2506.23044)
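For reference, a minimal sketch of loading the checkpoint above with the Hugging Face diffusers library, assuming the repo follows the standard Stable Diffusion 3.5 pipeline layout (the model card may recommend different settings; the prompt, step count, and guidance scale below are illustrative, not taken from the source):

import torch
from diffusers import StableDiffusion3Pipeline

# Assumption: the checkpoint is loadable with the standard SD 3.5 pipeline class.
pipe = StableDiffusion3Pipeline.from_pretrained(
    "yandex/stable-diffusion-3.5-medium-alchemist",
    torch_dtype=torch.bfloat16,
)
pipe = pipe.to("cuda")

# Generate a single image from a text prompt (example values only).
image = pipe(
    "a watercolor painting of a lighthouse at dawn",
    num_inference_steps=28,
    guidance_scale=4.5,
).images[0]
image.save("output.png")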
FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model (arXiv:2507.01953)
LongAnimation: Long Animation Generation with Dynamic Global-Local Memory (arXiv:2507.01945)
4KAgent: Agentic Any Image to 4K Super-Resolution (arXiv:2507.07105)
T-LoRA: Single Image Diffusion Model Customization Without Overfitting (arXiv:2507.05964)
LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS (arXiv:2507.07136)
NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining (arXiv:2507.14119)
DesignLab: Designing Slides Through Iterative Detection and Correction (arXiv:2507.17202)
PUSA V1.0: Surpassing Wan-I2V with $500 Training Cost by Vectorized Timestep Adaptation (arXiv:2507.16116)
ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts (arXiv:2507.20939)
X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again (arXiv:2507.22058)
Qwen-Image Technical Report (arXiv:2508.02324)
Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation (arXiv:2508.07981)
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale (arXiv:2508.10711)
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning (arXiv:2508.20751)
Emu3.5: Native Multimodal Models are World Learners (arXiv:2510.26583)
OmniLayout: Enabling Coarse-to-Fine Learning with LLMs for Universal Document Layout Generation (arXiv:2510.26213)
Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks (arXiv:2510.25760)
One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models (arXiv:2511.10629)
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation (arXiv:2511.14993)
Back to Basics: Let Denoising Generative Models Denoise (arXiv:2511.13720)
Light-X: Generative 4D Video Rendering with Camera and Illumination Control (arXiv:2512.05115)
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance (arXiv:2512.08765)
Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality (arXiv:2512.07951)
EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing (arXiv:2512.06065)
Towards Scalable Pre-training of Visual Tokenizers for Generation (arXiv:2512.13687)
Few-Step Distillation for Text-to-Image Generation: A Practical Guide (arXiv:2512.13006)
EgoX: Egocentric Video Generation from a Single Exocentric Video (arXiv:2512.08269)