AlirezaDoC's Collections Papers
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models
Paper
• 2602.12036
• Published
• 98
Reinforcement Learning for Self-Improving Agent with Skill Library
Paper
• 2512.17102
• Published
• 36
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation
Paper
• 2512.23705
• Published
• 45
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models
Paper
• 2512.19995
• Published
• 16
Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone
Paper
• 2512.22615
• Published
• 49
TimeBill: Time-Budgeted Inference for Large Language Models
Paper
• 2512.21859
• Published
• 25
Evaluating Parameter Efficient Methods for RLVR
Paper
• 2512.23165
• Published
• 28
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem
Paper
• 2512.24873
• Published
• 105
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models
Paper
• 2512.24618
• Published
• 151
SpotEdit: Selective Region Editing in Diffusion Transformers
Paper
• 2512.22323
• Published
• 39
Rethinking Sample Polarity in Reinforcement Learning with Verifiable Rewards
Paper
• 2512.21625
• Published
• 4
Nested Browser-Use Learning for Agentic Information Seeking
Paper
• 2512.23647
• Published
• 19
GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models
Paper
• 2512.15560
• Published
• 25
Distribution Matching Variational AutoEncoder
Paper
• 2512.07778
• Published
• 29
Self-Improving VLM Judges Without Human Annotations
Paper
• 2512.05145
• Published
• 20
OmniPSD: Layered PSD Generation with Diffusion Transformer
Paper
• 2512.09247
• Published
• 48
Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations
Paper
• 2512.21004
• Published
• 13
CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion
Paper
• 2512.19535
• Published
• 12
Multi-hop Reasoning via Early Knowledge Alignment
Paper
• 2512.20144
• Published
• 7
WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion
Paper
• 2512.19678
• Published
• 30
Step-DeepResearch Technical Report
Paper
• 2512.20491
• Published
• 86
3D-RE-GEN: 3D Reconstruction of Indoor Scenes with a Generative Framework
Paper
• 2512.17459
• Published
• 12
Beyond Memorization: A Multi-Modal Ordinal Regression Benchmark to Expose Popularity Bias in Vision-Language Models
Paper
• 2512.21337
• Published
• 31
KV-Embedding: Training-free Text Embedding via Internal KV Re-routing in Decoder-only LLMs
Paper
• 2601.01046
• Published
• 14
DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer
Paper
• 2601.01425
• Published
• 52
Talk2Move: Reinforcement Learning for Text-Instructed Object-Level Geometric Transformation in Scenes
Paper
• 2601.02356
• Published
• 14
VAR RL Done Right: Tackling Asynchronous Policy Conflicts in Visual Autoregressive Generation
Paper
• 2601.02256
• Published
• 33
Recursive Language Models
Paper
• 2512.24601
• Published
• 90
NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation
Paper
• 2601.02204
• Published
• 62
K-EXAONE Technical Report
Paper
• 2601.01739
• Published
• 92
mHC: Manifold-Constrained Hyper-Connections
Paper
• 2512.24880
• Published
• 312
OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation
Paper
• 2601.15369
• Published
• 21
Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning
Paper
• 2601.16163
• Published
• 14
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models
Paper
• 2601.15165
• Published
• 72
Learning to Discover at Test Time
Paper
• 2601.16175
• Published
• 42
ActionMesh: Animated 3D Mesh Generation with Temporal 3D Diffusion
Paper
• 2601.16148
• Published
• 12
Behavior Knowledge Merge in Reinforced Agentic Models
Paper
• 2601.13572
• Published
• 25
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders
Paper
• 2601.16208
• Published
• 52
Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model
Paper
• 2601.15892
• Published
• 53
Agentic-R: Learning to Retrieve for Agentic Search
Paper
• 2601.11888
• Published
• 19
LaViT: Aligning Latent Visual Thoughts for Multi-modal Reasoning
Paper
• 2601.10129
• Published
• 12
Agentic Reasoning for Large Language Models
Paper
• 2601.12538
• Published
• 198
ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development
Paper
• 2601.11077
• Published
• 65
NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems
Paper
• 2601.11004
• Published
• 30
Language of Thought Shapes Output Diversity in Large Language Models
Paper
• 2601.11227
• Published
• 9
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning
Paper
• 2601.09667
• Published
• 91
OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer
Paper
• 2601.14250
• Published
• 47
LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR
Paper
• 2601.14251
• Published
• 25
PhysRVG: Physics-Aware Unified Reinforcement Learning for Video Generative Models
Paper
• 2601.11087
• Published
• 11
Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning
Paper
• 2602.11748
• Published
• 30
Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR
Paper
• 2602.05261
• Published
• 49
LatentMem: Customizing Latent Memory for Multi-Agent Systems
Paper
• 2602.03036
• Published
• 14
Reinforced Attention Learning
Paper
• 2602.04884
• Published
• 28
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation
Paper
• 2602.03796
• Published
• 58
Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations
Paper
• 2602.05885
• Published
• 28
VLS: Steering Pretrained Robot Policies via Vision-Language Models
Paper
• 2602.03973
• Published
• 22
Efficient Autoregressive Video Diffusion with Dummy Head
Paper
• 2601.20499
• Published
• 8
CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs
Paper
• 2602.03048
• Published
• 32
ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought
Paper
• 2601.23184
• Published
• 36
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding
Paper
• 2602.01785
• Published
• 94
Balancing Understanding and Generation in Discrete Diffusion Models
Paper
• 2602.01362
• Published
• 16
Latent Chain-of-Thought as Planning: Decoupling Reasoning from Verbalization
Paper
• 2601.21358
• Published
• 7
daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently
Paper
• 2602.02619
• Published
• 50
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models
Paper
• 2601.22060
• Published
• 157
PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss
Paper
• 2602.02493
• Published
• 43
RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System
Paper
• 2602.02488
• Published
• 32
MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning
Paper
• 2601.21468
• Published
• 25
Innovator-VL: A Multimodal Large Language Model for Scientific Discovery
Paper
• 2601.19325
• Published
• 79
Memory-V2V: Augmenting Video-to-Video Diffusion Models with Memory
Paper
• 2601.16296
• Published
• 28
Self-Distillation Enables Continual Learning
Paper
• 2601.19897
• Published
• 26
Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation
Paper
• 2601.20614
• Published
• 119
Generation Enhances Understanding in Unified Multimodal Models via Multi-Representation Generation
Paper
• 2601.21406
• Published
• 5
Reinforcement Learning via Self-Distillation
Paper
• 2601.20802
• Published
• 40
Visual Personalization Turing Test
Paper
• 2601.22680
• Published
• 2
TTCS: Test-Time Curriculum Synthesis for Self-Evolving
Paper
• 2601.22628
• Published
• 35
EgoActor: Grounding Task Planning into Spatial-aware Egocentric Actions for Humanoid Robots via Visual-Language Models
Paper
• 2602.04515
• Published
• 38
DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation
Paper
• 2601.22153
• Published
• 71
Beyond Imitation: Reinforcement Learning for Active Latent Planning
Paper
• 2601.21598
• Published
• 9
Linear representations in language models can change dramatically over a conversation
Paper
• 2601.20834
• Published
• 21
Post-LayerNorm Is Back: Stable, ExpressivE, and Deep
Paper
• 2601.19895
• Published
• 24
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability
Paper
• 2601.18778
• Published
• 40
Endless Terminals: Scaling RL Environments for Terminal Agents
Paper
• 2601.16443
• Published
• 18
iFSQ: Improving FSQ for Image Generation with 1 Line of Code
Paper
• 2601.17124
• Published
• 33