Papers - a AlirezaDoC Collection

Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

AlirezaDoC 's Collections

Papers

updated 10 days ago

Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models

Paper • 2602.12036 • Published 17 days ago • 98
Reinforcement Learning for Self-Improving Agent with Skill Library

Paper • 2512.17102 • Published Dec 18, 2025 • 36
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation

Paper • 2512.23705 • Published Dec 29, 2025 • 45
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models

Paper • 2512.19995 • Published Dec 23, 2025 • 16
Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

Paper • 2512.22615 • Published Dec 27, 2025 • 49
TimeBill: Time-Budgeted Inference for Large Language Models

Paper • 2512.21859 • Published Dec 26, 2025 • 25
Evaluating Parameter Efficient Methods for RLVR

Paper • 2512.23165 • Published Dec 29, 2025 • 28
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

Paper • 2512.24873 • Published Dec 31, 2025 • 105
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models

Paper • 2512.24618 • Published Dec 31, 2025 • 151
SpotEdit: Selective Region Editing in Diffusion Transformers

Paper • 2512.22323 • Published Dec 26, 2025 • 39
Rethinking Sample Polarity in Reinforcement Learning with Verifiable Rewards

Paper • 2512.21625 • Published Dec 25, 2025 • 4
Nested Browser-Use Learning for Agentic Information Seeking

Paper • 2512.23647 • Published Dec 29, 2025 • 19
GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models

Paper • 2512.15560 • Published Dec 17, 2025 • 25
Distribution Matching Variational AutoEncoder

Paper • 2512.07778 • Published Dec 8, 2025 • 29
Self-Improving VLM Judges Without Human Annotations

Paper • 2512.05145 • Published Dec 2, 2025 • 20
OmniPSD: Layered PSD Generation with Diffusion Transformer

Paper • 2512.09247 • Published Dec 10, 2025 • 48
Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations

Paper • 2512.21004 • Published Dec 24, 2025 • 13
CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion

Paper • 2512.19535 • Published Dec 22, 2025 • 12
Multi-hop Reasoning via Early Knowledge Alignment

Paper • 2512.20144 • Published Dec 23, 2025 • 7
WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion

Paper • 2512.19678 • Published Dec 22, 2025 • 30
Step-DeepResearch Technical Report

Paper • 2512.20491 • Published Dec 23, 2025 • 86
3D-RE-GEN: 3D Reconstruction of Indoor Scenes with a Generative Framework

Paper • 2512.17459 • Published Dec 19, 2025 • 12
Beyond Memorization: A Multi-Modal Ordinal Regression Benchmark to Expose Popularity Bias in Vision-Language Models

Paper • 2512.21337 • Published Dec 24, 2025 • 31
KV-Embedding: Training-free Text Embedding via Internal KV Re-routing in Decoder-only LLMs

Paper • 2601.01046 • Published Jan 3 • 14
DreamID-V:Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer

Paper • 2601.01425 • Published Jan 4 • 52
Talk2Move: Reinforcement Learning for Text-Instructed Object-Level Geometric Transformation in Scenes

Paper • 2601.02356 • Published Jan 5 • 14
VAR RL Done Right: Tackling Asynchronous Policy Conflicts in Visual Autoregressive Generation

Paper • 2601.02256 • Published Jan 5 • 33
Recursive Language Models

Paper • 2512.24601 • Published Dec 31, 2025 • 90
NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation

Paper • 2601.02204 • Published Jan 5 • 62
K-EXAONE Technical Report

Paper • 2601.01739 • Published Jan 5 • 92
mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published Dec 31, 2025 • 312
OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation

Paper • 2601.15369 • Published Jan 21 • 21
Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning

Paper • 2601.16163 • Published Jan 22 • 14
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models

Paper • 2601.15165 • Published Jan 21 • 72
Learning to Discover at Test Time

Paper • 2601.16175 • Published Jan 22 • 42
ActionMesh: Animated 3D Mesh Generation with Temporal 3D Diffusion

Paper • 2601.16148 • Published Jan 22 • 12
Behavior Knowledge Merge in Reinforced Agentic Models

Paper • 2601.13572 • Published Jan 20 • 25
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders

Paper • 2601.16208 • Published Jan 22 • 52
Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model

Paper • 2601.15892 • Published Jan 22 • 53
Agentic-R: Learning to Retrieve for Agentic Search

Paper • 2601.11888 • Published Jan 17 • 19
LaViT: Aligning Latent Visual Thoughts for Multi-modal Reasoning

Paper • 2601.10129 • Published Jan 15 • 12
Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published Jan 18 • 198
ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development

Paper • 2601.11077 • Published Jan 16 • 65
NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems

Paper • 2601.11004 • Published Jan 16 • 30
Language of Thought Shapes Output Diversity in Large Language Models

Paper • 2601.11227 • Published Jan 16 • 9
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning

Paper • 2601.09667 • Published Jan 14 • 91
OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer

Paper • 2601.14250 • Published Jan 20 • 47
LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR

Paper • 2601.14251 • Published Jan 20 • 25
PhysRVG: Physics-Aware Unified Reinforcement Learning for Video Generative Models

Paper • 2601.11087 • Published Jan 16 • 11
Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning

Paper • 2602.11748 • Published 17 days ago • 30
Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR

Paper • 2602.05261 • Published 24 days ago • 49
LatentMem: Customizing Latent Memory for Multi-Agent Systems

Paper • 2602.03036 • Published 26 days ago • 14
Reinforced Attention Learning

Paper • 2602.04884 • Published 25 days ago • 28
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation

Paper • 2602.03796 • Published 26 days ago • 58
Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations

Paper • 2602.05885 • Published 24 days ago • 28
VLS: Steering Pretrained Robot Policies via Vision-Language Models

Paper • 2602.03973 • Published 26 days ago • 22
Efficient Autoregressive Video Diffusion with Dummy Head

Paper • 2601.20499 • Published Jan 28 • 8
CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs

Paper • 2602.03048 • Published 26 days ago • 32
ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought

Paper • 2601.23184 • Published 30 days ago • 36
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding

Paper • 2602.01785 • Published 27 days ago • 94
Balancing Understanding and Generation in Discrete Diffusion Models

Paper • 2602.01362 • Published 28 days ago • 16
Latent Chain-of-Thought as Planning: Decoupling Reasoning from Verbalization

Paper • 2601.21358 • Published Jan 29 • 7
daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently

Paper • 2602.02619 • Published 27 days ago • 50
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models

Paper • 2601.22060 • Published about 1 month ago • 157
PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss

Paper • 2602.02493 • Published 27 days ago • 43
RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System

Paper • 2602.02488 • Published 27 days ago • 32
MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning

Paper • 2601.21468 • Published Jan 29 • 25
Innovator-VL: A Multimodal Large Language Model for Scientific Discovery

Paper • 2601.19325 • Published Jan 27 • 79
Memory-V2V: Augmenting Video-to-Video Diffusion Models with Memory

Paper • 2601.16296 • Published Jan 22 • 28
Self-Distillation Enables Continual Learning

Paper • 2601.19897 • Published Jan 27 • 26
Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation

Paper • 2601.20614 • Published Jan 28 • 119
Generation Enhances Understanding in Unified Multimodal Models via Multi-Representation Generation

Paper • 2601.21406 • Published Jan 29 • 5
Reinforcement Learning via Self-Distillation

Paper • 2601.20802 • Published Jan 28 • 40
Visual Personalization Turing Test

Paper • 2601.22680 • Published about 1 month ago • 2
TTCS: Test-Time Curriculum Synthesis for Self-Evolving

Paper • 2601.22628 • Published about 1 month ago • 35
EgoActor: Grounding Task Planning into Spatial-aware Egocentric Actions for Humanoid Robots via Visual-Language Models

Paper • 2602.04515 • Published 25 days ago • 38
DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation

Paper • 2601.22153 • Published about 1 month ago • 71
Beyond Imitation: Reinforcement Learning for Active Latent Planning

Paper • 2601.21598 • Published about 1 month ago • 9
Linear representations in language models can change dramatically over a conversation

Paper • 2601.20834 • Published Jan 28 • 21
Post-LayerNorm Is Back: Stable, ExpressivE, and Deep

Paper • 2601.19895 • Published Jan 27 • 24
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability

Paper • 2601.18778 • Published Jan 26 • 40
Endless Terminals: Scaling RL Environments for Terminal Agents

Paper • 2601.16443 • Published Jan 23 • 18
iFSQ: Improving FSQ for Image Generation with 1 Line of Code

Paper • 2601.17124 • Published Jan 23 • 33

Collection guide
Browse collections

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs