fulandiege's Collections: papers
Towards Scalable Pre-training of Visual Tokenizers for Generation
Paper
• 2512.13687
• Published
• 106
MMGR: Multi-Modal Generative Reasoning
Paper
• 2512.14691
• Published
• 119
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss
Paper
• 2512.23447
• Published
• 98
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation
Paper
• 2512.23576
• Published
• 65
mHC: Manifold-Constrained Hyper-Connections
Paper
• 2512.24880
• Published
• 312
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models
Paper
• 2512.24618
• Published
• 151
PhyGDPO: Physics-Aware Groupwise Direct Preference Optimization for Physically Consistent Text-to-Video Generation
Paper
• 2512.24551
• Published
• 21
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem
Paper
• 2512.24873
• Published
• 105
Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling
Paper
• 2512.23959
• Published
• 112
Agent Learning via Early Experience
Paper
• 2510.08558
• Published
• 273
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting
Paper
• 2601.02151
• Published
• 109
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization
Paper
• 2601.05432
• Published
• 167
MMFormalizer: Multimodal Autoformalization in the Wild
Paper
• 2601.03017
• Published
• 105
Qwen3-VL Technical Report
Paper
• 2511.21631
• Published
• 158
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
Paper
• 2512.08765
• Published
• 133
Less is More: Recursive Reasoning with Tiny Networks
Paper
• 2510.04871
• Published
• 509
Advancing Open-source World Models
Paper
• 2601.20540
• Published
• 128
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration
Paper
• 2602.05400
• Published
• 341
ASA: Training-Free Representation Engineering for Tool-Calling Agents
Paper
• 2602.04935
• Published
• 41
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters
Paper
• 2602.10604
• Published
• 185
Kimi K2.5: Visual Agentic Intelligence
Paper
• 2602.02276
• Published
• 251
GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics
Paper
• 2602.12617
• Published
• 20
MedXIAOHE: A Comprehensive Recipe for Building Medical MLLMs
Paper
• 2602.12705
• Published
• 62
DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories
Paper
• 2602.10809
• Published
• 52
SLA2: Sparse-Linear Attention with Learnable Routing and QAT
Paper
• 2602.12675
• Published
• 53
Unified Latents (UL): How to train your latents
Paper
• 2602.17270
• Published
• 54
Code2World: A GUI World Model via Renderable Code Generation
Paper
• 2602.09856
• Published
• 195
Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents
Paper
• 2602.16855
• Published
• 46
World Craft: Agentic Framework to Create Visualizable Worlds via Text
Paper
• 2601.09150
• Published
• 20
VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training
Paper
• 2602.10693
• Published
• 215
Does Your Reasoning Model Implicitly Know When to Stop Thinking?
Paper
• 2602.08354
• Published
• 246
A Very Big Video Reasoning Suite
Paper
• 2602.20159
• Published
• 491
On Data Engineering for Scaling LLM Terminal Capabilities
Paper
• 2602.21193
• Published
• 88
Query-focused and Memory-aware Reranker for Long Context Processing
Paper
• 2602.12192
• Published
• 51
HyTRec: A Hybrid Temporal-Aware Attention Architecture for Long Behavior Sequential Recommendation
Paper
• 2602.18283
• Published
• 53