One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation Paper • 2512.07829 • Published Dec 8, 2025 • 21
GR-RL: Going Dexterous and Precise for Long-Horizon Robotic Manipulation Paper • 2512.01801 • Published Dec 1, 2025 • 23
The Smol Training Playbook: The secrets to building world-class LLMs • Featured • 2.82k
BEAST: Efficient Tokenization of B-Splines Encoded Action Sequences for Imitation Learning Paper • 2506.06072 • Published Jun 6, 2025 • 2
FLOWER: Democratizing Generalist Robot Policies with Efficient Vision-Language-Action Flow Policies Paper • 2509.04996 • Published Sep 5, 2025 • 14
NaviTrace: Evaluating Embodied Navigation of Vision-Language Models Paper • 2510.26909 • Published Oct 30, 2025 • 13
World Simulation with Video Foundation Models for Physical AI Paper • 2511.00062 • Published Oct 28, 2025 • 40
Unified Diffusion VLA: Vision-Language-Action Model via Joint Discrete Denoising Diffusion Process Paper • 2511.01718 • Published Nov 3, 2025 • 6
Do You Need Proprioceptive States in Visuomotor Policies? Paper • 2509.18644 • Published Sep 23, 2025 • 49
MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer Paper • 2509.16197 • Published Sep 19, 2025 • 56
FLOWER VLA Collection Collection of checkpoints for the FLOWER VLA policy, a small and versatile VLA for language-conditioned robot manipulation with fewer than 1B parameters • 10 items • Updated Sep 17, 2025 • 4