DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21, 2025 • 473
An Empirical Study of Autoregressive Pre-training from Videos Paper • 2501.05453 • Published Jan 9, 2025 • 41
Learning Video Representations without Natural Videos Paper • 2410.24213 • Published Oct 31, 2024 • 16 • 2
Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations Paper • 2410.02762 • Published Oct 3, 2024 • 9 • 2
Interpreting the Weight Space of Customized Diffusion Models Paper • 2406.09413 • Published Jun 13, 2024 • 20
Interpreting the Second-Order Effects of Neurons in CLIP Paper • 2406.04341 • Published Jun 6, 2024 • 2
Interpreting CLIP's Image Representation via Text-Based Decomposition Paper • 2310.05916 • Published Oct 9, 2023 • 2