Article LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family 6 days ago • 60
TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs Paper • 2512.14698 • Published Dec 16, 2025 • 21
Perceptual Taxonomy: Evaluating and Guiding Hierarchical Scene Reasoning in Vision-Language Models Paper • 2511.19526 • Published Nov 24, 2025 • 2
Article Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval Mar 22, 2024 • 125
Article Why We Built VIBE Bench: Rethinking Evaluation for Real Workloads 19 days ago • 6
Article Diversity Vs Density: A data strategy comparison for fine-tuning VLMs 20 days ago • 5
CHURRO: Making History Readable with an Open-Weight Large Vision-Language Model for High-Accuracy, Low-Cost Historical Text Recognition Paper • 2509.19768 • Published Sep 24, 2025 • 6
FiNERweb: Datasets and Artifacts for Scalable Multilingual Named Entity Recognition Paper • 2512.13884 • Published Dec 15, 2025 • 15
fiNERweb Collection A multilingual dataset for NER covering 91 languages and 25 scripts • 3 items • Updated Dec 16, 2025 • 1
Datasets Wrapped 2025: Reasoning Collection The reasoning datasets that defined 2025. Part 1 of Datasets Wrapped 2025. #DatasetsWrapped2025 • 20 items • Updated Dec 16, 2025 • 1
NeMo Gym Collection Collection of RL verifiable data for NeMo Gym • 13 items • Updated 5 days ago • 38