PubTables-1M: Towards comprehensive table extraction from unstructured documents
Paper • 2110.00061 • Published • 3
Optimized Table Tokenization for Table Structure Recognition
Paper • 2305.03393 • Published • 1
Qwen3-VL Technical Report
Paper • 2511.21631 • Published • 152
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model
Paper • 2510.14528 • Published • 113
PaddlePaddle/PaddleOCR-VL
Image-Text-to-Text • 1.0B • Updated • 15.9k • 1.54k
DeepSeek-OCR: Contexts Optical Compression
Paper • 2510.18234 • Published • 92
Image-Text-to-Text • 3B • Updated • 2.98M • 3.13k
HunyuanOCR Technical Report
Paper • 2511.19575 • Published • 22
Image-Text-to-Text • 1.0B • Updated • 1.51M • 552
DocReward: A Document Reward Model for Structuring and Stylizing
Paper • 2510.11391 • Published • 27
SynthDoc: Bilingual Documents Synthesis for Visual Document Understanding
Paper • 2408.14764 • Published
OmniLayout: Enabling Coarse-to-Fine Learning with LLMs for Universal Document Layout Generation
Paper • 2510.26213 • Published • 10
MonkeyOCR v1.5 Technical Report: Unlocking Robust Document Parsing for Complex Patterns
Paper • 2511.10390 • Published
Structured Document Translation via Format Reinforcement Learning
Paper • 2512.05100 • Published • 2
DeepSeek-OCR 2: Visual Causal Flow
Paper • 2601.20552 • Published • 53
OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models
Paper • 2601.21639 • Published • 48
PaddleOCR-VL-1.5: Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing
Paper • 2601.21957 • Published • 14
MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning
Paper • 2601.21468 • Published • 20