PaddleOCR
PaddlePaddle/PaddleOCR
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
70,886 stars
Forks: 9,836
Open issues: 292
Watchers: 70,886
Size: 1750.7 MB
Python
Apache License 2.0
ai4science, chineseocr, document-parsing, document-translation, kie, ocr, paddleocr-vl, pdf-extractor-rag, pdf-parser, pdf2markdown, pp-ocr, pp-structure, rag
Created: May 8, 2020
Updated: Feb 18, 2026
Last push: Feb 16, 2026