tesseract-ocr/langdata
Source training data for Tesseract for lots of languages