tika
apache/tika
The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).
Stars: 3,583
Forks: 912
Open issues: 63
Watchers: 3,583
Size: 420.2 MB
Language: Java
License: Apache License 2.0
Topics: content, extraction, java, metadata, tika
Created: May 21, 2009
Updated: Feb 27, 2026
Last push: Feb 27, 2026