OpenCrawl
janhq/OpenCrawl
🌐 OpenCrawl: An ethical, high-performance web crawler built for scale. A powerful web crawling library that respects robots.txt and rate limits while leveraging Kafka for high-throughput data processing. Built with ethics and efficiency in mind.
Stars: 15
Forks: 1
Open issues: 0
Watchers: 15
Size: 0.2 MB
Language: Python
License: Apache License 2.0
Topics: data-processing, kafka, llm-integration, python, robots-txt, web-crawler
Created: Apr 1, 2025
Updated: Feb 14, 2026
Last push: Apr 3, 2025