janhq/OpenCrawl

🌐 OpenCrawl: An ethical, high-performance web crawler built for scale. OpenCrawl is a web crawling library that respects robots.txt and rate limits and uses Kafka for high-throughput data processing.
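To make "respects robots.txt and rate limits" concrete, here is a minimal sketch of such a politeness check using only Python's standard `urllib.robotparser`. It is not OpenCrawl's API: the `PolitenessGate` class, the `demo-bot` user agent, the sample robots.txt content, and the one-second fallback delay are all assumptions made for illustration.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for this illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


class PolitenessGate:
    """Illustrative helper (not part of OpenCrawl): combines a robots.txt
    permission check with a per-host minimum delay between fetches."""

    def __init__(self, robots_txt: str, user_agent: str,
                 default_delay: float = 1.0) -> None:
        self.parser = RobotFileParser()
        # parse() takes an iterable of lines, so no network access is needed.
        self.parser.parse(robots_txt.splitlines())
        self.user_agent = user_agent
        self.default_delay = default_delay  # assumed fallback, in seconds
        self.last_fetch: Optional[float] = None

    def allowed(self, url: str) -> bool:
        """Return True if robots.txt permits this user agent to fetch url."""
        return self.parser.can_fetch(self.user_agent, url)

    def delay(self) -> float:
        """Use the declared Crawl-delay if present, else the fallback."""
        declared = self.parser.crawl_delay(self.user_agent)
        return float(declared) if declared is not None else self.default_delay

    def wait_time(self, now: float) -> float:
        """Seconds to wait before the next fetch, given the current time."""
        if self.last_fetch is None:
            return 0.0
        return max(0.0, self.last_fetch + self.delay() - now)

    def record_fetch(self, now: float) -> None:
        """Remember when the most recent fetch happened."""
        self.last_fetch = now
```

In a real crawler, `now` would typically come from `time.monotonic()`, and the caller would sleep for `wait_time(now)` before each request to an allowed URL. Passing the clock in explicitly keeps the sketch easy to test.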

Stars: 24
Forks: 1
Open issues: 0
Watchers: 24
Size: 0.2 MB
Language: Python
License: Apache License 2.0
Topics: data-processing, kafka, llm-integration, python, robots-txt, web-crawler
Created: Apr 1, 2025
Updated: May 22, 2026
Last push: Apr 3, 2025