Scrapy project
@scrapy (Organization)
An open source and collaborative framework for extracting the data you need from websites in a fast, simple, yet extensible way.
On the leaderboard
| Rank | Repository | Stars |
|---|---|---|
| 270 | scrapy/scrapy | 61,075 |
Top repositories by stars
- scrapy/scrapy (on leaderboard)
  Scrapy, a fast high-level web crawling & scraping framework for Python.
  Python, 59,757 stars
- scrapy/scrapyd
  A service daemon to run Scrapy spiders
  Python, 3,085 stars
- scrapy/scrapely
  A pure-python HTML screen-scraping library
  HTML, 1,888 stars
- scrapy/dirbot
  Scrapy project to scrape public web directories (educational) [DEPRECATED]
  Python, 1,630 stars
- scrapy/quotesbot
  This is a sample Scrapy project for educational purposes
  Python, 1,353 stars
- scrapy/parsel
  Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
  Python, 1,313 stars
- scrapy/scrapyd-client
  Command line client for Scrapyd server
  Python, 778 stars
- scrapy/w3lib
  Python library of web-related functions
  Python, 414 stars
- scrapy/cssselect
  CSS Selectors for Python
  Python, 307 stars
- scrapy/queuelib
  Collection of persistent (disk-based) and non-persistent (memory-based) queues for Python
  Python, 292 stars
- scrapy/loginform
  Fill HTML login forms automatically
  Python, 277 stars
- scrapy/protego
  A pure-Python robots.txt parser with support for modern conventions.
  DIGITAL Command Language, 79 stars
- scrapy/itemadapter
  Common interface for data container classes
  Python, 68 stars
- scrapy/scrapy.org
  The scrapy.org website
  HTML, 65 stars
- scrapy/itemloaders
  Library to populate items using XPath and CSS with a convenient API
  Python, 47 stars
- scrapy/booksbot
  A crawler for http://books.toscrape.com
  Python, 42 stars
- scrapy/scrapy-bench
  A CLI for benchmarking Scrapy.
  Python, 32 stars
- scrapy/scrapy-lint
  A linter for Scrapy projects.
  Python, 21 stars
- scrapy/scurl
  Performance-focused replacement for Python urllib
  Python, 21 stars
- scrapy/pypydispatcher
  A fork of http://pydispatcher.sourceforge.net/ with PyPy support
  Python, 16 stars
- scrapy/xtractmime
  https://mimesniff.spec.whatwg.org/ implementation for Python
  Python, 13 stars
- scrapy/base-chromium
  base component forked from Chromium source https://chromium.googlesource.com/chromium/src/base/
  C++, 7 stars
- scrapy/scrapy-itemloader
  [Archived] Library to populate Scrapy items using XPath and CSS with a convenient API
  Python, 6 stars
- scrapy/form2request
  Python 3.8+ library to build HTTP requests out of HTML forms
  Python, 4 stars
- scrapy/url-chromium
  url component from Chromium source code, forked from https://chromium.googlesource.com/chromium/src/url
  C++, 3 stars
- scrapy/gsoc2014-integration-tests
  GSoC2014 - Scrapy Integration tests project
  Shell, 3 stars
- scrapy/scrapy-bench-speedcenter
  Codespeed for scrapy-bench
  Python, 2 stars
- scrapy/sphinx-scrapy
  Sphinx extension for documentation in the Scrapy ecosystem
  Python, 1 star