https://github.com/scrapy/scrapy
Score: 33.00590597501729
Last synced: about 8 hours ago
Repository metadata:
Scrapy, a fast high-level web crawling & scraping framework for Python.
- Host: GitHub
- URL: https://github.com/scrapy/scrapy
- Owner: scrapy
- License: bsd-3-clause
- Created: 2010-02-22T02:01:14.000Z (almost 16 years ago)
- Default Branch: master
- Last Pushed: 2026-01-31T15:48:54.000Z (5 days ago)
- Last Synced: 2026-02-02T01:09:45.278Z (4 days ago)
- Topics: crawler, crawling, framework, hacktoberfest, python, scraping, web-scraping, web-scraping-python
- Language: Python
- Homepage: https://scrapy.org
- Size: 27.8 MB
- Stars: 59,620
- Watchers: 1,763
- Forks: 11,226
- Open Issues: 667
Metadata Files:
- Readme: README.rst
- Changelog: NEWS
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
- Authors: AUTHORS
Owner metadata:
- Name: Scrapy project
- Login: scrapy
- Email:
- Kind: organization
- Description: An open source and collaborative framework for extracting the data you need from websites in a fast, simple, yet extensible way.
- Website: https://scrapy.org
- Location:
- Twitter:
- Company:
- Icon url: https://avatars.githubusercontent.com/u/733635?v=4
- Repositories: 26
- Last Synced at: 2024-03-25T19:52:43.990Z
- Profile URL: https://github.com/scrapy
Committers metadata
Last synced: 3 days ago
Total Commits: 8,866
Total Committers: 672
Avg Commits per committer: 13.193
Development Distribution Score (DDS): 0.792
Commits in past year: 256
Committers in past year: 31
Avg Commits per committer in past year: 8.258
Development Distribution Score (DDS) in past year: 0.316
| Name | Email | Commits |
|---|---|---|
| Pablo Hoffman | p****o@p****m | 1843 |
| Andrey Rakhmatullin | w****r@w****e | 836 |
| Daniel Graña | d****a@g****m | 743 |
| Adrián Chaves | a****n@c****o | 509 |
| Eugenio Lacuesta | e****a@g****m | 455 |
| elpolilla | n****e@n****e | 420 |
| Paul Tremberth | p****h@g****m | 392 |
| Mikhail Korobov | k****4@g****m | 339 |
| Ismael Carnales | i****s@g****m | 248 |
| Julia Medina | w****a@g****m | 174 |
| Konstantin Lopuhin | k****n@g****m | 133 |
| Laerte Pereira | l****k@g****m | 120 |
| Elias Dorneles | e****s@g****m | 116 |
| Aditya | k****0@g****m | 68 |
| Vostretsov Nikita | w****n@g****m | 68 |
| nyov | n****v@n****t | 60 |
| Jakob de Maeyer | j****1@g****m | 59 |
| Rolando Espinoza La fuente | d****o@g****m | 54 |
| Jalil SA | 6****l | 41 |
| Victor Torres | v****s@g****m | 38 |
| BroodingKangaroo | j****3@g****m | 35 |
| Edwin O Marshall | e****5@g****m | 32 |
| nramirezuy | n****y@g****m | 31 |
| Pawel Miech | p****m@g****m | 31 |
| Georgiy Zatserklianyi | g****y@g****m | 30 |
| Νικόλαος-Διγενής Καραγιάννης | d****l@g****m | 30 |
| Anubhav Patel | a****8@g****m | 30 |
| Maram Sumanth | m****h@g****m | 29 |
| GeorgeA92 | g****y@g****m | 29 |
| Valdir Stumm Junior | s****r@g****m | 29 |
| and 642 more... | | |
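The all-time figures above are consistent with a common definition of the Development Distribution Score: one minus the top committer's share of all commits. The sketch below assumes that definition (this page does not state it), and the function name is hypothetical. It also reproduces the average commits per committer from the two totals.

```python
def development_distribution_score(top_committer_commits: int, total_commits: int) -> float:
    """DDS under the assumed definition: 1 - (top committer's commits / total commits)."""
    if total_commits <= 0:
        raise ValueError("total_commits must be positive")
    return 1 - top_committer_commits / total_commits


# Values copied from the committers metadata above.
total_commits = 8866
total_committers = 672
top_committer_commits = 1843  # Pablo Hoffman, first row of the table

dds = development_distribution_score(top_committer_commits, total_commits)
avg_commits = total_commits / total_committers

print(round(dds, 3))          # compare with the listed DDS of 0.792
print(round(avg_commits, 3))  # compare with the listed average of 13.193
```

The past-year DDS cannot be rechecked the same way, because the table lists all-time commit counts only.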
Issue and Pull Request metadata
Last synced: 2 days ago
Total issues: 655
Total pull requests: 1,433
Average time to close issues: over 1 year
Average time to close pull requests: 8 months
Total issue authors: 357
Total pull request authors: 427
Average comments per issue: 4.28
Average comments per pull request: 2.26
Merged pull requests: 658
Bot issues: 0
Bot pull requests: 0
Past year issues: 116
Past year pull requests: 449
Past year average time to close issues: 11 days
Past year average time to close pull requests: 7 days
Past year issue authors: 43
Past year pull request authors: 117
Past year average comments per issue: 1.99
Past year average comments per pull request: 1.49
Past year merged pull requests: 206
Past year bot issues: 0
Past year bot pull requests: 0
Top Issue Authors
- wRAR (129)
- Gallaecio (52)
- mohmad-null (26)
- kmike (18)
- Ehsan-U (11)
- Prometheus3375 (11)
- pawelmhm (5)
- elacuesta (5)
- GeorgeA92 (5)
- starrify (4)
- ghost (4)
- ddebernardy (4)
- lopuhin (4)
- Urahara (4)
- jtoallen (4)
Top Pull Request Authors
- wRAR (404)
- Gallaecio (124)
- Laerte (53)
- Rotzbua (14)
- jxlil (14)
- mlmsmith (14)
- GeorgeA92 (13)
- Rohitkr117 (10)
- Sintivrousai (10)
- elramen (9)
- MehrazRumman (9)
- kumar-sanchay (9)
- thalissonvs (9)
- mery16q (9)
- LucasSD (8)
Top Issue Labels
- enhancement (225)
- bug (112)
- good first issue (69)
- docs (63)
- discuss (58)
- cleanup (45)
- CI (40)
- asyncio (28)
- help wanted (20)
- needs more info (19)
- media pipelines (17)
- logging (15)
- upstream issue (13)
- performance (13)
- http (10)
- https (10)
- security (7)
- link extraction (6)
- not reproducible (5)
- typing (5)
- Windows (5)
- contracts (4)
- stale (3)
- HTTP/2 (3)
- backward-incompatible (2)
- gsoc-candidate (2)
- patch available (1)
- macOS (1)
- waiting feedback (1)
- S3 (1)
Top Pull Request Labels
- enhancement (203)
- CI (163)
- cleanup (138)
- bug (75)
- typing (70)
- docs (56)
- asyncio (50)
- spam (25)
- discuss (13)
- media pipelines (8)
- needs more info (8)
- hacktoberfest-accepted (7)
- backward-incompatible (5)
- logging (5)
- security (4)
- https (3)
- upstream issue (3)
- Windows (3)
- link extraction (3)
- in progress (3)
- contracts (2)
- help wanted (2)
- S3 (1)
- item loaders (1)
- waiting feedback (1)
- stale (1)
- http (1)
- macOS (1)
- performance (1)
Package metadata
- Total packages: 8
Total downloads:
- pypi: 4,737,940 last month
- Total docker downloads: 588,557
- Total dependent packages: 142 (may contain duplicates)
- Total dependent repositories: 2,831 (may contain duplicates)
- Total versions: 169
- Total maintainers: 8
- Total advisories: 15
pypi.org: scrapy
A high-level Web Crawling and Web Scraping framework
- Homepage: https://scrapy.org/
- Documentation: https://docs.scrapy.org/
- Licenses: BSD-3-Clause
- Latest release: 2.14.1 (published 24 days ago)
- Last Synced: 2026-02-04T17:23:48.426Z (1 day ago)
- Versions: 107
- Dependent Packages: 136
- Dependent Repositories: 2,753
- Downloads: 4,737,749 Last month
- Docker Downloads: 588,557
Rankings:
- Stargazers count: 0.021%
- Forks count: 0.05%
- Dependent packages count: 0.154%
- Dependent repos count: 0.202%
- Downloads: 0.273%
- Average: 0.358%
- Docker downloads count: 1.447%
- Maintainers (4)
Advisories:
- Duplicate Advisory: Scrapy leaks the authorization header on same-domain but cross-origin redirects
- Scrapy allows redirect following in protocols other than HTTP
- Scrapy's redirects ignoring scheme-specific proxy settings
- Scrapy leaks the authorization header on same-domain but cross-origin redirects
- Duplicate Advisory: Scrapy decompression bomb vulnerability
- Duplicate Advisory: Scrapy authorization header leakage on cross-domain redirect
- Duplicate Advisory: ReDos vulnerability of XMLFeedSpider
- Scrapy decompression bomb vulnerability
- Scrapy authorization header leakage on cross-domain redirect
- Scrapy vulnerable to ReDoS via XMLFeedSpider
- Scrapy before 2.6.2 and 1.8.3 vulnerable to one proxy sending credentials to another
- Scrapy denial of service vulnerability
- Scrapy cookie-setting is not restricted based on the public suffix list
- Incorrect Authorization and Exposure of Sensitive Information to an Unauthorized Actor in scrapy
- Scrapy HTTP authentication credentials potentially leaked to target websites
conda-forge.org: scrapy
Scrapy is an open source and collaborative framework for extracting the data you need from websites in a fast, simple, yet extensible way.
- Homepage: https://scrapy.org/
- Licenses: BSD-3-Clause-Clear
- Latest release: 2.7.1 (published over 3 years ago)
- Last Synced: 2026-01-30T06:27:58.397Z (7 days ago)
- Versions: 31
- Dependent Packages: 4
- Dependent Repositories: 38
Rankings:
- Stargazers count: 0.245%
- Forks count: 0.903%
- Average: 4.883%
- Dependent repos count: 5.884%
- Dependent packages count: 12.501%
pypi.org: pylab-utils
python utility tools
- Homepage: https://github.com/scrapy/scrapy
- Documentation: https://pylab-utils.readthedocs.io/
- Licenses: BSD
- Latest release: 0.5 (published over 4 years ago)
- Last Synced: 2026-01-30T06:26:29.416Z (7 days ago)
- Versions: 2
- Dependent Packages: 0
- Dependent Repositories: 1
- Downloads: 42 Last month
Rankings:
- Stargazers count: 0.02%
- Forks count: 0.049%
- Dependent packages count: 7.373%
- Average: 9.527%
- Downloads: 17.961%
- Dependent repos count: 22.233%
- Maintainers (1)
anaconda.org: scrapy
Scrapy is an open source and collaborative framework for extracting the data you need from websites in a fast, simple, yet extensible way.
- Homepage: https://scrapy.org
- Licenses: BSD-3-Clause
- Latest release: 2.13.4 (published 2 months ago)
- Last Synced: 2026-01-30T06:26:57.651Z (7 days ago)
- Versions: 15
- Dependent Packages: 2
- Dependent Repositories: 38
Rankings:
- Stargazers count: 0.666%
- Forks count: 2.511%
- Average: 12.511%
- Dependent packages count: 20.454%
- Dependent repos count: 26.415%
pypi.org: scrapy-hls
scrapy integration for m3u8 files
- Homepage: https://github.com/scrapy/scrapy
- Documentation: https://scrapy-hls.readthedocs.io/
- Licenses: BSD
- Latest release: 0.1 (published almost 5 years ago)
- Last Synced: 2026-01-30T06:26:31.956Z (7 days ago)
- Versions: 1
- Dependent Packages: 0
- Dependent Repositories: 1
- Downloads: 11 Last month
Rankings:
- Stargazers count: 0.02%
- Forks count: 0.049%
- Dependent packages count: 7.373%
- Average: 18.879%
- Dependent repos count: 22.233%
- Downloads: 64.72%
- Maintainers (1)
pypi.org: bf-scrapy-base
Secondary development based on Scrapy
- Homepage: https://scrapy.org/
- Documentation: https://docs.scrapy.org/
- Licenses: BSD-3-Clause
- Latest release: 0.0.6 (published 5 months ago)
- Last Synced: 2026-01-30T06:26:57.359Z (7 days ago)
- Versions: 6
- Dependent Packages: 0
- Dependent Repositories: 0
- Downloads: 26 Last month
Rankings:
- Dependent packages count: 8.63%
- Average: 28.63%
- Dependent repos count: 48.631%
- Maintainers (1)
pypi.org: scrapy-qfm
A high-level Web Crawling and Web Scraping framework
- Homepage: https://scrapy.org
- Documentation: https://docs.scrapy.org/
- Licenses: BSD
- Latest release: 2.11.2 (published about 1 year ago)
- Last Synced: 2026-01-30T06:26:31.028Z (7 days ago)
- Versions: 2
- Dependent Packages: 0
- Dependent Repositories: 0
- Downloads: 33 Last month
Rankings:
- Dependent packages count: 10.019%
- Average: 33.204%
- Dependent repos count: 56.388%
- Maintainers (1)
pypi.org: aminer-scrapy
A high-level Web Crawling and Web Scraping framework
- Homepage: https://scrapy.org
- Documentation: https://docs.scrapy.org/
- Licenses: BSD
- Latest release: 2.11.1 (published about 2 years ago)
- Last Synced: 2026-01-30T06:26:27.389Z (7 days ago)
- Versions: 5
- Dependent Packages: 0
- Dependent Repositories: 0
- Downloads: 79 Last month
Rankings:
- Dependent packages count: 10.037%
- Average: 38.143%
- Dependent repos count: 66.249%
- Maintainers (1)
Dependencies
- actions/checkout v3 composite
- actions/setup-python v4 composite
- pre-commit/action v3.0.0 composite
- pypa/gh-action-pypi-publish v1.6.4 composite
- sphinx ==5.0.2
- sphinx-hoverxref ==1.1.1
- sphinx-notfound-page ==0.8
- sphinx-rtd-theme ==1.0.0
- cryptography >=37.0.0
- cssselect >=0.9.1
- defusedxml >=0.7.1
- itemadapter >=0.1.0
- itemloaders >=1.0.1
- lxml >=4.6.4
- packaging *
- parsel >=1.5.0
- protego >=0.1.15
- pydispatcher >=2.0.5; platform_python_implementation == "CPython"
- pyopenssl >=22.0.0
- pypydispatcher >=2.1.0; platform_python_implementation == "PyPy"
- queuelib >=1.4.2
- service-identity >=18.1.0
- tldextract *
- twisted >=21.7.0,<=25.5.0
- w3lib >=1.17.0
- zope-interface >=5.1.0
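The two dispatcher entries in the list above carry environment markers, so only one of them applies to a given interpreter. The marker `platform_python_implementation` corresponds to the standard library's `platform.python_implementation()`. A minimal sketch of that selection follows; `select_dispatcher` is a hypothetical helper written for illustration, not part of Scrapy or of any installer.

```python
import platform


def select_dispatcher(implementation: str) -> str:
    """Return the dispatcher requirement whose marker matches the implementation name."""
    # Requirement strings copied from the dependency list above.
    requirements = {
        "CPython": "pydispatcher >=2.0.5",
        "PyPy": "pypydispatcher >=2.1.0",
    }
    return requirements.get(implementation, "none (no marker matches)")


# Report which entry would apply to the interpreter running this script.
current = platform.python_implementation()
print(current, "->", select_dispatcher(current))
```

On any other implementation neither marker evaluates to true, so the sketch reports that no dispatcher entry applies.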