Welcome to Scrapy
What is Scrapy?
Scrapy is a fast, high-level screen scraping and web crawling framework used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
Features
- Simple
- Scrapy was designed with simplicity in mind: it provides the features you need without getting in your way
- Productive
- Just write the rules to extract the data from web pages and let Scrapy crawl the entire website for you
- Fast
- Scrapy is used in production crawlers that fully scrape more than 500 retailer sites daily, all on a single server
- Extensible
- Scrapy was designed with extensibility in mind, so it provides several mechanisms to plug in new code without having to touch the framework core
- Portable, open-source, 100% Python
- Scrapy is completely written in Python and runs on Linux, Windows, Mac and BSD
- Batteries included
- Scrapy comes with lots of functionality built in. See the documentation for a list of the built-in features.
- Well-documented & well-tested
- Scrapy is extensively documented and has a comprehensive test suite with very good code coverage
- Good community and commercial support
- Scrapy has a healthy community and very good commercial support.
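As a rough illustration of the extensibility point above, here is a minimal sketch of an item pipeline, one of the mechanisms for plugging in your own code. It assumes the commonly documented pipeline interface, a plain Python class with a `process_item(item, spider)` method; check the signature against the Scrapy version you run. The class name `PriceNormalizerPipeline`, the `price` field, and the `myproject` module path are hypothetical and used only for this example.

```python
class PriceNormalizerPipeline:
    """Hypothetical pipeline: turns a scraped price string into a float."""

    def process_item(self, item, spider):
        # Items expose a dict-like API, so a plain dict works for this sketch.
        raw = str(item.get("price", "")).strip()
        # Keep digits and the decimal point only, e.g. "$1,299.50" -> "1299.50".
        # Assumes "." is the decimal separator and "," a thousands separator.
        cleaned = "".join(ch for ch in raw if ch.isdigit() or ch == ".")
        item["price"] = float(cleaned) if cleaned else None
        return item


# Standalone check, with a plain dict standing in for a scraped item.
# In a project, the pipeline would typically be enabled through the
# ITEM_PIPELINES setting, e.g. (hypothetical module path):
#   ITEM_PIPELINES = {"myproject.pipelines.PriceNormalizerPipeline": 300}
item = PriceNormalizerPipeline().process_item(
    {"name": "Widget", "price": "$1,299.50"}, spider=None
)
print(item)
```

The class needs no Scrapy import and no change to the framework itself, which is the point the feature list makes: custom processing lives in your own project code and is switched on through configuration.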
Still not sure if Scrapy is what you're looking for? Check out Scrapy at a glance.
Companies using Scrapy
Scrapy is used in large production environments to crawl thousands of sites daily. Here is a list of companies using Scrapy.
Where to start?
Start by reading Scrapy at a glance, then download Scrapy and follow the Tutorial.