While working on a Scrapy project, I wanted to store all spider statistics in a database so that I could access them later, so I wrote the following extension.
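A minimal sketch of such an extension, not necessarily the one used in the project. It assumes SQLite as the database; the class name `StatsToDatabase`, the `STATS_DB_PATH` setting and the `spider_stats` table are all made up for illustration. It listens for Scrapy's `spider_closed` signal and saves the collected stats as JSON:

```python
import json
import sqlite3
from datetime import datetime, timezone


class StatsToDatabase:
    """Hypothetical extension: saves the final stats of each spider run to SQLite."""

    def __init__(self, stats, db_path):
        self.stats = stats      # Scrapy's stats collector (crawler.stats)
        self.db_path = db_path  # path of the SQLite file

    @classmethod
    def from_crawler(cls, crawler):
        # Imported here so the class itself can be tried out without Scrapy installed.
        from scrapy import signals

        # STATS_DB_PATH is an assumed setting name, not a built-in Scrapy setting.
        db_path = crawler.settings.get("STATS_DB_PATH", "spider_stats.db")
        ext = cls(crawler.stats, db_path)
        crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
        return ext

    def spider_closed(self, spider, reason):
        row = (
            spider.name,
            reason,
            datetime.now(timezone.utc).isoformat(),
            # default=str turns datetime values in the stats into strings
            json.dumps(self.stats.get_stats(), default=str),
        )
        conn = sqlite3.connect(self.db_path)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "CREATE TABLE IF NOT EXISTS spider_stats "
                    "(spider TEXT, reason TEXT, finished_at TEXT, stats TEXT)"
                )
                conn.execute("INSERT INTO spider_stats VALUES (?, ?, ?, ?)", row)
        finally:
            conn.close()
```

To try it, the class would be registered in the `EXTENSIONS` setting of the project, under whatever module path it lives in.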
In the previous post (Using Scrapy with proxies), I showed how to use a SINGLE proxy with Scrapy.
Now, what if you have several proxies? Here are a few simple changes to make that work.
1. Add a new array with your proxies to your config file as follows:
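For example, the array could look like the sketch below. The setting name `PROXIES` is an assumption (any name works as long as the middleware reads the same one), and the addresses are placeholders to replace with your own:

```python
# settings.py
# PROXIES is an assumed, custom setting name; the addresses are placeholders.
PROXIES = [
    "http://proxy1.example.com:8080",
    "http://proxy2.example.com:8080",
    "http://proxy3.example.com:3128",
]
```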
2. Update your middlewares.py file to the following:
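A minimal sketch of such a middleware, assuming the proxies are stored in a list setting called `PROXIES` (a custom name, not a built-in Scrapy setting). The class name `RandomProxyMiddleware` is also just for illustration. It picks one proxy at random for every request and puts it in `request.meta["proxy"]`, which is the key Scrapy's HTTP proxy support reads:

```python
import random


class RandomProxyMiddleware:
    """Hypothetical downloader middleware: one random proxy per request."""

    def __init__(self, proxies):
        self.proxies = list(proxies)

    @classmethod
    def from_crawler(cls, crawler):
        # PROXIES is an assumed custom setting holding the proxy URLs.
        return cls(crawler.settings.getlist("PROXIES"))

    def process_request(self, request, spider):
        # With no proxies configured, leave the request untouched.
        if self.proxies:
            request.meta["proxy"] = random.choice(self.proxies)
        # Returning None lets Scrapy continue handling the request normally.
        return None
```

The class then needs an entry in `DOWNLOADER_MIDDLEWARES`, for example `{"myproject.middlewares.RandomProxyMiddleware": 350}`, where `myproject` stands for your own project name and the order value is only a suggestion.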
That’s it 🙂 !
I’m currently working on scraping some websites for B-kam.com. I used to develop in PHP, but when I searched for the best scraping / crawling tool, I found that Scrapy (written in Python) is the best. You can read more about it and how to start here : I searched a lot for how to use proxies with Scrapy but couldn’t find a simple, straightforward way to do it. Everyone talks about middlewares and...
The “History of the Internet” is an animated documentary explaining all the events and technologies that led to the invention of the Internet. A fascinating watch!