Using Scrapy with different / many proxies

In the previous post (Using Scrapy with proxies), I showed how to use a SINGLE proxy with Scrapy.

Now, what if you have several proxies? Here are a few simple changes to make it work.

1. Add a new list with your proxies to your settings file (settings.py) as follows:

PROXIES = [{'ip_port': 'xx.xx.xx.xx:xxxx', 'user_pass': 'foo:bar'},
           {'ip_port': 'PROXY2_IP:PORT_NUMBER', 'user_pass': 'username:password'},
           {'ip_port': 'PROXY3_IP:PORT_NUMBER', 'user_pass': ''},]
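Since a typo in this list only shows up later as failed requests, it can help to check the entries once before running the spider. The following is a minimal sketch, not part of Scrapy: the function name `find_bad_proxies` and the sample addresses are made up for illustration, and it only assumes the `ip_port` / `user_pass` layout shown above.

```python
def find_bad_proxies(proxies):
    """Return (index, reason) pairs for entries that look malformed."""
    problems = []
    for i, proxy in enumerate(proxies):
        # Split on the last colon so the port is always the final part
        host, sep, port = (proxy.get('ip_port') or '').rpartition(':')
        if not sep or not host or not port.isdigit():
            problems.append((i, "ip_port should look like 'host:port'"))
        # An empty or missing user_pass means "no authentication" and is fine
        user_pass = proxy.get('user_pass')
        if user_pass and ':' not in user_pass:
            problems.append((i, "user_pass should look like 'user:password'"))
    return problems


# Hypothetical sample entries, for illustration only
sample = [
    {'ip_port': '10.0.0.1:8080', 'user_pass': 'foo:bar'},
    {'ip_port': '10.0.0.2', 'user_pass': ''},            # missing port
    {'ip_port': '10.0.0.3:3128', 'user_pass': 'nopassword'},  # missing ':'
]
problems = find_bad_proxies(sample)
```

With the sample above, the second and third entries are reported and the first one passes.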

2. Update your middlewares.py file as follows:

import base64
import random
from settings import PROXIES

class ProxyMiddleware(object):
    def process_request(self, request, spider):
        # Pick a proxy at random for every outgoing request
        proxy = random.choice(PROXIES)
        request.meta['proxy'] = "http://%s" % proxy['ip_port']
        # Send credentials only when user_pass is set (skips '' and None)
        if proxy.get('user_pass'):
            # b64encode adds no trailing newline, unlike the old encodestring
            encoded_user_pass = base64.b64encode(proxy['user_pass'].encode()).decode()
            request.headers['Proxy-Authorization'] = 'Basic ' + encoded_user_pass
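To see what ends up on a request for each kind of entry, the same logic can be tried outside Scrapy. This is only a sketch: the helper name `proxy_request_settings` and the sample addresses are hypothetical, and it assumes that an empty `user_pass` means the proxy needs no authentication.

```python
import base64


def proxy_request_settings(proxy):
    """Return (proxy_url, auth_header) for one PROXIES entry.

    auth_header is None when the entry carries no credentials.
    """
    proxy_url = "http://%s" % proxy['ip_port']
    user_pass = proxy.get('user_pass')
    if not user_pass:
        return proxy_url, None
    # HTTP Basic auth: base64 of "user:password", with no trailing newline
    token = base64.b64encode(user_pass.encode('utf-8')).decode('ascii')
    return proxy_url, 'Basic ' + token


# Hypothetical entries, for illustration only
with_auth = proxy_request_settings(
    {'ip_port': '10.0.0.1:8080', 'user_pass': 'foo:bar'})
without_auth = proxy_request_settings(
    {'ip_port': '10.0.0.2:3128', 'user_pass': ''})
```

The first value of each pair is what goes into request.meta['proxy'], and the second, when it is not None, is the Proxy-Authorization header value. As with any downloader middleware, the class also has to be listed in Scrapy's DOWNLOADER_MIDDLEWARES setting to take effect.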

That’s it :) !

1 Comment

  • Assalamo alaykom brother,

    I do have a question for you. Have you ever tried using public proxies to route your HTTP requests, like the ones on this website: http://hidemyass.com/proxy-list/search-226731
    One could save the proxies to a file, update it every hour, and run Scrapy over it!?
    I enjoy reading your posts. I was indeed searching for how to use Scrapy with random proxies, and here we go! Could you please tell me more about what hosting one would need to build a job portal… I have basic knowledge of Python and other programming languages and am looking forward to starting a small website to earn some halal money INCHA ALLAH.
    Best regards,
    Hicham
