Using Scrapy with different / many proxies

Ref. to the previous post (Using Scrapy with proxies), I mentioned how to use a SINGLE proxy with Scrapy.

Now, what if you have different proxies ? here are a simple few changes to make it .

1. Add a new array with your proxies to your config file as follows :

PROXIES = [{'ip_port': 'xx.xx.xx.xx:xxxx', 'user_pass': 'foo:bar'},
           {'ip_port': 'PROXY2_IP:PORT_NUMBER', 'user_pass': 'username:password'},
           {'ip_port': 'PROXY3_IP:PORT_NUMBER', 'user_pass': ''},]

2. Update your middlewares.py file to the following :

import base64
import random
from settings import PROXIES

class ProxyMiddleware(object):
    def process_request(self, request, spider):
        proxy = random.choice(PROXIES)
        if proxy['user_pass'] is not None:
            request.meta['proxy'] = "http://%s" % proxy['ip_port']
            encoded_user_pass = base64.encodestring(proxy['user_pass'])
            request.headers['Proxy-Authorization'] = 'Basic ' + encoded_user_pass            
        else:
            request.meta['proxy'] = "http://%s" % proxy['ip_port']

That’s it :) !

Leave a comment

 

Anti-Spam Protection by WP-SpamFree