I created a Scrapy project with several spiders to crawl some websites. Now I want to use TOR to:
After a lot of research, I found a way to set up my Scrapy project to work with TOR on Windows:
Recent versions of TOR for Windows don't come with a graphical user interface (2). It is probably possible to set up TOR using only config files and cmd commands, but for me the best option was to use Vidalia. Download it (3) and unzip the files to a folder (e.g. vidalia-standalone-0.2.21-win32). Run "Start Vidalia.exe" and go to Settings. On the "General" tab, point Vidalia to the TOR executable (\tor-win32-0.2.6.10\Tor\tor.exe).
On the "Advanced" tab, check which torrc file is set in the "Tor Configuration File" section. I have the following ports configured in it:
ControlPort 9151
SocksPort 9050
Click "Start Tor" on the Vidalia Control Panel. After some processing, you should see the status message "Connected to the Tor network!".
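As an optional sanity check, you can confirm that something is actually listening on the ports from the torrc file before moving on. The snippet below is only a minimal sketch using the Python standard library; the helper name `port_is_open` is my own, and it assumes TOR runs on the same machine with the ports shown above. It only tells you that a TCP connection is accepted, not that the listener is really TOR.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port is accepted.

    Hypothetical helper: it checks only that something is listening,
    not that the listener is TOR.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


# Example usage (ports taken from the torrc settings above):
#   port_is_open("127.0.0.1", 9050)  # SocksPort
#   port_is_open("127.0.0.1", 9151)  # ControlPort
```

If the SocksPort check returns False, look at the Vidalia message log before continuing with Polipo.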
Download the Polipo proxy (4) and unzip the files to a folder (e.g. polipo-1.1.0-win32). You can read more about this proxy at link (5).
Edit the file config.sample and add the following lines to it (at the beginning of the file, for example):
socksParentProxy = "localhost:9050"
socksProxyType = socks5
diskCacheRoot = ""
Start Polipo from cmd: go to the folder where you unzipped the files and run the command "polipo.exe -c config.sample".
Now you have Polipo and TOR up and running. Polipo accepts HTTP requests on port 8123 and forwards them to TOR over the SOCKS protocol on port 9050.
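To see whether the Polipo/TOR chain works before touching Scrapy, you can send a request through the HTTP proxy from plain Python. This is a rough sketch under a few assumptions: Polipo listens on 127.0.0.1:8123 as described above, the function names are my own, and the check relies on the public page https://check.torproject.org/ containing the word "Congratulations" when the request arrives via TOR (that wording could change).

```python
import urllib.request

# Assumption: Polipo is running locally on port 8123, as described above.
POLIPO_PROXY = "http://127.0.0.1:8123"


def build_proxy_opener(proxy_url=POLIPO_PROXY):
    """Build a urllib opener that sends HTTP and HTTPS traffic via the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def looks_like_tor(opener, timeout=30):
    """Fetch the TOR check page through the opener and inspect the body.

    Assumption: the page contains "Congratulations" when reached via TOR.
    """
    with opener.open("https://check.torproject.org/", timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
    return "Congratulations" in body


# Example usage (needs Polipo and TOR running):
#   print(looks_like_tor(build_proxy_opener()))
```

If the call raises a connection error, Polipo is probably not running. If it returns False, the request reached the site but most likely not through TOR.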
Now you can follow the rest of the tutorial "Torifying Scrapy Project On Ubuntu" (6). Continue from the step where the tutorial explains how to test the TOR/Polipo communication.
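For orientation, the Scrapy side usually comes down to pointing each request at the Polipo port. The following is only a sketch of one common way to do that, not necessarily what the tutorial does: the class name `PolipoProxyMiddleware` and the module path `myproject.middlewares` are hypothetical, and it relies on Scrapy's built-in HTTP proxy support honoring `request.meta["proxy"]`.

```python
class PolipoProxyMiddleware:
    """Hypothetical Scrapy downloader middleware.

    Routes every request through the local Polipo proxy, which in turn
    forwards the traffic to TOR.
    """

    # Assumption: Polipo listens on its port 8123 on this machine.
    proxy_url = "http://127.0.0.1:8123"

    def process_request(self, request, spider):
        # Scrapy's built-in proxy handling reads this meta key.
        request.meta["proxy"] = self.proxy_url
        # Returning None lets Scrapy continue processing the request.
        return None


# In settings.py (the module path is an assumption about your project layout):
# DOWNLOADER_MIDDLEWARES = {
#     "myproject.middlewares.PolipoProxyMiddleware": 100,
# }
```

Keeping the proxy URL in one class attribute makes it easy to change if you configure Polipo on a different port.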