Capattax - 10 months ago
Python Question

Python program that crawls for external IP address

I created a basic program to try to crawl a website for my external IP address with BeautifulSoup 4. However, I keep getting an AttributeError because the program can't get the string of the tag I am searching for. It appears that the tag does not exist and therefore cannot be crawled, but I know for a fact that it does exist. Does anyone know what is wrong?

Here is my code:

import requests, sys, io
from html.parser import HTMLParser
from bs4 import BeautifulSoup

url = ""
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, "cp437", "backslashreplace")
sourcecode = requests.get(url)
plaintext = sourcecode.text
soup = BeautifulSoup(plaintext, "html.parser")

tag = soup.find("span", {"style": "font-weight: bold; color:green;"})
ip = tag.string


It has nothing to do with JavaScript. If you look at the source that is actually returned, you can see:

<html style="height:100%"><head><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"><meta name="format-detection" content="telephone=no"><meta name="viewport" content="initial-scale=1.0"><meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1"></head><body style="margin:0px;height:100%"><iframe src="/_Incapsula_Resource?CWUDNSAI=24&xinfo=9-52943897-0 0NNN RT(1471643127529 69) q(0 -1 -1 -1) r(0 -1) B12(8,881022,0) U10000&incident_id=198001480102412051-472966643371608393&edet=12&cinfo=08000000" frameborder=0 width="100%" height="100%" marginheight="0px" marginwidth="0px">Request unsuccessful. Incapsula incident ID: 198001480102412051-472966643371608393</iframe></body></html>

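The site is protected by Incapsula, which has detected that you are a bot and does not give you the source you expect. The span you are searching for is not in this response, so soup.find returns None and tag.string raises the AttributeError.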
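One way to confirm this from inside the script is to check the response body for the Incapsula markers visible in the HTML above before parsing it. This is only a minimal sketch: the helper name is_incapsula_block is made up for illustration, and the two marker strings are taken from that one response, so other block pages may look different.

```python
# Marker strings copied from the blocked response shown above.
# Assumption: other Incapsula block pages contain at least one of them.
INCAPSULA_MARKERS = ("_Incapsula_Resource", "Incapsula incident ID")


def is_incapsula_block(html: str) -> bool:
    """Return True if the page text looks like the Incapsula block page."""
    return any(marker in html for marker in INCAPSULA_MARKERS)


# Possible use in the question's script, before calling soup.find:
#
#     if is_incapsula_block(plaintext):
#         raise RuntimeError("Blocked by Incapsula; the expected span is not in the page")
```

With a check like this the script fails with a message that names the real cause, instead of an AttributeError on tag.string.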

You can get your IP, and a lot more info, in JSON format:

url = ""
js = requests.get(url).json()

Or just your IP using httpbin:

url = ""
js = requests.get(url).json()
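To pull the address out of the httpbin response, something like the following should work. It is a sketch under assumptions: the URL https://httpbin.org/ip is not given above and is assumed here, the response is assumed to be a JSON object with an "origin" key, and the names parse_origin and fetch_external_ip are made up for illustration.

```python
import ipaddress

# Assumption: this is the httpbin endpoint meant above; the original URL is not shown.
HTTPBIN_IP_URL = "https://httpbin.org/ip"


def parse_origin(payload: dict) -> str:
    """Return one validated IP from an httpbin-style {"origin": "..."} payload."""
    # "origin" may hold several comma-separated addresses when proxies
    # are involved; keep only the first one.
    first = payload["origin"].split(",")[0].strip()
    # ip_address raises ValueError if the text is not a valid IPv4/IPv6 address.
    return str(ipaddress.ip_address(first))


def fetch_external_ip(url: str = HTTPBIN_IP_URL, timeout: float = 10.0) -> str:
    """Fetch the JSON document from the service and return the caller's IP."""
    # Imported here so parse_origin can be used without requests installed.
    import requests

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on HTTP errors
    return parse_origin(response.json())


if __name__ == "__main__":
    print(fetch_external_ip())
```

Keeping the parsing separate from the request makes it possible to check the parsing against a sample payload without touching the network.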