hnvasa - 4 months ago
Python Question

Remove newlines in Python with urllib

I am using Python 3.x. While using urllib.request to download a webpage, I am getting a lot of \n in between. I tried to remove them using the methods given in other threads on the forum, but without success. I have used the strip() function and the replace() function, but no luck. I am running this code in Eclipse. Here is my code:

import urllib.request

#Downloading entire Web Document
def download_page(a):
    opener = urllib.request.FancyURLopener({})
    try:
        open_url = opener.open(a)
        page = str(open_url.read())
        return page
    except:
        return ""

raw_html = download_page("http://www.zseries.in")
print("Raw HTML = " + raw_html)

#Remove line breaks
raw_html2 = raw_html.replace('\n', '')
print("Raw HTML2 = " + raw_html2)


I am not able to spot the reason for getting so many \n in the raw_html variable.

Answer

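It seems they are literal \n characters (a backslash followed by the letter n), not real newlines. In Python 3, read() returns bytes, and calling str() on a bytes object gives its printable representation, in which each newline is written out as those two characters. So I suggest you do it like this: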

raw_html2 = raw_html.replace('\\n', '')
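As a rough illustration of why the escaped pattern is needed, here is a minimal sketch that uses a hard-coded bytes value (raw_bytes, a made-up stand-in for what open_url.read() returns) so it runs without network access. It also shows a possible alternative, decoding the bytes to text before replacing. The UTF-8 encoding is an assumption; the real page may use a different charset.

```python
# Hypothetical stand-in for the bytes returned by open_url.read().
raw_bytes = b"<html>\n<body>\nHello\n</body>\n</html>\n"

# str() on a bytes object yields its repr: each newline becomes the
# two characters backslash + n, and the text is wrapped in b'...'.
as_repr = str(raw_bytes)

# Approach from the answer: remove the literal backslash-n sequences.
stripped_repr = as_repr.replace('\\n', '')

# Alternative sketch: decode to real text first (UTF-8 is assumed here),
# then remove the real newline characters.
text = raw_bytes.decode('utf-8')
no_newlines = text.replace('\n', '')

print(stripped_repr)
print(no_newlines)
```

Note that with the first approach the string still carries the b'...' wrapper from the bytes representation, while the decoded version is plain text.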