Mounarajan Mounarajan - 3 months ago 13
Python Question

Python 'utf8' codec can't decode byte 0xc3 in position 72: invalid continuation byte

I am crawling a particular URL from google.com, but I get this error:

'utf8' codec can't decode byte 0xc3 in position 72: invalid continuation byte


Code:

import re
import os
import MySQLdb
import codecs
import requests
import base64
import random
import gzip
import time
from multiprocessing.pool import Pool
import datetime
import time

import sys
reload(sys)
sys.setdefaultencoding('utf-8')
def proxy_mesh():
    while True:
        try:

            data = requests.get('google.com')

            print data.text.encode('utf-8')
        except Exception, e:
            print e
            print "Trying again"
            time.sleep(3)
proxy_mesh()


What is the fix, and how can I overcome this error?

Answer

Keep it simple and it works. The requests module has already decoded the response body, so data.text is a unicode string that you can print directly:

import requests
data = requests.get('https://www.whoisxmlapi.com/whoisserver/WhoisService?domainName=http://N%E2%94%9CO-RESPONDER@MERCAOLIVRE.COM&outputFormat=json')
print data.text
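
For illustration, here is a minimal offline sketch of how this kind of error can arise. It assumes the body contains "NÃO" encoded as Latin-1, which is only a guess based on the URL above; the raw bytes and the decode_body helper are hypothetical and not part of requests. In Latin-1, "Ã" is the single byte 0xC3, which UTF-8 treats as the start of a two-byte sequence, so the plain "O" that follows is reported as an invalid continuation byte.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# Hypothetical payload: "NÃO-RESPONDER..." encoded as Latin-1 (assumption),
# so "Ã" is the single byte 0xC3.
raw = b'N\xc3O-RESPONDER@MERCAOLIVRE.COM'


def decode_body(raw_bytes, fallback='latin-1'):
    """Try UTF-8 first; fall back to another codec if the bytes are not valid UTF-8."""
    try:
        return raw_bytes.decode('utf-8')
    except UnicodeDecodeError as exc:
        # exc.reason and exc.start say what went wrong and where.
        print('UTF-8 failed:', exc.reason, 'at position', exc.start)
        return raw_bytes.decode(fallback)


text = decode_body(raw)
# Encode explicitly only at the output boundary.
print(text.encode('utf-8'))
```

With requests, the raw bytes are available as data.content and the codec used for data.text is data.encoding, so if the server declares the wrong charset you could set data.encoding yourself before reading data.text. Whether that applies to your URL is something you would have to check against the actual response.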

Since it is a JSON response, you may also want to process it:

import json
print json.loads(data.text)
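
If the server may return something other than JSON (an error page, for example), json.loads raises ValueError. A small sketch of guarding against that follows; parse_json_body and the sample strings are made up for illustration and do not reflect the real response format of the service above.

```python
from __future__ import print_function
import json


def parse_json_body(text):
    """Return the parsed JSON value, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:
        # json raises ValueError (JSONDecodeError on Python 3) for invalid input.
        return None


# Made-up sample bodies (assumptions, not real API output).
sample = u'{"domainName": "example.com"}'
parsed = parse_json_body(sample)
print(parsed['domainName'])

not_json = parse_json_body(u'<html>error page</html>')
print(not_json)
```

requests also provides data.json(), which parses the body for you and likewise raises an error when the body is not valid JSON.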