Mansoor Akram - 3 months ago
JSON Question

Trying to get json data from URL using Python

I am learning to fetch JSON data from a URL and use that data later on, but I am getting this error: "RuntimeError: maximum recursion depth exceeded while calling a Python object"

Here is my code:

import json
import requests
from bs4 import BeautifulSoup

url = "http://example.com/category/page=2&YII_CSRF_TOKEN=31eb0a5d28f4dde909d3233b5a0c23bd03348f69&more_products=true"
header = {'x-requested-with': 'XMLHttpRequest'}

mainPage = requests.get(url, headers = header)
xTree = BeautifulSoup(mainPage.content, "lxml")

newDictionary=json.loads(str(xTree))

print (newDictionary)


EDIT: Okay, I got the response data with this slight change. Here is the new code:

import json
import requests
from bs4 import BeautifulSoup

url = "http://example.com/category/page=2&YII_CSRF_TOKEN=31eb0a5d28f4dde909d3233b5a0c23bd03348f69&more_products=true"
header = {'x-requested-with': 'XMLHttpRequest'}

mainPage = requests.get(url, headers = header)

print (mainPage.json())

Answer

Don't use Beautiful Soup to process a JSON HTTP response. Use something like requests and decode the response text directly:

import json
import requests

url = "https://www.daraz.pk/womens-kurtas-shalwar-kameez/?pathInfo=womens-kurtas-shalwar-kameez&page=2&YII_CSRF_TOKEN=31eb0a5d28f4dde909d3233b5a0c23bd03348f69&more_products=true"
header = {'x-requested-with': 'XMLHttpRequest'}
# Send the x-requested-with header along with the request
t = requests.get(url, headers=header)
# Decode the response body text into a Python object
newDictionary = json.loads(t.text)
print(newDictionary)

The Beautiful Soup object can't be parsed with json.loads() that way: it represents an HTML document tree, not the raw JSON text.
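
To illustrate why the round trip through an HTML parser breaks things, here is a minimal sketch using only the standard library. The payload and the HTML wrapper are made up for illustration; the exact markup a parser adds around plain text varies by parser, so treat the wrapped string as an approximation of what str(xTree) would contain.

```python
import json

# Hypothetical payload standing in for the JSON body returned by the server.
raw_body = '{"more_products": true, "page": 2}'

# Roughly what an HTML parser produces when handed plain JSON text:
# the text ends up wrapped in markup (the exact wrapper is an assumption).
wrapped_body = "<html><body><p>" + raw_body + "</p></body></html>"


def try_parse(text):
    """Return the decoded object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


parsed_raw = try_parse(raw_body)          # decodes to a dict
parsed_wrapped = try_parse(wrapped_body)  # fails: the text starts with "<"

print(parsed_raw)
print(parsed_wrapped)
```

The raw body decodes cleanly, while the wrapped version is rejected, which is why the response text should go to the JSON decoder untouched.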

If some of those JSON keys hold HTML, you can use Beautiful Soup to parse those string values individually. For example, if your JSON has a key called content containing HTML, you can parse it like so:

BeautifulSoup(newDictionary["content"], "lxml")

You may need to experiment with different parsers if you have fragmentary HTML.
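
As a rough end-to-end sketch of that two-step approach (decode the JSON first, then hand only the HTML string to Beautiful Soup), the example below uses a made-up payload in place of a live response. The content key, the product markup, and the link are all hypothetical, and "html.parser" is chosen here only because it ships with Python; "lxml" or another installed parser may suit a real fragment better.

```python
import json

from bs4 import BeautifulSoup

# Hypothetical response body: a JSON object whose "content" key
# holds an HTML fragment (not taken from the real site).
payload = json.dumps({
    "content": '<div class="product"><a href="/item/1">Kurta</a></div>',
    "count": 1,
})

# Step 1: decode the JSON text into a Python dict.
data = json.loads(payload)

# Step 2: parse only the HTML string stored under the "content" key.
soup = BeautifulSoup(data["content"], "html.parser")

# Collect the link text and target of every anchor in the fragment.
links = [(a.get_text(strip=True), a["href"]) for a in soup.find_all("a")]
print(links)
```

With a real response, data would come from the decoded response body instead of the hand-built payload, and the rest stays the same.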