This tutorial (https://www.dataquest.io/blog/python-json-tutorial/) works with a 600MB file. However, when I run their code, it takes a very long time:
import ijson

filename = "md_traffic.json"
with open(filename, 'r') as f:
    objects = ijson.items(f, 'meta.view.columns.item')
    columns = list(objects)
This looks like a direct copy/paste of the tutorial found here:
The reason it's taking so long is the list() around the output of ijson.items. list() exhausts the generator, which forces the entire file to be parsed before any results are returned. Because ijson.items returns a generator, iterating over it directly makes the first result available almost immediately:
import ijson

filename = "md_traffic.json"
with open(filename, 'r') as f:
    for item in ijson.items(f, 'meta.view.columns.item'):
        print(item)
        break
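If more than the first item is wanted, the same laziness can be exploited with itertools.islice. The sketch below does not use ijson at all: fake_items is a hypothetical stand-in generator (not part of any library) that records how many items it actually produced, so the difference between list() and a partial read is visible without a 600MB file.

```python
from itertools import islice


def fake_items(n, produced):
    """Hypothetical stand-in for a streaming parser such as ijson.items."""
    for i in range(n):
        produced.append(i)  # record that this item was actually generated
        yield {"id": i}


# list() drains the generator: all 1000 items are produced before any is usable.
eager_log = []
columns = list(fake_items(1000, eager_log))

# islice stops after 3 items; the remaining 997 are never produced.
lazy_log = []
first_three = list(islice(fake_items(1000, lazy_log), 3))

print(len(eager_log), len(lazy_log))
```

With the real file, the equivalent would presumably be islice(ijson.items(f, 'meta.view.columns.item'), 3) inside the with block, which stops parsing once three items have been read.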
EDIT: The very next step in the tutorial is print(columns), which is why I included printing the first item in the answer. Also, it's not clear whether the question was for Python 2 or 3, so the answer uses syntax that works in both, albeit inelegantly.