icycandy icycandy - 1 month ago 4
Python Question

python sort list of json by value

I have a file consists of JSON, each a line, and want to sort the file by update_time reversed.

sample JSON file:

{ "page": { "url": "url1", "update_time": "1415387875"}, "other_key": {} }
{ "page": { "url": "url2", "update_time": "1415381963"}, "other_key": {} }
{ "page": { "url": "url3", "update_time": "1415384938"}, "other_key": {} }


want output:

{ "page": { "url": "url1", "update_time": "1415387875"}, "other_key": {} }
{ "page": { "url": "url3", "update_time": "1415384938"}, "other_key": {} }
{ "page": { "url": "url2", "update_time": "1415381963"}, "other_key": {} }


my code:

#!/bin/env python
#coding: utf8

import sys
import os
import json
import operator

#load json from file
lines = []
while True:
line = sys.stdin.readline()
if not line: break
line = line.strip()
json_obj = json.loads(line)
lines.append(json_obj)

#sort json
lines = sorted(lines, key=lambda k: k['page']['update_time'], reverse=True)

#output result
for line in lines:
print line


The code works fine with sample JSON file, but if a JSON has no 'update_time', it will raise KeyError exception. Are there non-exception ways to do this?

Answer

The answer should be obvious: Write a function that uses try...except to handle the KeyError, then use this as the key argument instead of your lambda.

def extract_time(json):
    try:
        # Also convert to int since update_time will be string.  When comparing
        # strings, "10" is smaller than "2".
        return int(json['page']['update_time'])
    except KeyError:
        return 0

# lines.sort() is more efficient than lines = lines.sorted()
lines.sort(key=extract_time, reverse=True)
Comments