Mark Mark - 4 months ago 14
JSON Question

How to have Counter count the correct strings

I am trying to use Counter to find which date appears most often, using the code below.

from collections import Counter

with open('dates.json', 'rb') as f:
    data = f.readlines()

c = Counter(data)
print (c.most_common()[:10])


The JSON data is stored as a list like this:

["Sun Aug 07 01:50:13 +0000 2016", "Sun Aug 07 01:50:13 +0000 2016", "Sun Aug 07 01:50:13 +0000 2016", "Sun Aug 07 01:50:13 +0000 2016", "Sun Aug 07 01:50:13 +0000 2016", "Sun Aug 07 01:50:13 +0000 2016", "Sun Aug 07 01:50:13 +0000 2016", "Sun Aug 07 01:50:13 +0000 2016", "Sun Aug 07 01:50:13 +0000 2016"]


I would expect the output to be something similar to this (grabbed from another program):

[('Sun Aug 07 02:29:45 +0000 2016', 4), ('Sun Aug 07 02:31:05 +0000 2016', 4), ('Sun Aug 07 02:31:04 +0000 2016', 3), ('Sun Aug 07 02:31:08 +0000 2016', 3), ('Sun Aug 07 02:31:22 +0000 2016', 3)]


But this is my output:

[(48, 72), (32, 53), (49, 27), (34, 18), (117, 18), (58, 18), (65, 9), (51, 9), (103, 9), (43, 9)]


I don't really understand what it's counting there.

Answer

readlines() gives you the raw contents of the file rather than the parsed dates. Instead, use json.load() to load the JSON data into a Python list, which you can then pass to Counter:

import json

with open('dates.json', 'r') as f:
    data = json.load(f)
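
Putting it together, a minimal sketch might look like the following. To keep it runnable on its own, it parses an inline string with json.loads() instead of opening dates.json; the sample dates and the variable names (raw, dates, date_counts, byte_counts) are made up for illustration. The last two lines hint at where numbers like 48 and 32 in your output probably come from: in Python 3, iterating over a bytes object yields integer byte values (48 is ASCII "0", 32 is a space), so Counter ends up tallying characters rather than dates.

```python
import json
from collections import Counter

# Hypothetical sample standing in for the contents of dates.json
raw = (
    '["Sun Aug 07 01:50:13 +0000 2016", '
    '"Sun Aug 07 01:50:13 +0000 2016", '
    '"Sun Aug 07 02:29:45 +0000 2016"]'
)

# Parse the JSON text into a Python list of strings
dates = json.loads(raw)

# Count whole date strings and show the most frequent ones
date_counts = Counter(dates)
print(date_counts.most_common(10))

# For comparison: counting raw bytes tallies individual byte values (ints)
byte_counts = Counter(raw.encode("ascii"))
print(byte_counts.most_common(3))
```

With a real file, replace the json.loads(raw) line with the json.load(f) call shown above.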