data_garden - 1 month ago
Python Question

Sorting a dictionary by numeric value

I have a dict of music genres:

tag_weight = {'industrial': '533621', 'indie': '1971962', 'metal': '1213678', 'heavy metal': '652471', 'japanese': '428102', 'pop': '1873806', 'new wave': '399507', 'black metal': '772132', 'rap': '513024', 'ambient': '1030414', 'alternative': '2059313', 'hard rock': '820796', 'electronic': '2288563', 'blues': '531045', 'folk': '882178', 'classic rock': '1123712', 'alternative rock': '1123488', '90s': '447671', 'indie rock': '850515', 'death metal': '671118', 'electronica': '614494', 'female vocalists': '1557702', 'Soundtrack': '529406', 'dance': '769039', 'funk': '399843', 'psychedelic': '458710', '80s': '751871', 'piano': '409931', 'chillout': '636088', 'post-rock': '426516', 'punk rock': '518515', 'jazz': '1117114', 'seen live': '2097509', 'instrumental': '817816', 'singer-songwriter': '810185', 'metalcore': '444383', 'hardcore': '656111', 'Hip-Hop': '814630', 'hip hop': '394989', 'Classical': '539190', 'punk': '848955', 'soul': '641095', 'british': '667559', 'thrash metal': '465163', 'Progressive metal': '407220', 'rock': '3879179', 'acoustic': '460841', 'german': '409030', 'Progressive rock': '693480', 'experimental': '1010190'}


I would like to rank them by popularity, that is, sort them by value, from most to least popular.

Since dicts are unordered by nature, I must use tuples for that, and I've been trying to use this:

sorted_dict = sorted(tag_weight.items(), key=operator.itemgetter(0), reverse=True)


but it does not seem to be working, because it returns:

[('thrash metal', '465163'), ('soul', '641095'), ('singer-songwriter', '810185'), ('seen live', '2097511'), ('rock', '3879179'), ('rap', '513024'), ('punk rock', '518515'), ('punk', '848955'), ('psychedelic', '458710'), ('post-rock', '426516'), ('pop', '1873806'), ('piano', '409931'), ('new wave', '399507'), ('metalcore', '444383'), ('metal', '1213678'), ('jazz', '1117114'), ('japanese', '428102'), ('instrumental', '817816'), ('industrial', '533621'), ('indie rock', '850515'), ('indie', '1971962'), ('hip hop', '394989'), ('heavy metal', '652471'), ('hardcore', '656111'), ('hard rock', '820796'), ('german', '409030'), ('funk', '399843'), ('folk', '882178'), ('female vocalists', '1557702'), ('experimental', '1010190'), ('electronica', '614494'), ('electronic', '2288563'), ('death metal', '671118'), ('dance', '769039'), ('classic rock', '1123712'), ('chillout', '636088'), ('british', '667559'), ('blues', '531045'), ('black metal', '772132'), ('ambient', '1030414'), ('alternative rock', '1123488'), ('alternative', '2059313'), ('acoustic', '460841'), ('Soundtrack', '529406'), ('Progressive rock', '693480'), ('Progressive metal', '407220'), ('Hip-Hop', '814630'), ('Classical', '539190'), ('90s', '447671'), ('80s', '751871')]


and I guess ('rock', '3879179') should be at the top of the list.

What am I doing wrong?

Answer

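Your call sorts by key, not by value: operator.itemgetter(0) picks the tag name out of each (name, weight) pair. The weights are also strings, so they have to be converted to int before they compare numerically. collections.Counter handles this kind of ranking well: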

import collections

# Convert the string values to int so they compare numerically
tag_weight = {k: int(v) for k, v in tag_weight.items()}
count = collections.Counter(tag_weight)

# Print the top 10 (print() with parentheses works on Python 2 and 3)
print(count.most_common(10))

# Print all, from most popular to least
print(count.most_common())

Output of top 10:

[('rock', 3879179), ('electronic', 2288563), ('seen live', 2097509), ('alternative', 2059313), ('indie', 1971962), ('pop', 1873806), ('female vocalists', 1557702), ('metal', 1213678), ('classic rock', 1123712), ('alternative rock', 1123488)]
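
If you would rather keep the sorted() approach from the question, a minimal sketch follows. It assumes the values stay as strings in the dict and uses only a four-item subset of tag_weight for brevity; the names by_string and sorted_tags are made up for this example. Sorting on index 1 alone is not enough, because strings compare character by character, so the key function also converts to int:

```python
import operator

# Small subset of the tag_weight dict from the question (values are strings).
tag_weight = {'rock': '3879179', 'funk': '399843',
              'electronic': '2288563', 'jazz': '1117114'}

# Sorting on the value without converting it compares strings character by
# character, so '399843' ranks above '3879179' ('9' > '8' at the second char).
by_string = sorted(tag_weight.items(), key=operator.itemgetter(1), reverse=True)
print(by_string)
# [('funk', '399843'), ('rock', '3879179'), ('electronic', '2288563'), ('jazz', '1117114')]

# Converting the value (index 1 of each pair) to int gives a numeric ordering.
sorted_tags = sorted(tag_weight.items(), key=lambda kv: int(kv[1]), reverse=True)
print(sorted_tags)
# [('rock', '3879179'), ('electronic', '2288563'), ('jazz', '1117114'), ('funk', '399843')]
```

Unlike the Counter version, this leaves the values in the result as the original strings; only the comparison uses the int conversion.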