user2988577 - 13 days ago 5x
Python Question

# Get a random sample of a dict

I'm working with a big dictionary and for some reason I also need to work on small random samples from that dictionary. How can I get this small sample (for example of length 2)?

Here is a toy model:

``````dy={'a':1, 'b':2, 'c':3, 'd':4, 'e':5}
``````

I need to perform some task on dy which involves all the entries. Let us say, to simplify, I need to sum together all the values:

``````s=0
for key in dy.keys():
    s=s+dy[key]
``````

Now, I also need to perform the same task on a random sample of dy; for that I need a random sample of the keys of dy. The simple solution I can imagine is

``````sam=list(dy.keys())[:2]
``````

That way I have a list of two keys of the dictionary which are somehow random. So, going back to my task, the only change I need in the code is:

``````s=0
for key in sam:
    s=s+dy[key]
``````

The point is that I do not fully understand how `dy.keys()` is constructed, so I can't foresee what issues this approach might cause later.
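On how `dy.keys()` is built: from Python 3.7 on, dictionaries are guaranteed to preserve insertion order, so slicing the list of keys always returns the first keys inserted and is not a random pick. A minimal sketch that shows this on the toy dictionary (the name `first_two` is only illustrative):

```python
dy = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}

# keys() is a view that iterates in insertion order (guaranteed from Python 3.7),
# so a slice of it is deterministic, not random.
first_two = list(dy.keys())[:2]
print(first_two)  # ['a', 'b'] on every run
```

For an actual random subset, the `random` module is the usual tool.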

``````dy = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5}
``````

Then the sum of all the values is more simply put as:

``````s = sum(dy.values())
``````

Then, if copying the values into a list is not memory-prohibitive, you can sample using:

``````import random

values = list(dy.values())
s = sum(random.sample(values, 2))
``````

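Or sample the keys instead. Note that `random.sample` needs a sequence (since Python 3.11 it no longer accepts a `set` or a dictionary view such as `dy.keys()`), so convert the keys to a list first:

``````from operator import itemgetter
import random

s = sum(itemgetter(*random.sample(list(dy), 2))(dy))
``````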

Or just use:

``````s = sum(dy[k] for k in random.sample(list(dy), 2))
``````
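If the sample is needed for more than a sum, it may be handier to get a small dictionary back. The following is only a sketch: `sample_dict` and its optional `seed` parameter are hypothetical names, not part of the standard library.

```python
import random

def sample_dict(d, k, seed=None):
    """Return a new dict holding k randomly chosen items of d."""
    rng = random.Random(seed)          # private generator; seed is optional
    keys = rng.sample(list(d), k)      # list(d) gives the keys as a sequence
    return {key: d[key] for key in keys}

dy = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
sub = sample_dict(dy, 2)               # a dict with two of the five items
s = sum(sub.values())
```

Passing a fixed `seed` makes the sample reproducible, which can help when debugging the task that runs on the sample.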

An alternative is to use `heapq`, e.g.:

``````import heapq
import random

s = sum(heapq.nlargest(2, dy.values(), key=lambda L: random.random()))
``````
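The `heapq` version works because every element receives an independent random sort key, so keeping the elements with the largest keys amounts to sampling without replacement. It also consumes its input one element at a time, so it can run over the dictionary directly instead of over a list copy. A sketch that samples keys this way, assuming a hypothetical helper named `sample_keys`:

```python
import heapq
import random

def sample_keys(d, k):
    """Pick k distinct keys of d at random without building list(d)."""
    # Iterating a dict yields its keys; each key gets a random priority
    # and the k keys with the largest priorities are kept.
    return heapq.nlargest(k, d, key=lambda _key: random.random())

dy = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
keys = sample_keys(dy, 2)
s = sum(dy[key] for key in keys)
```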
Source (Stackoverflow)