rojas - 11 months ago 43
Python Question

# Identifying values that stick out

Given a dict:

```
data = {'18': [3.89, 1.28], '20': [1.39, 3.15], '15': [1.42, 3.10]}
```

I want to pick out the items that clearly differ from the rest, such as `18`. Ideally I would specify an `ALLOWED_DISCREPANCY` (set to `0.5` for this demo): a threshold that determines which values do and do not stick out compared to the rest of the values.

The `18` entry, with its `3.89`, is clearly off here because the majority has values around 1.4 (comparing either value from each list is enough to conclude this), and the difference (`abs(3.89 - 1.4)`) is greater than `0.5` (the maximum allowed).

Answer Source

Compute the mean of all the values, flattened across every list.

```
>>> from numpy import mean
>>> data = {'18': [3.89, 1.28], '20': [1.39, 3.15], '15': [1.42, 3.10]}
>>> avg = mean([x for sublist in data.values() for x in sublist])
>>> avg
2.3716666666666666
```

Set the threshold and build a new dictionary that maps each original key to the list of its values that deviate from the mean by more than the threshold. Here are two examples:

```
>>> thresh = 0.5
>>> {k:[x for x in v if abs(x-avg) > thresh] for k, v in data.items()}
{'18': [3.89, 1.28], '15': [1.42, 3.1], '20': [1.39, 3.15]}
>>>
>>> thresh = 1
>>> {k:[x for x in v if abs(x-avg) > thresh] for k, v in data.items()}
{'18': [3.89, 1.28], '15': [], '20': []}
```
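
Note that with `thresh = 0.5` every value is flagged: the data sits at two levels (around 1.4 and around 3.1), so the overall mean of about 2.37 lands between them and is far from everything. One possible alternative, not part of the answer above, is to compare each value against the median of the values at the same list position. The sketch below assumes every list has the same length and that position matters, as the question's "comparing either value from each list" suggests. The function name `find_outliers` is made up for illustration.

```python
from statistics import median

ALLOWED_DISCREPANCY = 0.5  # threshold name taken from the question

data = {'18': [3.89, 1.28], '20': [1.39, 3.15], '15': [1.42, 3.10]}


def find_outliers(data, allowed):
    """Return the keys with at least one value farther than `allowed`
    from the median of the values at the same list position.

    Hypothetical helper; assumes all lists have equal length.
    """
    # zip(*...) groups all first values together, then all second values.
    medians = [median(column) for column in zip(*data.values())]
    return sorted(
        key
        for key, values in data.items()
        if any(abs(x - m) > allowed for x, m in zip(values, medians))
    )


print(find_outliers(data, ALLOWED_DISCREPANCY))  # ['18']
```

The median is used here instead of the mean because a single extreme value barely moves it, so the "majority" level (1.42 for the first position, 3.10 for the second) stays the reference point.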