Maus - 1 year ago 75
SQL Question

# Group List of tuples by the same tuple (timestamp) with pyhton itertools

I got the following un-normalized list of tuples as result from a SQL-query.

``````data = [(...) {u'timestamp': datetime.datetime(2016, 5, 25, 23, 55, 33), u'sensor': u'GP_PRES', u'value': u'96587'}, {u'timestamp': datetime.datetime(2016, 5, 25, 23, 56, 33), u'sensor': u'GP_NO2', u'value': u'0.221'}, {u'timestamp': datetime.datetime(2016, 5, 25, 23, 56, 33), u'sensor': u'GP_TC', u'value': u'14.600'}, {u'timestamp': datetime.datetime(2016, 5, 25, 23, 56, 33), u'sensor': u'GP_HUM', u'value': u'83.94'}, (...) ]
``````

These are Sensor-Data from several sensors but with the same timestamp.

I would like to group all the sensor values with the same timestamp. So that I have a list of tuples for every equal timestamp.

``````grouped_data = [... {u'timestamp': datetime.datetime(2016, 5, 25, 23, 55, 33), [{u'sensor': u'GP_PRES', u'value': u'96587'}, { u'sensor': u'GP_NO2', u'value': u'0.221'}, {u'sensor': u'GP_TC', u'value': u'14.600'}, {u'sensor': u'GP_HUM', u'value': u'83.94'}]}, ... ]
``````

I understood that I can use the groupby()-function from the itertools. I do not understand what kind of Key function I need in order to group by the equal 'timestamp'.

``````data2 = sorted(data, key= operator.attrgetter('timestamp'))
for k, g in groupby(data2, keyfunc):
groups.append(list(g))
``````

In your code, `operator.attrgetter('timestamp')` raises the error `AttributeError: 'dict' object has no attribute 'timestamp'`.

Replace it with `key=lambda d:d['timestamp']`, as shown below.

``````#!/usr/bin/env python
# -*- coding: utf-8 -*-

import itertools
import operator
import datetime

data = [{u'timestamp': datetime.datetime(2016, 5, 25, 23, 55, 33), u'sensor': u'GP_PRES', u'value': u'96587'},
{u'timestamp': datetime.datetime(2016, 5, 25, 23, 56, 33), u'sensor': u'GP_NO2', u'value': u'0.221'},
{u'timestamp': datetime.datetime(2016, 5, 25, 23, 56, 33), u'sensor': u'GP_TC', u'value': u'14.600'},
{u'timestamp': datetime.datetime(2016, 5, 25, 23, 56, 33), u'sensor': u'GP_HUM', u'value': u'83.94'}]

groups = list()
data2 = sorted(data, key=lambda d:d['timestamp'])
for k, g in itertools.groupby(data2, key=lambda d:d['timestamp']):
groups.append(list(g))

print(groups)   # 2 groups
[[  {u'timestamp': datetime.datetime(2016, 5, 25, 23, 55, 33), u'sensor': u'GP_PRES', u'value': u'96587'}],
[  {u'timestamp': datetime.datetime(2016, 5, 25, 23, 56, 33), u'sensor': u'GP_NO2', u'value': u'0.221'},
{u'timestamp': datetime.datetime(2016, 5, 25, 23, 56, 33), u'sensor': u'GP_TC', u'value': u'14.600'},
{u'timestamp': datetime.datetime(2016, 5, 25, 23, 56, 33), u'sensor': u'GP_HUM', u'value': u'83.94'}]]
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download