meto meto - 10 months ago 82
Python Question

Pandas: creating a Series from a tuple generator

Is there a way to create a Series from a tuple generator?
My code looks like the following, but I'm sure there is a better way:

import numpy as np
import pandas as pd
g = ((n, s) for n, s in [("A", 1), ("B", 2), ("C", 3), ("D", 4), ("E", 5)])
arr = np.array(list(g))
ind, val = arr[:, 0], arr[:, 1]

pd.Series(val, index=ind)

Answer Source

Here's an alternative using the DataFrame constructor:

>>> g = ((n, s) for n, s in [("A", 1), ("B", 2), ("C", 3), ("D", 4), ("E", 5)])
>>> pd.DataFrame(g).set_index(0)[1]
A    1
B    2
C    3
D    4
E    5
Name: 1, dtype: int64

After the DataFrame is constructed, we set the index column and return a Series by selecting column 1.
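The one-liner can be split into its individual steps to make each intermediate object visible. This is only a sketch of the same approach; the variable names `df`, `indexed` and `s` are illustrative, and the final tidy-up of the leftover names is an optional extra that is not part of the answer itself.

```python
import pandas as pd

# Same sample generator as in the question.
g = ((n, s) for n, s in [("A", 1), ("B", 2), ("C", 3), ("D", 4), ("E", 5)])

# Step 1: each 2-tuple becomes a row; the columns are labelled 0 and 1.
df = pd.DataFrame(g)

# Step 2: move column 0 (the labels) into the index.
indexed = df.set_index(0)

# Step 3: selecting the remaining column 1 returns a Series.
s = indexed[1]

# Optional: the Series is still named 1 and its index is named 0;
# clear both if those leftover names are not wanted.
s.name = None
s.index.name = None

print(s)
```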

This avoids the need for any temporary lists, so it might be more efficient (I haven't tested it yet). It also uses an appropriate dtype for each column (int64 for the values in this case), so it avoids creating object arrays first.
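The dtype point can be checked with a small comparison. The sketch below assumes a shortened sample list (`pairs`, a hypothetical name) and compares the NumPy route from the question with the DataFrame route. In the NumPy route the mixed tuples are coerced to a single string dtype, so the values come back as text rather than integers.

```python
import numpy as np
import pandas as pd

# Shortened sample data, for illustration only.
pairs = [("A", 1), ("B", 2), ("C", 3)]

# Route from the question: one 2-D array holding labels and values together.
arr = np.array(pairs)
via_numpy = pd.Series(arr[:, 1], index=arr[:, 0])

# Route from the answer: each column keeps its own dtype.
via_frame = pd.DataFrame(p for p in pairs).set_index(0)[1]

print(arr.dtype.kind)            # 'U': the whole array holds strings
print(repr(via_numpy.iloc[0]))   # the value is the text '1'
print(via_frame.dtype)           # int64
```

If the values must stay numeric, the NumPy route would need an explicit conversion afterwards (for example `astype(int)`), which the DataFrame route does not require.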