ludofet - 2 years ago 147

Python Question

I'm looking at creating a Dataframe that is the combination of two unrelated series.

If we take two dataframes:

`A = ['a','b','c']`

B = [1,2,3,4]

dfA = pd.DataFrame(A)

dfB = pd.DataFrame(B)

I'm looking for this output:

`A B`

0 a 1

1 a 2

2 a 3

3 a 4

4 b 1

5 b 2

6 b 3

7 b 4

8 c 1

9 c 2

10 c 3

11 c 4

One way could be to have loops on the lists direclty and create the DataFrame but there must be a better way. I'm sure I'm missing something from the pandas documentation.

`result = []`

for i in A:

for j in B:

result.append([i,j])

result_DF = pd.DataFrame(result,columns=['A','B'])

Ultimately I'm looking at combining months and UUID, I have something working but it takes ages to compute and relies too much on the index. A generic solution would clearly be better:

`from datetime import datetime`

start = datetime(year=2016,month=1,day=1)

end = datetime(year=2016,month=4,day=1)

months = pd.DatetimeIndex(start=start,end=end,freq="MS")

benefit = pd.DataFrame(index=months)

A = ['a','b','c']

dfA = pd.DataFrame(A)

result = pd.DataFrame(columns=['A','month'])

for i in dfA.index:

newdf = pd.DataFrame(index=benefit.index)

newdf['A'] = dfA.iloc[i,0]

newdf['month'] = newdf.index

result = pd.concat([result,newdf])

result

Recommended for you: Get network issues from **WhatsUp Gold**. **Not end users.**

Answer Source

One liner approach

```
pd.DataFrame(0, A, B).stack().index.to_series().apply(pd.Series).reset_index(drop=True)
```

Or:

```
pd.MultiIndex.from_product([A, B]).to_series().apply(pd.Series).reset_index(drop=True)
```

From dataframes, assuming the information is in the first column.

```
pd.MultiIndex.from_product([dfA.iloc[:, 0], dfB.iloc[:, 0]]).to_series().apply(pd.Series).reset_index(drop=True)
```

Functionalized:

```
def cross(df1, df2):
s1 = df1.iloc[:, 0]
s2 = df2.iloc[:, 0]
midx = pd.MultiIndex.from_product([s1, s2])
df = midx.to_series().apply(pd.Series).reset_index(drop=True)
df.columns = [s1.name, s2.name if s1.name != s2.name else 1]
return df
print cross(dfA, dfB)
0 1
0 a 1
1 a 2
2 a 3
3 a 4
4 b 1
5 b 2
6 b 3
7 b 4
8 c 1
9 c 2
10 c 3
11 c 4
```

Recommended from our users: **Dynamic Network Monitoring from WhatsUp Gold from IPSwitch**. ** Free Download**