fukiburi - 2 months ago 11

Python Question

I'd like to concatenate two data frames, created from two lists:

`import pandas as pd`

import numpy as np

header_1 = ['A', 'B', -1, 3, 5, 7]

data_1 = ['X', 'Y', 1, 2, 3, 4]

d = pd.DataFrame(np.array([data_1]), columns=header_1)

header_2 = ['A', 'B', -2, 4, 5, 6]

data_2 = ['X', 'Z', 1, 2, 3, 4]

e = pd.DataFrame(np.array([data_2]), columns=header_2)

f = pd.concat([d, e])

> f

A B -1 3 5 7 -2 4 6

0 X Y 1 2 3 4 NaN NaN NaN

0 X Z NaN NaN 3 NaN 1 2 4

However, I want my numerical columns to appear in sorted order and was wondering if there an easier way than splitting off the first two columns, sorting the remaining dataframe and concatenating the two again:

`ab_cols = f[['A', 'B']] # Copy of first two columns`

g = f.drop(['A', 'B'], axis=1) # Removing cols from dataframe

h = g.sort_index(axis=1) # Sort remaining by column header

i = pd.concat([ab_cols, h], axis=1) # Putting everything together again

> i

A B -2 -1 3 4 5 6 7

0 X Y NaN 1 2 NaN 3 NaN 4

0 X Z 1 NaN NaN 2 3 4 NaN

I've thought about multi-indices, but I'm already using the index for something else (source of data row, not shown here), and I'm afraid a three-level multi-index might make it more complicated for slicing the dataframe later.

Answer

*Steps:*

Make columns as a series representation with both index and values equal to the index keys.

Using `pd.to_numeric`

with `errors=coerce`

, to properly parse numeric values and handling string values as `Nans`

.

Sort these values with pushing `Nans`

(which were string values before) on top as and when they are encountered.

Taking their corresponding indices and re-arranging the `DF`

based on these newly returned column labels.

```
c = pd.to_numeric(f.columns.to_series(), errors='coerce').sort_values(na_position='first')
f[c.index]
```