dgikmo dgikmo - 1 year ago 156
Python Question

Split and merge pandas dataframe

I am trying to split and merge a Pandas dataframe.

The columns of the original data frame are arranged like so:

dataTime Record1Field1 ... Record1FieldN Record2Field1 ... Record1FieldN
time1 << record 1 data >> << record 2 data >>

I would like to take split the
fields into a separate data frame
, indexed by the dataTime.
will therefore look something like this:

dataTime Record2Field1 ... Record2FieldN
time1 << record 2 data >>

is populated, delete the Record2 columns from the original data frame. The first difficulty I'm having is in creating this
which contains the record 2 data.

Then, I would like to rename the columns in
so that they align with the
columns in the original data frame. (This portion I know how to do)

Finally I would like to merge
back into the original data frame.

The end result should look something like this:

dataTime Record1Field1 ... Record1FieldN
time1 <<record 1 data>>
time1 <<record 2 data>>

So far I haven't determined a good method of doing this. Any help is appreciated! Thanks.

Answer Source

You could get all your Record2 values under the Record1 columns as follows:

Data Setup:

data = StringIO(
dataTime Record1Field1 Record1Field2 Record1Field3 Record2Field1 Record2Field2 Record2Field3
01-01-2015 1 2 3 4 5 6 

df = pd.read_csv(data, delim_whitespace=True, parse_dates=['dataTime'])
print (df)

    dataTime  Record1Field1  Record1Field2  Record1Field3  Record2Field1  \
0 2015-01-01              1              2              3              4   

   Record2Field2  Record2Field3  
0              5              6 


df.set_index('dataTime', inplace=True)

# Filter column names corresponding to Record2
tempdf = df[[col for col in list(df) if col.startswith('Record2')]]

# Drop those columns after assigning to tempdf
df.drop(tempdf.columns, inplace=True, axis=1)

# Rename the column names for appending
tempdf.columns = [col for col in list(df) if col.startswith('Record1')]

# Concatenate row-wise
print (df.append(tempdf))

            Record1Field1  Record1Field2  Record1Field3
2015-01-01              1              2              3
2015-01-01              4              5              6
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download