splinter splinter - 1 month ago 17
Python Question

Converting to long panel data format with pandas

I have a DataFrame where rows represent time and columns represent individuals. I want to turn it into into long panel data format in pandas in an efficient manner, as the DataFames are rather large. I would like to avoid looping. Here is an example: The following DataFrame:

id 1 2
date
20150520 3.0 4.0
20150521 5.0 6.0


should be transformed into:

date id value
20150520 1 3.0
20150520 2 4.0
20150520 1 5.0
20150520 2 6.0


Speed is what's really important to me, due to the data size. I prefer it over elegance if there is a tradeoff. Although I suspect I mam missing a rather simple function, pandas should be able to handle that. Any suggestions?

Answer

I think you need stack with reset_index:

print (df)
            1    2
date              
20150520  3.0  4.0
20150521  5.0  6.0

df = df.stack().reset_index()
df.columns = ['date','id','value']
print (df)
       date id  value
0  20150520  1    3.0
1  20150520  2    4.0
2  20150521  1    5.0
3  20150521  2    6.0

print (df)
id          1    2
date              
20150520  3.0  4.0
20150521  5.0  6.0

df = df.stack().reset_index(name='value')
print (df)
       date id  value
0  20150520  1    3.0
1  20150520  2    4.0
2  20150521  1    5.0
3  20150521  2    6.0